Random time-out on SNMP queries

APC UPS Data Center & Enterprise Solutions Forum

Anonymous user
Not applicable

Posted: ‎2021-07-01 05:07 AM . Last Modified: ‎2024-03-05 01:46 AM


Good morning.

At work we have 12 x ACRC103 units, a UPS, some PDUs, and an InfraStruxure server.

We have a check_mk server that checks these units periodically (every two minutes) with a custom Python script (I will publish it here in the next few days for anyone who needs it) that simply performs some SNMP queries to read some values.

We noticed that there are always one or more servers (but never more than 2-3) that do not reply in time and time out. The strange thing is that they are not always the same servers; they rotate every 5-10 minutes.

I tried running snmpwalk manually and noticed that it sometimes simply hangs and never answers.
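
One way to pin down a hang like this is to wrap each query in a hard time limit and record how long it takes. The sketch below is not the check_mk script mentioned above. It assumes the Net-SNMP `snmpget` command-line tool is installed; the community string `public` and the helper names `build_snmpget_cmd` and `probe` are placeholders chosen for illustration.

```python
import subprocess
import time


def build_snmpget_cmd(host, oid, community="public", timeout_s=2, retries=1):
    """Build a Net-SNMP snmpget command line for one SNMP v1 query."""
    return [
        "snmpget", "-v1", "-c", community,
        "-t", str(timeout_s),   # per-request timeout in seconds
        "-r", str(retries),     # retries before snmpget gives up
        host, oid,
    ]


def probe(host, oid, hard_limit_s=10, runner=subprocess.run):
    """Run one query and return (status, elapsed_seconds).

    status is "ok", "error" (snmpget exited non-zero, for example after
    its own timeout) or "hung" (the process exceeded hard_limit_s and
    was killed). `runner` is a hypothetical hook so the function can be
    exercised without a real device.
    """
    cmd = build_snmpget_cmd(host, oid)
    start = time.monotonic()
    try:
        result = runner(cmd, capture_output=True, text=True, timeout=hard_limit_s)
        status = "ok" if result.returncode == 0 else "error"
    except subprocess.TimeoutExpired:
        status = "hung"
    return status, time.monotonic() - start
```

Calling `probe("192.0.2.10", "1.3.6.1.2.1.1.3.0")` (a documentation-range address and the standard `sysUpTime.0` OID) in a loop over all devices, and logging the results, would show whether a silent device is merely slow or completely unresponsive.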

What we did to debug this problem:

1) Changed the IP address to rule out an IP conflict

2) Changed the check_mk server to rule out a server problem

3) Increased the SNMP timeout to the maximum possible

4) Reduced the check frequency

My suspicion is that when the InfraStruxure server contacts the units, the units simply hang and stop answering SNMP queries. Is that possible?

Another thing I noticed while checking the firmware version is that there seem to be two different versions installed on the same unit. Is this OK?

Thank you in advance

Michele

Accepted Solutions
BillP
Administrator

Posted: ‎2021-07-01 05:07 AM . Last Modified: ‎2024-03-05 01:45 AM


Hi Michele.

I would definitely recommend upgrading the Central server to the newest version. You would need a software support contract to receive the link for the upgrade.

Is it possible to post the python script for review?

In your initial post, you indicate that 2-3 servers do not reply and you get timeouts. Where does it show these timeouts? On Central? On another polling system? Please specify.

At the time of the timeouts, are you able to connect to the devices using a different SNMP Utility? What does this show? If the device is not reachable via SNMP (at the time of the timeout/comms loss) and there are no events on the device, then it sounds like it could point to a network issue.

What utility are you using for your SNMPWalk? Did you carry out an SNMPWalk on the APC devices? Do these devices time out?

Do you get any timeout/comms-loss alarms on Central for the devices listed above? If so, is it possible to send in the logs? It would also be helpful to send in the alarm history on Central for the devices that are timing out.

What is the timeout/retries specified in Central? What is the scan interval set to?

How many devices in total are you monitoring via Central?

Are you using a proxy?

Sometimes, if you are using other applications to poll the same devices, this could interfere with our polling. When did you first notice the issue? Have any changes been made to the network? Is it possible, as a test, to disable the other polling applications and use only Central to poll the devices?

What happens when you poll these devices directly? Are there any timeouts then?

Are all the devices on the same subnet? Is there heavy traffic on this subnet that might be causing issues/timeouts? Have you tried to move one of the devices to a different subnet as a test?

Are the devices losing comms at the same time or at different times? Is it the same time every day? How many times a day? Is it random?
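
Questions like these are easier to answer from a log of timeout events than from memory. The following is a minimal sketch, assuming the monitoring side can export each timeout as a pair of ISO timestamp and device name; the sample events, the device names, and the function name `summarize` are made up for illustration and are not data from this thread.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample log: (ISO timestamp, device name) per timeout.
EVENTS = [
    ("2021-07-01T09:00:00", "acrc-01"),
    ("2021-07-01T09:00:00", "acrc-07"),
    ("2021-07-01T09:06:00", "pdu-02"),
    ("2021-07-02T14:30:00", "acrc-01"),
]


def summarize(events):
    """Count timeouts per device, per hour of day and per calendar day,
    and list the timestamps at which several devices timed out together."""
    per_host = Counter(host for _, host in events)
    per_hour = Counter(datetime.fromisoformat(ts).hour for ts, _ in events)
    per_day = Counter(datetime.fromisoformat(ts).date().isoformat() for ts, _ in events)
    per_timestamp = Counter(ts for ts, _ in events)
    simultaneous = sorted(ts for ts, n in per_timestamp.items() if n > 1)
    return per_host, per_hour, per_day, simultaneous


per_host, per_hour, per_day, simultaneous = summarize(EVENTS)
print("per device:", dict(per_host))
print("per hour of day:", dict(per_hour))
print("per day:", dict(per_day))
print("simultaneous:", simultaneous)
```

A roughly even spread across devices and hours would suggest a random pattern, while a spike at one hour or on one device would point to something scheduled or device-specific.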

Is it possible to run a packet capture? If you can, we should be able to tell if the Central is polling the devices or not and if they are responding.

Regards,

B


Replies 6

Anonymous user
Not applicable

Posted: ‎2021-07-01 05:07 AM . Last Modified: ‎2024-03-05 01:46 AM


Hello Angela.

Here is the requested information:

1) We have an InfraStruxure Central - Version 6.2.0

2) We have a dedicated private LAN for these devices:

    12 x ACRC103  (application acrc v.3.7.0, os aos 3.7.3 - Is it ok to have different versions?)

     1 x 0G-9354-01 - PDU  (application xrdp v.3.7.0, os aos v.3.7.3)

     2 x AP7957 - switched rack pdu (application rpdu v.3.7.0, os aos v.3.7.0)

     1 x AP7853 - rack pdu (application rpdu v.2.6.5, os aos v.2.6.4)

     1 x AP7853 - metered rack pdu (application rpdu v.3.5.5, os aos v.3.5.6)

3) SNMP version: v1 (v3 is disabled)

4) I don't have to restart anything. After some time the device that was timing out works again, but another one (or another 2-3) shows the same problem.

5) Quite a lot of OIDs are requested: 9 OIDs for a PDU (every OID returns 22 entries) and 30 OIDs for an ACRC103 (every OID returns just one value)
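
To put these numbers in perspective, the arithmetic can be sketched as below. It is only an estimate: it assumes every one of the five PDU-type devices listed in point 2 returns 22 entries per OID, and it takes the two-minute interval from the first post. How many SNMP requests this turns into depends on how the script batches its queries, which the thread does not say.

```python
POLL_INTERVAL_S = 120   # "every two minutes", from the first post
PDU_COUNT = 5           # assumption: 1 + 2 + 1 + 1 PDU-type devices from point 2
ACRC_COUNT = 12


def values_per_device(oid_count, entries_per_oid):
    """Number of values one device returns in one polling cycle."""
    return oid_count * entries_per_oid


pdu_values = values_per_device(9, 22)    # 9 OIDs x 22 entries
acrc_values = values_per_device(30, 1)   # 30 OIDs x 1 value

total_values = PDU_COUNT * pdu_values + ACRC_COUNT * acrc_values
values_per_second = total_values / POLL_INTERVAL_S

print("values per PDU:", pdu_values)
print("values per ACRC103:", acrc_values)
print("values per cycle:", total_values)
print("average values per second:", values_per_second)
```

Under these assumptions the check_mk server reads 1350 values per cycle, about 11 per second on average, in addition to whatever InfraStruxure Central polls on its own.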

Thank you in advance

Michele


BillP
Administrator

Posted: ‎2021-07-01 05:07 AM . Last Modified: ‎2024-03-05 01:45 AM


QueenB - does this sound related to ISX Central at all (since it is several revisions behind the current one)? I was leaning towards no, since it sounds like Michele Renda has attached to the APC LAN and done an SNMPWalk on the devices, which don't respond either, but they come back to life by themselves.

On the AOS and APP versions, it's OK to have different numbers. Sometimes they are the same, sometimes they are not. AOS v3.7.4 and rpdu v3.7.4 are the latest versions for rpdu. v2.X.X is quite old (about 10 years old!). Is there a reason you have not updated your AP78XX or AP79XX devices to newer versions? Your xrdp ISX PDU device is only one or two revisions behind. I certainly cannot vouch for SNMP stability on the v2.X.X firmware, but it should be OK on most of the 3.X.X firmware. I'd still consider upgrading anything that can be upgraded in order to rule out any issues. If there is still a problem, we at least know it is occurring on the latest production revisions we offer.
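
The version advice above can be checked mechanically against the inventory Michele posted. This is a small sketch: the device list and the rpdu v3.7.4 figure come from this thread, while the function names and the tuple layout are illustrative choices.

```python
# (model, application, application version, AOS version) as posted in the thread.
INVENTORY = [
    ("ACRC103", "acrc", "3.7.0", "3.7.3"),
    ("0G-9354-01", "xrdp", "3.7.0", "3.7.3"),
    ("AP7957", "rpdu", "3.7.0", "3.7.0"),
    ("AP7853", "rpdu", "2.6.5", "2.6.4"),
    ("AP7853", "rpdu", "3.5.5", "3.5.6"),
]
LATEST_RPDU = (3, 7, 4)  # rpdu v3.7.4, named as the latest in the reply above


def parse_version(text):
    """Turn '3.7.0' or 'v.3.7.0' into a comparable tuple such as (3, 7, 0)."""
    return tuple(int(part) for part in text.lstrip("v.").split("."))


def legacy_devices(inventory):
    """Devices whose application firmware is still on the 2.x line."""
    return [(model, ver) for model, _, ver, _ in inventory
            if parse_version(ver)[0] < 3]


def rpdu_behind_latest(inventory):
    """rpdu devices running an application version older than LATEST_RPDU."""
    return [(model, ver) for model, app, ver, _ in inventory
            if app == "rpdu" and parse_version(ver) < LATEST_RPDU]


print("2.x firmware:", legacy_devices(INVENTORY))
print("rpdu behind 3.7.4:", rpdu_behind_latest(INVENTORY))
```

Comparing versions as integer tuples rather than strings avoids mistakes such as treating "3.10.0" as older than "3.7.4".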

Are there any APC devices on the APC LAN that are not experiencing this issue in the same way? And from what you said, it's random, correct? So it's not like they all time out at the same time?

When they don't respond, is it a complete timeout, or do the devices return partial values, just really slowly?

From what I see, check_mk is a Nagios plug-in? Is that just a computer you have on the APC LAN that pings these devices, or does it retrieve a particular set of OIDs too (in addition to the status that ISX Central can provide)?


Anonymous user
Not applicable

Posted: ‎2021-07-01 05:07 AM . Last Modified: ‎2024-03-05 01:45 AM


Hello Angela.

Thank you for your answers. I followed your suggestion and updated the firmware of almost all the units (I still have something to complete), and the problem is still there.

Now the only thing left to update is the "InfraStruxure Central - Version 6.2.0" unit. I looked for a firmware update but had no success. Do you know if there is a publicly available update on the apc.com site? We no longer have a support contract with APC.

Thank you very much for your support and have a nice day.

Regards

Michele


BillP
Administrator

Posted: ‎2021-07-01 05:07 AM . Last Modified: ‎2024-03-05 01:45 AM


Hi Michele,

You'll need a paid support contract, as far as I know, to obtain an update for your server.
