EcoStruxure IT forum
Schneider Electric support forum about installation and configuration for DCIM including EcoStruxure IT Expert, IT Advisor, Data Center Expert, and NetBotz
Posted: 2020-07-04 03:10 PM . Last Modified: 2024-04-05 02:24 AM
I'm getting lost communication alerts from StruxureWare DCE for multiple UPS units at the same time every night. When I look at the network management card, there are no communication errors. Any ideas?
(CID:129406023)
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 02:24 AM
Hi Todd,
If you were to get communications alarms on the cards themselves, that would indicate a communications issue between the card and the UPS. It would be very strange to get that on multiple units at the same time.
If you're seeing multiple UPS units report as lost comm in DCE, that would more likely be a network communications issue. There are many reasons this can happen but when you have multiple units doing this at the same time, that is usually due to a network outage or a component having issues or being rebooted. If the units are all on the same switch or router, I would look at those components to see if they've been rebooting.
Another thing to look at is when these issues occur, what is happening on the network or on the DCE system itself? If this is happening every night at the same time, is there some type of backup going on? If this is a VM DCE, what is happening on the host?
Thanks,
Steve
(CID:129406029)
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 02:24 AM
It has happened every evening since last Wednesday: communication is lost at 6:06 pm and comes back at 06:11. I verified the switch port did not restart, and there is nothing in the router logs. There are 4 UPSs at that site. One night it was one of them, another night 2, and last night it was all 4. There is nothing on the management card that indicates the card rebooted. Is it possible that the site is losing ping to the StruxureWare server and those devices just happen to check in at that time?
(CID:129406185)
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 02:24 AM
The devices are presumably being polled every 5 minutes using SNMP. A 5-minute poll interval would be consistent with the alarm appearing at 6:06 and clearing at 6:11. If the devices are all at remote sites and only that one remote site is having an issue, the cause is probably not DCE or the site DCE is on, but rather the site these devices are on.
It is more likely SNMP than ping, but it's the same theory. Are there any logs stating that something other than DCE is polling them via SNMP? Is it possible that a network storm, or more specifically an SNMP traffic storm, blocked access to the devices?
Steve
(CID:129406198)
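The polling reasoning above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a fixed 5-minute poll interval, assumes the alarm is raised on the first failed poll and cleared on the first successful one, and reads the reported recovery time as 6:11 pm. The date is a placeholder, and the function names are hypothetical rather than anything provided by DCE.

```python
from datetime import datetime, timedelta

# Assumption: the thread only says devices are "presumably" polled every 5 minutes.
POLL_INTERVAL = timedelta(minutes=5)


def poll_cycles_in_alarm(lost_at, restored_at, interval=POLL_INTERVAL):
    """Return how many poll cycles passed between alarm raise and alarm clear."""
    if restored_at < lost_at:
        raise ValueError("restored_at must not be earlier than lost_at")
    return round((restored_at - lost_at) / interval)


def outage_bounds(cycles, interval=POLL_INTERVAL):
    """Return (minimum, maximum) real outage length implied by the alarm.

    The outage began after the last good poll (one interval before the alarm)
    and ended before the first good poll afterwards, so the real outage lies
    between (cycles - 1) and (cycles + 1) intervals.
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return (cycles - 1) * interval, (cycles + 1) * interval


# Placeholder date; only the times of day come from the thread.
lost = datetime(2020, 1, 1, 18, 6)
restored = datetime(2020, 1, 1, 18, 11)

cycles = poll_cycles_in_alarm(lost, restored)
shortest, longest = outage_bounds(cycles)
print(f"polls in alarm: {cycles}, real outage between {shortest} and {longest}")
```

Under these assumptions a single missed poll is enough to produce the reported alarm, so the real interruption could be anywhere from a brief blip to just under 10 minutes.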
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 02:24 AM
I increased the timeout and the number of retries on the device scan settings in DCE and will see if that helps.
(CID:129406257)
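As a rough illustration of why raising the timeout and retry count can help, the sketch below estimates how long a poller keeps trying before it gives up on a device. It assumes each attempt waits the full timeout and that the timeout is the same for every retry; the thread does not describe DCE's actual retry behavior, and the numbers used are hypothetical rather than DCE defaults.

```python
def worst_case_wait(timeout_s, retries):
    """Seconds spent before giving up: one initial attempt plus `retries` more."""
    if timeout_s <= 0 or retries < 0:
        raise ValueError("timeout_s must be positive and retries non-negative")
    return timeout_s * (retries + 1)


def rides_through(congestion_s, timeout_s, retries):
    """True if the poller is still retrying when a congestion window ends."""
    return worst_case_wait(timeout_s, retries) > congestion_s


# Hypothetical settings, before and after the change.
before = worst_case_wait(timeout_s=2, retries=3)   # 8 seconds
after = worst_case_wait(timeout_s=5, retries=5)    # 30 seconds
print(before, after)

# A hypothetical 20-second burst of congestion at the remote site.
print(rides_through(20, timeout_s=2, retries=3))   # False
print(rides_through(20, timeout_s=5, retries=5))   # True
```

The trade-off is that a device that is truly offline takes correspondingly longer to be reported as lost.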
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 02:24 AM
Increasing the timeout settings seems to be working.
(CID:129406769)
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 02:24 AM
Great! It must have been traffic related at that time.
(CID:129406776)
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 12:41 AM
That's what I'm thinking. Thanks for your help.
(CID:129406784)
Posted: 2020-07-04 03:11 PM . Last Modified: 2024-04-05 12:41 AM
Where do I change that setting to increase the timeout?
(CID:137110005)
Posted: 2020-07-04 03:12 PM . Last Modified: 2024-04-05 12:41 AM
Hi Jim,
Under DCE's main menu, go to Device > SNMP Device Communications Settings > Device Scan Settings.
Choose the device or devices you want to edit, then click the Edit Device Scan Settings button.
Timeouts and retries are listed there.
Steve
(CID:137110022)
Posted: 2020-07-04 03:12 PM . Last Modified: 2023-10-22 09:32 PM
This question is closed for comments. You're welcome to start a new topic if you have further comments on this issue.