EcoStruxure IT forum
A support forum for Data Center Operation, Data Center Expert, and EcoStruxure IT product users to share knowledge on installation, configuration, and general product use.
Posted: 2022-08-31 12:59 AM
Hi Team,
We have two UPSs in DCE 7.8.1. Both are the same model and run the same firmware version, but one of them reports a network communication error in DCE.
We ran an snmpwalk from the DCE web page and it worked. We also logged into the UPS web page to check the configuration, and both UPSs have the same configuration parameters.
We do not know why one fails when it responds to both ping and snmpwalk from the DCE server web page. We tried deleting the device and discovering it again, but the error reappears.
Thanks in advance
Posted: 2022-09-08 11:51 AM
Is the UPS using SNMPv1 or SNMPv3? If it's v3, I would try rebooting DCE and see if it comes back online. Has this ever worked? If not, maybe try deleting and re-adding it? Is there anything else different between the two UPSs? Are they on the same firmware?
Posted: 2022-09-09 05:17 AM
Hi,
Both UPSs have the same firmware version and the same SNMPv3 settings, and both were previously working in DCE. I have rebooted DCE, but the second UPS still shows as offline, even though I can log in to its web page and an SNMP walk from DCE to this UPS works perfectly.
Regards
Posted: 2022-09-09 09:03 AM
Can you delete it and re-add it via SNMPv3, or even better SNMPv1, and see if it works?
Although rare, it's possible the devices aren't correctly handling the EngineBoots & EngineTime values in SNMPv3. This would mean that snmpwalk works because it's a one-off: there's no persistent communication between the two SNMP engines, so it can't get out of sync if there's nothing to keep in sync. But because the DCE & the end-devices both have persistent engines, they need to agree on these values.
Unfortunately, if this is the issue, there's little we can do to resolve it on the DCE side: rejecting these values is the correct, defined behaviour per RFC 3414, section 3.2 (7).
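To illustrate why a persistent engine rejects these values while a one-off snmpwalk does not, here is a minimal Python sketch of the timeliness check a non-authoritative engine performs under RFC 3414, section 3.2 (7)(b). This is not DCE's implementation: `CachedEngineState` and `accept_message` are made-up names, and the local clock is passed in as a plain number instead of ticking on its own.

```python
from dataclasses import dataclass

TIME_WINDOW_SECONDS = 150        # time window defined by RFC 3414
MAX_ENGINE_BOOTS = 2147483647    # latched value, nothing is accepted at this point


@dataclass
class CachedEngineState:
    """Hypothetical record of what the manager remembers about one agent."""
    boots: int                 # local notion of the agent's snmpEngineBoots
    time: int                  # local notion of the agent's snmpEngineTime
    latest_received_time: int  # highest engine time seen for these boots


def accept_message(cache: CachedEngineState, msg_boots: int, msg_time: int) -> bool:
    """Apply the update rule, then the time window check, to one message."""
    # Update the local notion only if the agent's clock moved forward.
    if msg_boots > cache.boots or (
        msg_boots == cache.boots and msg_time > cache.latest_received_time
    ):
        cache.boots = msg_boots
        cache.time = msg_time
        cache.latest_received_time = msg_time

    # The message is outside the time window if any of these hold.
    if cache.boots == MAX_ENGINE_BOOTS:
        return False
    if msg_boots < cache.boots:
        return False
    if msg_boots == cache.boots and msg_time < cache.time - TIME_WINDOW_SECONDS:
        return False
    return True


# Agent reboots correctly: boots goes 5 -> 6, time restarts near zero.
good = accept_message(CachedEngineState(5, 1000, 1000), msg_boots=6, msg_time=3)

# Agent reboots but keeps boots at 5: its time now looks far in the past.
bad = accept_message(CachedEngineState(5, 1000, 1000), msg_boots=5, msg_time=3)

# One-off tool such as snmpwalk: no old state, it learns (5, 3) fresh each run.
one_off = accept_message(CachedEngineState(5, 3, 3), msg_boots=5, msg_time=4)
```

In this sketch `good` and `one_off` are accepted and `bad` is rejected, which matches the symptom in this thread: the long-running engine keeps refusing the device while a fresh snmpwalk succeeds.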
To confirm this behaviour:
- Remove the affected devices from the DCE.
- Reboot the DCE.
- Discover the devices again. If the issue is as I believe, this will appear to behave correctly (for now).
- Once you're happy the devices are communicating correctly, reboot one of the devices.
- The DCE should lose communication with this device, and fail to re-establish communication even though the device appears to be available.
- Removing and re-discovering the device will have no effect until the next time the DCE is rebooted (which is the position most of the devices find themselves in currently).
The above was a response from our engineering team the last time I encountered this issue.