EcoStruxure IT forum
Schneider Electric support forum about installation and configuration for DCIM including EcoStruxure IT Expert, IT Advisor, Data Center Expert, and NetBotz
Posted: 2020-07-04 02:04 PM . Last Modified: 2024-04-05 02:45 AM
I updated to 7.5.0 about 10 days ago, and from one day to the next multiple APC devices started reporting a communication lost error:
As you can see, the devices are different models and types, but not all of them alerted; some AP8853 units are offline while another is online. Devices of the same model run the same firmware (6.x.y).
I have another remote site that shows the same problem.
It is definitely not a network issue, because I did the following troubleshooting in the DCE web interface:
I also checked the server log, but it says nothing more about the lost communication.
I rebooted the server and all the devices with lost communication came back! But after 2-3 hours the devices went offline again (at the moment the server uptime is 5 hours).
Anyway, the default scan interval is 5 minutes globally.
Can you please help me? I am afraid I will lose more and more devices from monitoring.
(CID:128754698)
Posted: 2020-07-04 02:04 PM . Last Modified: 2024-04-05 02:45 AM
Hi Mate, how many devices are monitored?
What is the configuration (CPU, RAM, hardware) of your DCE server?
Ed
(CID:128754721)
Posted: 2020-07-04 02:04 PM . Last Modified: 2024-04-05 02:45 AM
Hi Mate
Not sure why my comment was downvoted, but I'm happy to discuss it if you or someone else would care to.
Sounds like you've been given good advice regarding the firmware version and SNMPv3. How are things progressing now?
The only other DCE server performance area that I've had issues with in the past is I/O performance, measured in IOPS. Poor I/O performance can adversely affect DCE in larger sites (e.g. 2000+ nodes), and it might also be a factor at your site, although judging by the VM config, the latency, and the throughput, it isn't the issue here.
Regards
Ed
(CID:128755261)
Posted: 2020-07-04 02:05 PM . Last Modified: 2024-04-05 02:45 AM
Hi Ed,
I did not downvote your answer.
Yes, SNMPv3 and firmware version 6.4.6 caused the problem. However, it was pretty strange that these devices worked well for more than 2 weeks without any problem.
By the way, I read through the mentioned K-base article FA305661, and I had devices where the following statement was true:
And these devices also reported communication lost.
Anyway, thank you for the quick help and suggestions; I will pay more attention to firmware versions. But I have had bad experiences with upgrades, because some devices (NMC) can get into a hung state during the upgrade and only a physical reset solves the problem. I monitor devices at more than 40 remote sites where remote hands support is not always available.
Regards,
Mate
(CID:128757840)
Posted: 2020-07-04 02:05 PM . Last Modified: 2024-04-05 02:44 AM
I have 653 monitored physical devices and 89 virtual sensors (742 devices in total).
DCE is running on VMware: 4 vCPU, 16 GB RAM, with 18+255 GB HDD. The repository shows a size of 242.3 GB, of which 6.6 GB is in use. Here is a screenshot from VMware resources:
(CID:128757845)
Posted: 2020-07-04 02:05 PM . Last Modified: 2024-04-05 02:44 AM
Dear Mate Fekete,
As you can see the devices are different models, types but not all alerted; some AP8853 are offline but another is online. The same models have the same firmware (6.x.y).
...
...the device scan option which matched with device settings (using SNMPv3).
From your screenshot: what version of the firmware are you using on your UPS and rPDU? To solve the SNMPv3 problem for the UPS, firmware v6.5.0 is required, and for rPDU, firmware v6.5.2 is required.
With respect.
(CID:128754737)
Posted: 2020-07-04 02:05 PM . Last Modified: 2024-04-05 02:44 AM
Dear Mate Fekete,
I started my answer to you from the firmware versions because v6.4.6 with active use of SNMPv3 should be avoided in every possible way. This has already been discussed many times here on EcoStruxure IT. See Steven Marchetti's answer on this issue:
...So the reason it was not added to the DCE firmware catalog is an issue with SNMP version 3. I thought about this after my previous answer but wanted to verify and get the k-base number before saying anything. K-base FA305661 outlines the issue which could cause problems with people using SNMP version 3. So as not to break their systems, it was decided not to add it to the DCE catalog.
Therefore, if no firmware higher than v6.4.6 is available for the device, the solution is to downgrade the firmware to the recommended version, v6.4.4. This is always possible using a special utility.
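For anyone who needs to triage a long device list, the rule discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not an official tool: the function names, the device-type labels ("ups", "rpdu"), and the `latest_available` parameter are hypothetical, and it assumes that only v6.4.6 exactly is affected when SNMPv3 is in use, with v6.5.0 (UPS) and v6.5.2 (rPDU) as the fixed versions mentioned above.

```python
from typing import Tuple

# Values taken from this thread; everything else here is a hypothetical sketch.
AFFECTED = (6, 4, 6)          # firmware reported as problematic with SNMPv3
FALLBACK = (6, 4, 4)          # recommended downgrade target
FIXED_MIN = {"ups": (6, 5, 0), "rpdu": (6, 5, 2)}  # fixed versions per device type


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '6.4.6' or 'v6.4.6' into a comparable tuple such as (6, 4, 6)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def snmpv3_advice(device_type: str, firmware: str,
                  uses_snmpv3: bool, latest_available: str) -> str:
    """Suggest an action for one device, following the rule from the thread."""
    # Assumption: only v6.4.6 combined with SNMPv3 needs attention.
    if not uses_snmpv3 or parse_version(firmware) != AFFECTED:
        return "no action"
    fixed = FIXED_MIN.get(device_type.lower())
    # Upgrade when a fixed release exists for this model, otherwise downgrade.
    if fixed is not None and parse_version(latest_available) >= fixed:
        return f"upgrade to v{'.'.join(map(str, fixed))} or later"
    return f"downgrade to v{'.'.join(map(str, FALLBACK))}"


if __name__ == "__main__":
    # Example: an rPDU whose latest published firmware is still 6.4.6.
    print(snmpv3_advice("rpdu", "6.4.6", True, "6.4.6"))
```

The latest available firmware is passed in explicitly because it differs per model, so the sketch does not try to guess it.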
If there are more questions, please ask.
With respect.
(CID:128754821)
Posted: 2020-07-04 02:05 PM . Last Modified: 2024-04-05 02:44 AM
Thank you. I updated the firmware and it solved the problem on some devices, but not on all of them. Here is another screenshot:
The latest firmware for AP7822B is 6.4.6.
As you can see, it is pretty strange: the PDUs with 6.4.4 are working properly, while those with 6.4.6 have failed, but not all of them!
(CID:128757851)
Posted: 2020-07-04 02:05 PM . Last Modified: 2023-10-22 01:11 AM
This question is closed for comments. You're welcome to start a new topic if you have further comments on this issue.