EcoStruxure IT forum
Schneider Electric support forum about installation and configuration for DCIM including EcoStruxure IT Expert, IT Advisor, Data Center Expert, and NetBotz
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
Is anyone experiencing DCE VM (colo) edition intermittently losing communications with sensors and/or meters? The error "lost communications" with some PDUs appears every couple of nights around 1 a.m. The DCE instance is running on a VMware platform which is connected to a different VLAN from the sensors/meters. Does DCE need to be on the same VLAN to stop this happening - or are there any other suggestions?
(CID:93192917)
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
I had the same issue, here are the steps which I took to resolve it.
All devices have the IP Addresses set statically
I removed all the devices from DNS and added them manually with static IP addresses.
I have also excluded the IP addresses from DHCP (the devices have them set manually, but I think DHCP was trying to issue them, hence losing connection).
From my understanding, DCE doesn't have to be on the same VLAN, as long as the VLAN it is on can ping the NetBotz and sensor pods.
Hope this helps.
(CID:93192941)
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
Hi Mike,
Without having network packet captures during the time of the lost comms, it's tough to know exactly what the reason would be. If this seems to be a specific subset of PDUs (and they're SNMP), you can try to up the timeout or retries on these devices and see if that addresses the problem. I would start with timeout since we already have 3 retries by default.
The setting is under Device -> SNMP Device Communication Settings -> Device Scan Settings.
If they're Modbus PDUs, there is also a timeout setting available under the "Device" menu.
The fact that they're on a separate VLAN should not be a problem, especially since you've already discovered them and are getting sensor data 99% of the time. If you were not able to discover them initially then I'd say you'd need to look more closely at the network configuration.
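To put the timeout and retry suggestion in numbers, here is a minimal sketch of how long a single poll could wait before a device is declared unreachable. It assumes each attempt uses the same fixed timeout and that retries are sent back to back with no backoff; whether DCE behaves exactly this way is not confirmed in this thread. The function name and the 2-second timeout are hypothetical; only the 3 retries come from the post above.

```python
def worst_case_wait(timeout_s: float, retries: int) -> float:
    """Seconds spent before giving up on a device.

    Assumption: one initial attempt plus `retries` further attempts,
    each waiting the full `timeout_s` with no backoff in between.
    """
    attempts = 1 + retries
    return timeout_s * attempts


# Hypothetical 2-second timeout with the 3 retries mentioned above.
print(worst_case_wait(2.0, 3))
```

Under these assumptions, raising the timeout lengthens every attempt, while raising retries only adds attempts, which is one reason to try the timeout first.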
(CID:93192965)
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
Thanks Martin & Scott - we'll look into both suggestions. Appreciate the responses.
(CID:93193044)
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
(CID:93193242)
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
(CID:93193262)
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
(CID:93193263)
Posted: 2020-07-02 09:21 AM . Last Modified: 2024-04-10 01:15 AM
Has anyone made any progress on this problem? We have several hundred monitored devices (Rack PDUs (both APC and other vendors), APC RDUs, NetBotz 450, and some environmental monitors by IMCI). We frequently get "Communication Lost" alerts from random devices.
I did try modifying the retries and timeouts, but this didn't appear to resolve the problem. I've also run continuous pings to the devices, and the devices respond to the pings 100% of the time.
Thanks
(CID:118000823)
Posted: 2020-07-02 09:22 AM . Last Modified: 2024-04-10 01:15 AM
You can add me to the list of customers experiencing comms loss. It happens for (usually) one poll and then recovers on the next. Setting the timeout or poll rate differently does not affect the problem (it only affects how often you see it if you slow the poll down). We have had Schneider look at it, but they only fiddle with the poll interval and timeout, which does not solve the problem.
The alerts arise because DCE alarms on the first occurrence of comms loss, and while we don't know what the underlying cause is, we do know it will happen.
What we have asked for is a "State for Time" setting to be added to the Communication Status threshold as an enhancement. But in the meantime you may need to get creative.
My 2 cents.
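The "State for Time" idea above can be illustrated outside DCE, for example when post-processing exported poll results. This is only a sketch of the debouncing logic, not a DCE feature or API: the function name, the `(timestamp, comms_ok)` input format, and the hold period are all hypothetical.

```python
from typing import Iterable, List, Tuple


def debounced_alarms(polls: Iterable[Tuple[float, bool]], hold_s: float) -> List[float]:
    """Return timestamps at which an alarm would be raised.

    polls:  (timestamp_seconds, comms_ok) pairs in time order (assumed format).
    hold_s: how long comms must stay lost before an alarm is raised.
    """
    alarms: List[float] = []
    lost_since = None  # time of the first failed poll in the current run
    alarmed = False    # whether this run of failures has already alarmed
    for ts, ok in polls:
        if ok:
            # Recovery resets the run, so a single missed poll never alarms
            # when hold_s is longer than one poll interval.
            lost_since = None
            alarmed = False
            continue
        if lost_since is None:
            lost_since = ts
        if not alarmed and ts - lost_since >= hold_s:
            alarms.append(ts)
            alarmed = True
    return alarms


# One-poll blip at t=60 is ignored; the sustained loss from t=180 alarms at t=300.
polls = [(0, True), (60, False), (120, True), (180, False),
         (240, False), (300, False), (360, True)]
print(debounced_alarms(polls, hold_s=120))
```

With `hold_s=0` the same data alarms on the first failed poll of each run, which matches the behaviour described in the post.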
(CID:96043083)
Posted: 2020-07-02 09:22 AM . Last Modified: 2024-04-10 01:15 AM
Out of interest, does the DCE server perform automatic backups? We have suspected issues with comms being lost on a few devices at about the time the backups are taking place - by default 01:00 Hrs I think...
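One way to check this suspicion is to see how many "communication lost" events fall inside a window starting at the backup time. The sketch below assumes you already have the alarm timestamps as Python `datetime` objects (for instance, copied from the alarm history); the function name, the 30-minute window, and the sample timestamps are hypothetical, and 01:00 is only the start time guessed at in the post above.

```python
from datetime import datetime, time
from typing import Iterable, List


def in_daily_window(events: Iterable[datetime],
                    start: time = time(1, 0),
                    minutes: int = 30) -> List[bool]:
    """For each event, report whether its time of day falls within
    `minutes` after `start` (wrapping past midnight if needed)."""
    start_min = start.hour * 60 + start.minute
    flags = []
    for e in events:
        # Minutes elapsed since the window start, modulo one day.
        offset = (e.hour * 60 + e.minute - start_min) % (24 * 60)
        flags.append(offset < minutes)
    return flags


# Made-up example timestamps, not real alarm data.
events = [
    datetime(2020, 7, 1, 1, 5),
    datetime(2020, 7, 1, 14, 0),
    datetime(2020, 7, 2, 1, 20),
    datetime(2020, 7, 3, 0, 55),
]
flags = in_daily_window(events)
print(sum(flags), "of", len(flags), "events fall in the window")
```

If most events land inside the window, the backup schedule is worth investigating; if they are spread across the day, the cause is probably elsewhere.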
(CID:96043104)
Posted: 2020-07-02 09:22 AM . Last Modified: 2024-04-10 01:14 AM
We have about 300 APC devices monitored with DCE, and since we added the last 50 devices we are also getting frequent "communication lost" alerts from random devices. We tried increasing the poll rate, timeout delay, and retries, but without any luck. I'd be interested in getting a solution for that.
(CID:124526893)
Posted: 2020-07-02 09:22 AM . Last Modified: 2023-10-31 11:21 PM
This question is closed for comments. You're welcome to start a new topic if you have further comments on this issue.