Schneider Electric support forum about installation and configuration for DCIM including EcoStruxure IT Expert, IT Advisor, Data Center Expert, and NetBotz
Posted: 2020-08-11 08:06 AM
DCE offline alert
I have multiple DCE VMs running simultaneously on separate clusters. I've had them for 5 years or more without much trouble. Recently a cluster has been failing, and we have lost a DCE server for a period of time (obviously all the monitoring is lost or not collected for this period too). The first time it happened, I asked the monitoring team to implement a simple ping check to see if the server is still operating. I was happy at this point.
But I had barely washed the sand out of my toes from holiday when I came back to work to learn that the cluster had failed again, and the ping check had not caught it because the VM was still responding to ping even though the service had crashed.
My question then is: How can I get an alert to tell me a DCE VM is no longer online or working? (There is something ironic about my monitoring service not being monitored itself.)
Here is the guff from the virtualisation team:
Many operating systems will still respond to a ping even if the service has crashed. This is a Linux behaviour: it makes the file system read-only but keeps the network stack up so that you can still connect to it.
This is where a service check is required. Something that tests for a running service, or a port that's only open when the service is running.
A port scan of the o/s might be able to pick up what is open, and that could be requested as a check.
The other alternative is to ask the vendor if there is something that can alert from within the o/s to advise there has been an issue.
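As a rough illustration of the kind of service check described above, here is a minimal Python sketch that tests whether a TCP port accepts connections, rather than relying on ping. The function names are made up for this example, and the host name and port in the usage note are assumptions (443 is used only because the DCE web interface is normally reached over HTTPS); check which ports your own DCE server actually has open before relying on it.

```python
import socket
from typing import Dict, List


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake, so unlike ping
        # it only succeeds if something is actually listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


def check_ports(host: str, ports: List[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Check several ports on one host and report each result."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

A monitoring job could call something like `check_ports("dce.example.internal", [443])` (hypothetical host name) every few minutes and raise an alert when any value is `False`. Note that an open port only shows that a listener is up; it does not prove that every DCE process is healthy.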
This is merely another way to get data from DCE. Assuming this answers a polling request, I still cannot say with 100% certainty that it will tell you that all functions of the system are operational. There are multiple processes that must be running for the entire system to be working. Still, it's one of the few ways we can verify that data is still being received and is available.
We do have an enhancement request already in the system for what you're asking. Basically a health check of the system. I am adding your post to that request as the more requests that we get on any issue, the more likely it is that engineering and product management will look into adding or fixing any features such as this.