EcoStruxure IT forum
Schneider Electric support forum about installation and configuration for DCIM including EcoStruxure IT Expert, IT Advisor, Data Center Expert, and NetBotz
Posted: 2020-07-03 05:57 AM . Last Modified: 2024-04-08 11:11 PM
The DCO 7.5 SP4 server was in a hung state after 50 days of uptime. I was able to connect only via the local console, where I saw the following:
The reboot command did not work either, so the only option left was a forced reboot from the hypervisor. After 30 minutes of server uptime, the Operation service is still not running. I have already uploaded the application logs; please tell me who can work on this and I will share them with that person.
(CID:108825069)
Posted: 2020-07-03 05:57 AM . Last Modified: 2024-04-08 11:11 PM
Hi Mate, Have you checked the memory settings? It seems your setup/system has run out of memory: it says "cannot allocate memory". It would be great if you could share your server logs (and the client application log) with me. Server logs can be downloaded from the server's Webmin interface (StruxureWare DC Operation > Download Log Files > "Download log files"). Kind regards
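As a quick first check for the "cannot allocate memory" symptom, the sketch below reads `/proc/meminfo`-style text and estimates how much memory is still available. It assumes a Linux server with the standard `/proc/meminfo` layout; the function names are made up for this example, and the fallback estimate (MemFree + Buffers + Cached) is only a rough approximation for kernels that do not report `MemAvailable`.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; counters such as HugePages_Total
    have no unit and are stored as plain integers.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def available_kb(info):
    """Return MemAvailable if the kernel reports it, else a rough estimate."""
    if "MemAvailable" in info:
        return info["MemAvailable"]
    # Assumption: free + buffers + page cache approximates reclaimable memory.
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)
```

On the server this could be run as `available_kb(parse_meminfo(open("/proc/meminfo").read()))` and compared against `MemTotal` over time.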
(CID:108825077)
Posted: 2020-07-03 05:57 AM . Last Modified: 2024-04-08 11:11 PM
Hi, I uploaded the logs and shared them with you. Regards, Mate
(CID:108825138)
Posted: 2020-07-03 05:57 AM . Last Modified: 2024-04-08 11:11 PM
Hi Mate, Many thanks for providing the logs - will get back to you as soon as possible, Kind regards
(CID:108825142)
Posted: 2020-07-03 05:57 AM . Last Modified: 2024-04-08 11:11 PM
Hi Mate,
According to the provided log files, there is a trace of a long startup that seems to be caused by the UCS integration. During startup, DCO 7.5 fetches device data for the connected UCS Managers. DCO is not completely started until the data is fetched and persisted (the log files indicate that it hangs on persisting the data). Scheduled jobs that perform similar tasks during operation do not seem to take several minutes.
To debug this, it might make sense to have a backup to test on.
Note: In DCO 8.0, data is not fetched during startup, so this should not prevent the server from starting.
Did the server start eventually?
Out of memory issue: programs seem to have used 7-8 GB more memory before the server was restarted than right after the restart. This could indicate a memory leak. We cannot deduce what is leaking from the server log files. A new set of log files would let us check whether memory is currently being leaked.
Also the file (jobs.log) created by the following command could help debugging the issue:
ps -faux > jobs.log
If the new set of log files shows that the server is still leaking, we might need an online session in order to debug.
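To make a file such as jobs.log easier to read, the sketch below lists the processes with the largest resident memory. It assumes the usual `ps aux` column layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND) with RSS reported in KiB; `top_by_rss` is a name made up for this example.

```python
def top_by_rss(ps_text, n=10):
    """Return the n largest processes as (rss_kib, pid, command) tuples."""
    rows = []
    for line in ps_text.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        # Skip the header, blank lines, and anything that is not a process row.
        if len(fields) < 11 or not fields[1].isdigit() or not fields[5].isdigit():
            continue
        rows.append((int(fields[5]), int(fields[1]), fields[10].strip()))
    rows.sort(reverse=True)  # largest RSS first
    return rows[:n]
```

Running it on two snapshots taken a few days apart, for example `top_by_rss(open("jobs.log").read())`, shows which process accounts for the growth.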
Kind regards
(CID:108825408)
Posted: 2020-07-03 05:57 AM . Last Modified: 2024-04-08 11:10 PM
Hi Jef,
Thank you for the investigation and feedback.
Yes, we have a couple of UCS infrastructures, each using a dedicated Manager. It would be welcome if I could use a UCS Central integration instead of individual UCS Managers 😀
Did the server start eventually? No, I did not apply a manual server start/reboot. It had 50 days of uptime before I faced the memory leak problem. One additional piece of information: the DCO virtual machine has 32 GB of allocated RAM and 4 vCPUs.
I know about 8.0; we are already testing it in our lab, and I am planning to upgrade this year.
I uploaded the "ps -faux" output; however, the file export did not work for some reason. You can also find the latest DCO backup on appbox.
Regards,
Mate
(CID:108825467)
Posted: 2020-07-03 05:58 AM . Last Modified: 2024-04-08 11:10 PM
Hi Mate, You are welcome & many thanks for the info. I will get the files and get back to you asap. Kind regards
(CID:108825468)
Posted: 2020-07-03 05:58 AM . Last Modified: 2024-04-08 11:10 PM
Do you have any update? Most users report performance issues in the DCO client, which becomes unresponsive for 30-40 seconds at a time and makes daily operation impossible. I am open to a WebEx session anytime for further investigation.
I have already uploaded one client log to the same store. It is full of the following errors:
Error (Connection lost, response code:null) ExternalWebServiceService.getCollectedMetricDataFor() @ http://
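To quantify how often such errors occur, the sketch below counts them per failing call in a client log. The pattern is derived only from the single error line quoted above; the rest of the DCO client log format is not known here, so treat the regular expression as an assumption to adjust. `count_errors` is a name made up for this example.

```python
import re
from collections import Counter

# Matches e.g. "Error (Connection lost, response code:null) Service.method()"
ERROR_RE = re.compile(r"Error \(([^,)]+), response code:([^)]*)\)\s+([\w.]+)\(\)")


def count_errors(log_text):
    """Count error occurrences keyed by (reason, failing call)."""
    counts = Counter()
    for match in ERROR_RE.finditer(log_text):
        reason, _code, call = match.groups()
        counts[(reason.strip(), call)] += 1
    return counts
```

Calling `count_errors(text).most_common(5)` on the log contents lists the most frequent failures first.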
(CID:108825655)
Posted: 2020-07-03 05:58 AM . Last Modified: 2024-04-08 11:10 PM
Hi Mate, Sorry for the delay. The setup contains a lot of UCS Manager integrations, and from the thread logs it seems the server is doing a lot of (Hibernate) flushing during startup, which can explain the long startup time. I had a brief discussion with our main developer: apparently the reason startup behaves differently from normal UCS Manager operation is that all the integrations are processed in one go. That way up to 10000 external items end up in the Hibernate first-level cache, which may result in a lot of flushing. This is not an issue in DCO 8.0, since each job does less flushing. So it is highly recommended to upgrade the product to the latest version (which contains many enhancements). Kind regards
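The flushing explanation can be illustrated with a toy model. This is not DCO or Hibernate code: it only assumes that each flush dirty-checks every entity currently held in the session, and the flush interval of 100 items and the per-job size of 1000 items are made-up numbers for illustration.

```python
def dirty_checks(total_items, session_size, flush_every):
    """Count entity dirty-checks in a toy model of a persistence session.

    Each item loaded stays managed until the session is cleared, and each
    flush is assumed to inspect every entity currently managed.
    """
    checks = 0
    managed = 0
    for i in range(1, total_items + 1):
        managed += 1
        if i % flush_every == 0:
            checks += managed  # a flush inspects all managed entities
        if managed == session_size:
            managed = 0  # session closed or cleared, cache emptied
    return checks


# All 10000 items in one session versus ten separate jobs of 1000 items each.
one_go = dirty_checks(10000, session_size=10000, flush_every=100)
per_job = dirty_checks(10000, session_size=1000, flush_every=100)
```

In this model the single large session performs several times more dirty-checks than the same work split into smaller jobs, which matches the direction of the explanation above, though not necessarily its magnitude.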
(CID:108825662)
Posted: 2020-07-03 05:58 AM . Last Modified: 2023-10-31 10:47 PM
This question is closed for comments. You're welcome to start a new topic if you have further comments on this issue.