APC UPS Data Center & Enterprise Solutions Forum
Schneider, APC support forum to share knowledge about installation and configuration for Data Center and Business Power UPSs, Accessories, Software, Services.
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:56 AM
Hi everyone,
I have four APC AP8853 units and I graph their statistics (amps, temperature, humidity) with Cacti.
Almost everything is fine, but I get random gaps in the graphs (roughly every 6, 7 or 12 hours).
I'm using SNMP to get all the data. I checked the Cacti logs and it looks like there is a problem with the SNMP connection to the APC. Each gap lasts 5 to 15 minutes on the graphs.
Here is a log message:
CMDPHP: Poller[0] WARNING: SNMP GetNext Timeout for Host:'10.11.33.4', and OID:'.1.3'
I checked the host where Cacti runs and all the scripts, and everything is fine; there were no network connection problems at the times the gaps appeared. I think it is something with the APC, but I don't know what.
Can anyone help me with this, please?
=============================================
Details:
Cacti -> http://www.cacti.net/
APC -> http://www.apc.com/products/family/?id=136#
APC info:
Model Number: AP8853
Hardware Revision: 02
Manufacture Date: 05/05/2011
Network Management Card
Model Number: AP9537
Serial Number: ZA1118012982
Hardware Revision: 05
Manufacture Date: 05/02/2011
Application Module
Name: rpdu2g
Version: v5.1.1
Date: Dec 13 2010
Time: 17:23:54
APC OS (AOS)
Name: aos
Version: v5.1.4
Date: Jun 17 2010
Time: 13:46:21
APC Boot Monitor
Name: bootmon
Version: v1.0.2
Date: Jan 21 2010
Time: 13:35:57
I'm using SNMP v1 with the public community.
Cacti collects data every 5 minutes.
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:56 AM
sure! i'll wait for your post.
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:56 AM
• does it start responding again on its own, or do you need to do something to bring it back?
• when it doesn't respond, are you able to test if it is still pingable? meaning, is just SNMP timing out or is the entire network connection dropping?
• can you post any event logs from the same times SNMP is not responding? (you can use these instructions to get config.ini/data.txt/event.txt - http://nam-en.apc.com/app/answers/detail/a_id/9321 )
feel free to remove any info you don't want to be public.
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:56 AM
Yes, SNMP comes back to normal by itself after the connection is lost. I don't have to do anything.
OK, I will monitor ping for the next 48 hours and collect all possible logs.
After that I will reply if I find something, or I will post some logs and more info about the issue.
Many thanks
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:56 AM
Hi,
I got all the logs. By the way, I don't have to use FTP; I can get the logs straight from the APC web panel, and they are the same. I couldn't get any debug.txt logs from FTP.
So this is what I got:
The gap was from 14:05 to 14:15.
Nothing in the event logs.
I know that I also lost ping at the same time that I couldn't get data from SNMP.
Data logs:
12/05/2011 14:11:37 1.72 1.77 4755.6 21.0 28 8.1 8.3 5.3 2.8 5.4 2.9
12/05/2011 14:08:37 1.70 1.75 4755.6 21.1 29 8.0 8.2 5.3 2.7 5.5 2.8
12/05/2011 14:05:37 1.72 1.82 4755.5 20.5 29 8.1 8.5 5.3 2.7 5.6 2.9
12/05/2011 14:02:37 1.72 1.75 4755.4 20.3 29 8.1 8.2 5.3 2.7 5.5 2.8
12/05/2011 13:59:37 1.72 1.77 4755.3 20.7 28 8.1 8.3 5.3 2.7 5.4 2.9
12/05/2011 13:56:37 1.70 1.77 4755.2 21.3 29 8.0 8.3 5.3 2.7 5.4 2.9
So it looks like the problem is with the network or with the APC's network card.
All the APCs are connected to different Cisco switches, so I don't think the switches are the problem.
And the gaps appear on all the APCs (not at the same time).
Any other clues?
Thanks
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:56 AM
i asked for the logs in .txt format via FTP since the web interface does not format them well and you cannot see the full log in one view. the data log you provided does not really help, unfortunately; the event.txt is what we'd need to see. i can verify the loss of ping from the event.txt and also check the events before and after it.
please provide all of the logs in .txt format so we can do a better analysis for you and recommend the next steps.
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:56 AM
hm ok. the lack of an event log is disappointing since i think it would be a big help. going by the data log, which kept logging as you showed before, the PDU and the NMC were online, at least on their own, regardless of the network connection. are there any other devices on the same switch, for instance, so we can see if they had the same issue around the same time? do you have other PDUs that you are monitoring? i am hesitant at this point to blame the PDU since there is no evidence that the NMC went offline, but the event.txt would help back up that statement if i saw it. it would show whether the NMC warmstarted, which it does as a failsafe if it does not see any network traffic, in case something like this happens.
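The point about the data log can be checked mechanically. Below is a minimal Python sketch that reads the timestamps from the data log posted earlier in the thread and looks for holes in the logging. The MM/DD/YYYY date format and the 3-minute logging interval are assumptions inferred from the posted lines, and `find_logging_gaps` is a hypothetical helper name, not part of any APC tool.

```python
from datetime import datetime, timedelta

# Timestamps copied from the data log posted earlier in this thread.
# Assumption: dates are MM/DD/YYYY.
DATA_LOG_TIMES = [
    "12/05/2011 14:11:37",
    "12/05/2011 14:08:37",
    "12/05/2011 14:05:37",
    "12/05/2011 14:02:37",
    "12/05/2011 13:59:37",
    "12/05/2011 13:56:37",
]


def find_logging_gaps(stamps, expected=timedelta(minutes=3)):
    """Return (previous, next) pairs that are further apart than `expected`."""
    times = sorted(datetime.strptime(s, "%m/%d/%Y %H:%M:%S") for s in stamps)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > expected]


# An empty list means the NMC kept logging across the 14:05-14:15 window.
print("gaps in data log:", find_logging_gaps(DATA_LOG_TIMES))
```

No gaps in the posted lines would suggest the NMC itself stayed up while Cacti could not reach it, which is the same conclusion drawn above.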
are we able to monitor this, or set up a ping to the device every 5-10 seconds for instance, to try and catch it next time and verify whether SNMP and/or the entire network connection is dropping? i guess that's my biggest concern, since you said that Cacti did not go offline at that time and you ruled out a network issue. how did you rule out a network issue? what is your SNMP timeout set to?
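One way to run the suggested check is a small probe that tests ping and SNMP side by side every few seconds. The Python sketch below is only an illustration under stated assumptions: it shells out to the Linux `ping` command and to net-snmp's `snmpget`, both of which must be installed, and the function names (`ping_ok`, `snmp_ok`, `classify`, `watch`) are hypothetical.

```python
import subprocess
import time
from datetime import datetime


def ping_ok(host):
    """Send one ICMP echo with a 1 s timeout (Linux iputils flags assumed)."""
    cmd = ["ping", "-c", "1", "-W", "1", host]
    return subprocess.run(cmd, capture_output=True).returncode == 0


def snmp_ok(host, community="public"):
    """One SNMP v1 GET of sysUpTime via net-snmp's snmpget, 1 s, no retries."""
    cmd = ["snmpget", "-v1", "-c", community, "-t", "1", "-r", "0",
           host, "1.3.6.1.2.1.1.3.0"]
    return subprocess.run(cmd, capture_output=True).returncode == 0


def classify(ping, snmp):
    """Turn the two probe results into a short diagnosis."""
    if ping and snmp:
        return "ok"
    if ping:
        return "snmp-only failure"
    if snmp:
        return "icmp-only failure"
    return "network unreachable"


def watch(host, interval=5):
    """Log one timestamped diagnosis every `interval` seconds until stopped."""
    while True:
        state = classify(ping_ok(host), snmp_ok(host))
        print(datetime.now().isoformat(timespec="seconds"), host, state)
        time.sleep(interval)
```

Calling `watch("10.11.33.4")` (the host from the Cacti warning) and leaving it running across a gap would show whether only SNMP stops answering or ping drops as well.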
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:55 AM
Hi,
Yes, I have 4 PDUs, each connected to a separate switch (all the same Cisco), and the problem is the same on all of them.
On each switch I have about 14 other servers connected, constantly monitored by Nagios.
In Cacti I have SNMP Timeout = 500 ms and Ping Timeout Value = 400 (the Cacti defaults).
Cacti itself can't have gone offline, because then I would see gaps on all graphs, not only the ones for the PDUs.
But ....
6 hours ago I changed SNMP Timeout to 900 ms and Ping Timeout to 900, and Ping Timeout Value = 2 (was 1).
So far no gaps, but I need at least 2 days to check that.
Maybe it will be fine.
I will let you know whether the problem is gone or comes back. Just give me 2 days.
Many thanks for all the advice so far.
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:55 AM
Hi,
Yes, I can confirm everything is fine: no gaps on any of the PDUs since I changed the SNMP timeout to 900 ms. It is currently set to 2000 ms, just in case. It looks like I found the solution with your help.
There was random high ping latency in the network, which caused SNMP timeouts in Cacti (the timeout was set to 500 ms).
Latency was sometimes as high as 800 ms. That's why everything was working but the graphs had gaps: the polls timed out.
So many thanks for the help 🙂
and Happy Christmas
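For anyone who finds this thread later, the arithmetic behind the fix can be sketched in a few lines of Python. The 500, 900 and 2000 ms timeouts and the 800 ms latency peak come from the posts above; the other latency samples are made up for illustration, and the model is a simplification that treats a poll as lost whenever the round trip reaches the timeout, ignoring retries.

```python
# Hypothetical round-trip times in ms; only the 800 ms peak is from the thread.
SAMPLE_LATENCIES_MS = [20, 35, 800, 25, 640, 30]


def timed_out(latencies_ms, timeout_ms):
    """Return the samples that would exceed the given SNMP timeout."""
    return [ms for ms in latencies_ms if ms >= timeout_ms]


# Compare the original timeout with the two values tried afterwards.
for timeout in (500, 900, 2000):
    lost = timed_out(SAMPLE_LATENCIES_MS, timeout)
    print(f"timeout {timeout} ms: {len(lost)} poll(s) lost {lost}")
```

With a 5-minute poll interval, each lost poll leaves a hole of about 5 minutes in the graph, which fits the 5 to 15 minute gaps described at the start of the thread.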
Posted: 2021-06-30 01:37 AM . Last Modified: 2024-03-11 12:55 AM
no problem and same to you!