APC UPS Data Center & Enterprise Solutions Forum
Schneider, APC support forum to share knowledge about installation and configuration for Data Center and Business Power UPSs, Accessories, Software, Services.
Posted: 2025-01-03 07:38 AM
I have an APC Smart-UPS SRT 5000 which is monitored using SNMPv3 via the UPS Network Management Card 3 using firmware 2.5.3.3. Our monitoring solution is throwing alerts for "Battery lifetime is not okay" for some battery cartridges. This trigger is from the MIB:
name: '{#BATTERY_PACK}.{#CARTRIDGE_INDEX}: Battery lifetime is not okay'
opdata: 'Current bit set: {ITEM.LASTVALUE1}'
priority: WARNING
description: |
The battery cartridge health.
bit 0 Battery lifetime okay
bit 1 Battery lifetime near end, order replacement cartridge
bit 2 Battery lifetime exceeded, replace battery
bit 3 Battery lifetime near end acknowledged, order replacement cartridge
bit 4 Battery lifetime exceeded acknowledged, replace battery
bit 5 Battery measured lifetime near end, order replacement cartridge
bit 6 Battery measured lifetime near end acknowledged, order replacement cartridge
This appears to be because some of the OIDs return the wrong value:
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.1.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.1.1.2 = STRING: "0000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.2.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.2.1.2 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.3.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.3.1.2 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.4.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.4.1.2 = STRING: "0000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.5.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.5.1.2 = STRING: "1000000000000000"
The GUI shows that the status of Battery 1, Cartridge 2 and of Battery 4, Cartridge 2 is OK. So why are the SNMP OIDs reporting a 0 in the first bit (not okay) and a 0 in every other bit, without indicating any other fault?
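For anyone who wants to reproduce the check, here is a minimal Python sketch that parses the walk output above and lists the instances with no bits set at all. It assumes the first number after the base OID is the battery pack and the last is the cartridge index (that matches what the GUI calls them, but the MIB does not confirm it here), and the helper names `parse_walk` and `all_clear` are made up for illustration.

```python
import re

# Walk output copied verbatim from the post above.
WALK_OUTPUT = """\
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.1.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.1.1.2 = STRING: "0000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.2.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.2.1.2 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.3.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.3.1.2 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.4.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.4.1.2 = STRING: "0000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.5.1.1 = STRING: "1000000000000000"
iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7.5.1.2 = STRING: "1000000000000000"
"""

BASE_OID = "iso.3.6.1.4.1.318.1.1.1.2.3.10.2.1.7"
LINE_RE = re.compile(
    re.escape(BASE_OID) + r'\.(\d+)\.(\d+)\.(\d+) = STRING: "([01]+)"'
)


def parse_walk(text):
    """Map (pack, cartridge) to the returned bit string.

    Assumption: the first index is the pack and the third is the cartridge;
    the middle index is ignored.
    """
    statuses = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            pack, _middle, cartridge, bits = match.groups()
            statuses[(int(pack), int(cartridge))] = bits
    return statuses


def all_clear(statuses):
    """Return the instances whose bit string contains no '1' at all."""
    return sorted(key for key, bits in statuses.items() if "1" not in bits)


statuses = parse_walk(WALK_OUTPUT)
for pack, cartridge in all_clear(statuses):
    print(f"pack {pack}, cartridge {cartridge}: no bits set")
```

On the data above this reports pack 1, cartridge 2 and pack 4, cartridge 2, the same two cartridges the GUI shows as OK.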
Posted: 2025-01-12 02:55 AM
Bits are normally numbered starting from the LSB, i.e. bit 0 would be 00000001, bit 1 would be 00000010, etc. I'd assume that the high bit set in your example is an internal flag which isn't documented in the MIB.
Cf. ASN.1: "octet: A group of eight consecutive bits, numbered from bit 8 (the most significant bit) to bit 1 (the least significant bit)", although the usual zero- vs. one-based numbering confusion is present there as well.
Posted: 2025-01-13 11:56 PM
This is not the case. Which bit comes first is referred to as endianness (e.g. little/big endian), and if this were the issue, all of the values would be incorrect.
The MIB defines the status using the bits in this word. If a 'flag' is raised then that bit's value is set to 1, otherwise it is 0. When the first bit is set, the battery lifetime is okay; when it is not set, the battery lifetime is not okay. If the second bit is set, the battery lifetime is near its end and therefore the lifetime is not okay. In that case I would expect the value reported to be 0100000000000000. These bits are used as binary flags and not as a binary representation of a decimal value.
This is a case of the UPS providing the incorrect value for some cartridges. If the first bit is a zero (i.e. not okay) then another bit should be set.
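To illustrate how I am reading the value, here is a rough Python sketch that treats the left-most character of the string as bit 0 and maps each set bit to the description from the opening post. That bit order is my reading of the MIB rather than something the MIB text spells out, and the `FLAGS` list and `decode` function are names I made up for the example.

```python
# Flag descriptions copied from the trigger description in the opening post,
# in bit order (index 0 = bit 0).
FLAGS = [
    "Battery lifetime okay",
    "Battery lifetime near end, order replacement cartridge",
    "Battery lifetime exceeded, replace battery",
    "Battery lifetime near end acknowledged, order replacement cartridge",
    "Battery lifetime exceeded acknowledged, replace battery",
    "Battery measured lifetime near end, order replacement cartridge",
    "Battery measured lifetime near end acknowledged, order replacement cartridge",
]


def decode(bits):
    """Return the descriptions of all set flags.

    Assumption: the left-most character of the string is bit 0.
    Characters beyond the seven documented bits are ignored.
    """
    return [name for char, name in zip(bits, FLAGS) if char == "1"]


print(decode("1000000000000000"))  # healthy cartridge
print(decode("0100000000000000"))  # what I would expect for "near end"
print(decode("0000000000000000"))  # what the UPS returns for some cartridges
```

The last call returns an empty list: the okay flag is clear but no fault flag is raised either, which is the inconsistency I am describing.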
Posted: 2025-01-15 02:36 AM
Please don't be condescending. Endianness applies to bytes, not bits. Without more data, it is ambiguous how APC chose to number the bits. One could argue that if their numbering starts with the LSB, there is at least consistency in the results, even if it's incorrect. Which is more likely, incorrect and inconsistent, or simply incorrect?
Even the IETF is inconsistent on bit order. DOCS-IETF-QOS-MIB says "If bit 0 is the least significant bit of the least significant (4th) octet, and if bit number is increased with significance" and BGP4-MIB says "the MSB of the first octet refers to bit 0."
Many of the MIBs I've looked at provide guidance in the OID's description. APC does not.
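To make the two numbering conventions concrete, here is a small Python sketch that reads the same 16-bit value both ways. The function names and the 16-bit width are assumptions for illustration only; nothing here comes from the APC MIB.

```python
def bit_msb_first(value, n, width=16):
    """Bit n when bit 0 is the most significant bit (BGP4-MIB style)."""
    return (value >> (width - 1 - n)) & 1


def bit_lsb_first(value, n):
    """Bit n when bit 0 is the least significant bit."""
    return (value >> n) & 1


# The string from the walk, read as a 16-bit binary number.
value = int("1000000000000000", 2)

print(hex(value))
print("MSB-first, bit 0: ", bit_msb_first(value, 0))
print("LSB-first, bit 0: ", bit_lsb_first(value, 0))
print("LSB-first, bit 15:", bit_lsb_first(value, 15))
```

Under the MSB-first reading bit 0 is set; under the LSB-first reading bit 0 is clear and bit 15 is the one that is set. Same data, two different answers, which is why a MIB description that states the convention matters.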
Posted: 2025-01-16 01:16 AM
The value returned is a string containing a binary word (two bytes) where the individual bit positions are binary flags. Endianness is absolutely relevant: the position of the bits is critical to their meaning.
It's not ambiguous. The definition in the MIB is in the opening post and makes clear which way the data is represented. The data provided by the UPS over SNMP is just incorrect and doesn't match the current state as seen in the GUI.
Posted: 2025-01-16 06:04 AM
Hi,
I can't offer any help with the batteries, but just to clarify what we're looking at: in the string representation of a bitfield, bit 0 is the left-most bit. Yes, the numerical representation is usually the opposite way around. So it is the 'lifetime okay' flag that @KDrain is frequently seeing as 0, without a fault raised to match it.
(As this is only happening on the second CartridgeIndex for each Pack, I suspect the dates for the second index are all showing 01/01/2000 so the 'okay' flag is zero because 25 years old is not okay. But it's not raising as a fault because the first index is the real one. I can't offer a solution, just a hint at what's not adding up there.)
Posted: 2025-01-16 10:34 PM
@KDrain wrote:
The value returned is a string containing a binary word (two bytes) where the individual bit positions are binary flags. Endianness is absolutely relevant: the position of the bits is critical to their meaning.
It's not ambiguous. The definition in the MIB is in the opening post and makes clear which way the data is represented. The data provided by the UPS over SNMP is just incorrect and doesn't match the current state as seen in the GUI.
"Endian" doesn't mean what you think it means.
Posted: 2025-01-17 12:58 AM
Thanks Shaun. Your explanation regarding the date makes sense. I've checked a few UPSs and they all appear to suffer from the same issue on the same cartridges. Do you know of an easier way to escalate this to APC support, or is the only option a phone call to the support centre?