2010-01-20

Bug in SNMP traps sent from vCenter



I encountered this today.

I was trying to configure the alarm for datastore usage.

This is a relatively simple operation; there is even a built-in alarm for it in vCenter.


So, as a test, I configured the alarm to fire and to send me both a notification email and an SNMP trap.


I received the email:

Target: ESX2_NFS
Previous Status: Gray
New Status: Red
Alarm Definition:
([Yellow metric Is above 75%; Red metric Is above 85%])
Current values for metric/state:
Metric Storage Space Actually Used = 86%
Description:
Alarm 'Datastore usage on disk' on ESX2_NFS changed from Gray to Red

and the SNMP trap as well:

SNMPv2-MIB::snmpTrapOID.0 SNMPv2-SMI::enterprises.6876.4.3.0.201
SNMPv2-SMI::enterprises.6876.4.3.301 "host"
SNMPv2-SMI::enterprises.6876.4.3.302 ""
SNMPv2-SMI::enterprises.6876.4.3.303 ""
SNMPv2-SMI::enterprises.6876.4.3.304 "Gray"
SNMPv2-SMI::enterprises.6876.4.3.305 "Red"
SNMPv2-SMI::enterprises.6876.4.3.306 "Datastore usage on disk - Metric Storage Space Actually Used = 86%"
SNMP-COMMUNITY-MIB::snmpTrapAddress.0 10.xx.xx.63
SNMP-COMMUNITY-MIB::snmpTrapCommunity.0 "public"

The trap gave all the data except one small but important detail - THE NAME OF THE DATASTORE!
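If you want your monitoring script to catch this case, here is a minimal Python sketch that parses the trap text shown above and flags a trap whose name fields are empty. The function names are hypothetical, and the meaning I give to the .302 and .303 varbinds (host name and VM name) is my assumption from reading the MIB, so check it against your own copy of VMWARE-VC-EVENT-MIB.mib.

```python
# Sketch only: parse snmptrapd-style "OID value" lines into a dict and
# detect traps that do not say which object raised the alarm.

TRAP_TEXT = """
SNMPv2-MIB::snmpTrapOID.0 SNMPv2-SMI::enterprises.6876.4.3.0.201
SNMPv2-SMI::enterprises.6876.4.3.301 "host"
SNMPv2-SMI::enterprises.6876.4.3.302 ""
SNMPv2-SMI::enterprises.6876.4.3.303 ""
SNMPv2-SMI::enterprises.6876.4.3.304 "Gray"
SNMPv2-SMI::enterprises.6876.4.3.305 "Red"
SNMPv2-SMI::enterprises.6876.4.3.306 "Datastore usage on disk - Metric Storage Space Actually Used = 86%"
SNMP-COMMUNITY-MIB::snmpTrapAddress.0 10.xx.xx.63
SNMP-COMMUNITY-MIB::snmpTrapCommunity.0 "public"
"""

PREFIX = "SNMPv2-SMI::enterprises.6876.4.3."


def parse_varbinds(text):
    """Split each non-empty line at the first space into OID and value."""
    varbinds = {}
    for line in text.strip().splitlines():
        oid, _, value = line.strip().partition(" ")
        # Drop the surrounding double quotes from string values.
        varbinds[oid] = value.strip().strip('"')
    return varbinds


def target_name_missing(varbinds):
    """True when both name fields are empty.

    Assumption: .302 carries the host name and .303 the VM name.
    """
    names = [varbinds.get(PREFIX + suffix, "") for suffix in ("302", "303")]
    return not any(names)


if __name__ == "__main__":
    vb = parse_varbinds(TRAP_TEXT)
    print("target name missing:", target_name_missing(vb))
```

For the trap above, both name fields come back empty, so the script reports that the target name is missing even though the alarm text and the status change are present.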

After I opened a call with VMware today, this was the response I received:

The symptom you are explaining is a known issue and has been resolved by engineering.

This fix will be issued in VC4.0 update 2. There is no release date for this update as of yet, so I would suggest keeping up to date with the VMware website product downloads as it will be released there.

So if you are not getting the correct data from your SNMP traps - this could be the reason.

2 comments:

Sebastian Kayser said...

Thanks for the info! To second what you are seeing, I just stumbled upon this too and learned that the MIB definitions delivered with VC4.0U1 indeed only carry host/VM details for SNMP traps, i.e. they are not designed for other types of objects (curious people can have a peek at VMWARE-VC-EVENT-MIB.mib and watch out for the definition of vpxdAlarm).

The most recent MIBs available for download (http://kb.vmware.com/kb/1013445) obsolete vpxdAlarm (sigh) and introduce a new trap named vpxdAlarmInfo, which is defined a bit differently and knows about "other" objects (which could be datastores). So we just gotta wait for a vCenter which actually sends out such traps and then adjust our monitoring scripts to deal with the new trap types ... Hooray ...

Sune said...

They've fixed it in the latest vCenter update (U2), and simultaneously introduced a new bug:

Before upgrade:

"Datastore usage on disk - Metric Storage Space Actually Used = 87%"

After upgrade:

"Datastore usage on disk - Metric Storage Space Actually Used = 6.045KB"

I.e., the percentage has been replaced with a bogus number. 100% seems to be translated into 10.000 KB, so it's likely a formatting error.
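
Until that is fixed, a monitoring script could try to recover the percentage from the KB string. The Python sketch below is only a guess: it assumes the scaling is linear and that 10.000 KB corresponds to 100%, which is the single data point mentioned above. The function name and the `kb_at_full` parameter are hypothetical.

```python
import re


def kb_string_to_percent(value, kb_at_full=10.0):
    """Guess the usage percentage from a '... = 6.045KB' alarm string.

    Assumption: the value scales linearly and kb_at_full KB means 100%.
    Returns None when the string does not end in a KB figure.
    """
    match = re.search(r"=\s*([0-9]+(?:\.[0-9]+)?)\s*KB", value)
    if match is None:
        return None
    return float(match.group(1)) / kb_at_full * 100.0


if __name__ == "__main__":
    text = "Datastore usage on disk - Metric Storage Space Actually Used = 6.045KB"
    print(kb_string_to_percent(text))
```

Under that assumption, 6.045KB would correspond to roughly 60%. A string that still carries a percentage, as before the upgrade, returns None, so the script can fall back to its old parsing.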