Hardware alarms are generated on the AMC348 I/O card to indicate the status of local components on the card.

The alarms are accessed from the alarm dashboard in the Web UI. The alarm dashboard indicates system status by showing the number of critical, major, and minor alarms generated by system events. Open an individual alarm in the dashboard to determine its origin. To access the alarms, see section Alarm Dashboard Web User Interface in the DSC Alarms Guide. To view details of individual alarms, see section View Alarms.

The hardware alarms are generated by hardware sensors on the AMC348. There are two types of hardware sensors on the DSC 8000:

  • Threshold sensors
  • Discrete sensors

Threshold sensors and the corresponding SNMP threshold alarms are described in section AMC348 Hardware Threshold Sensors and Alarms.

Discrete sensors and the corresponding SNMP alarms are described in section AMC348 Hardware Discrete Sensors and Alarms.

 

Note

Alarm details provided in the Web UI include the hardware sensor's Portal ID and IPMI number, which are dynamically assigned. The ID assignments change when a card is removed from, or a new card is inserted in, the DSC 8000 chassis; therefore, do not use them to identify a hardware alarm.

AMC348 Hardware Threshold Sensors and Alarms

All threshold sensors available on the AMC348 I/O card generate alarms. Threshold sensors are monitored by the MMC (sensor IDs are local to the MMC) and trigger an SNMP alarm when a threshold sensor event occurs. Threshold sensor events consist of voltage or temperature values crossing a pre-defined threshold level.

The following table describes the hardware sensors on the AMC348 I/O cards.

AMC348 Hardware Threshold Sensors available in HWMON

 
| IPMI Sensor Name | Alias in HWMON | Description | Units | SNMP Alarms | Alarm Event |
| --- | --- | --- | --- | --- | --- |
| 12V | 12V | 12V voltage sensor on the AMC348 card. | Volts | 6346 - 6351 | Alarms generated on critical, major, and minor lower threshold crossings. |
| MGMT 3.3V | MGMT 3.3V | 3.3V Management (IPMI) power sensor on the AMC348 card. | Volts | 6346 - 6351 | Alarms generated on critical, major, and minor lower threshold crossings. |
| Temp | Intake Temp | A temperature sensor on the AMC348 that monitors the airflow temperature at the point of coolest air intake on the card. | Degrees C | 6348 - 6351 | Alarms generated on critical and major lower threshold crossings (minor crossings excluded in 15.0). |
| Temp | Mid Board Temp | A temperature sensor on the AMC348 that monitors the airflow temperature at the mid-point on the card. | Degrees C | 6348 - 6351 | Alarms generated on critical and major lower threshold crossings (minor crossings excluded in 15.0). |

 

Interpreting Threshold Sensor Events

Threshold sensor event severity levels are defined as follows:

  • Noncritical: This is a warning that one or more operating specifications are somewhat out of normal range, but there is not yet a problem to be addressed. Noncritical events are for information only, and they do not indicate that the AMC348 is outside of operating limits. In general, no action is required. However, in certain contexts, system/shelf management software may initiate preventive action. For example, if several cards in a shelf report upper noncritical temperature events, the shelf manager may decide to increase fan speed.

  • Critical: The AMC348 is operating within specified tolerances, but one or more specifications are getting close to the critical thresholds. Critical events indicate that the card is still within its operating limits, but it is close to exceeding one of those limits. Possible action in this case is to closely monitor the alarming sensor and take more aggressive action if it approaches the nonrecoverable threshold.

  • Nonrecoverable: The AMC348 is no longer operating within specified tolerances. Nonrecoverable events indicate that the card may no longer be functioning because it is now outside of its operating limits. Action is likely required or has already been taken by the local hardware/firmware. For example, a processor may shut itself down because its maximum die temperature was exceeded, or a shelf manager may deactivate the card because the processor is too hot.

Voltage Sensor Threshold Levels

A threshold sensor triggers an SNMP alarm when a pre-defined voltage threshold level is crossed by the monitored voltage.

The following table shows voltage threshold levels for voltage sensors on AMC348 cards.

AMC348 Voltage Sensor Threshold Levels

 
| Sensor Name | Lower Non-Recoverable Threshold (LNR) (Alarm 6350) | Lower Critical Threshold (LC) (Alarm 6348) | Lower Non-Critical Threshold (LNC) (Alarm 6346) | Upper Non-Critical Threshold (UNC) (Alarm 6344) | Upper Critical Threshold (UC) (Alarm 6342) | Upper Non-Recoverable Threshold (UNR) (Alarm 6340) |
| --- | --- | --- | --- | --- | --- | --- |
| MGMT 3.3V | NA | 3.0V | 3.068V | 3.533V | 3.6V | 3.8V |
| 12V | 9.0V | 10.0V | 10.8V | 13.2V | 14.0V | 15.0V |
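The lookup implied by the voltage threshold table can be sketched in a few lines of Python. This is an illustration only, not DSC or MMC code: the function name `crossed_threshold` and the dictionary names are hypothetical, and treating a reading exactly equal to a threshold as a crossing is an assumption. The sketch reports which threshold a reading has crossed and the alarm number from the matching column header. Whether a given crossing actually raises an SNMP alarm depends on the sensor table earlier in this section.

```python
from typing import Optional, Tuple

# Threshold values copied from the table above; None marks an "NA" entry.
VOLTAGE_THRESHOLDS = {
    "MGMT 3.3V": {"LNR": None, "LC": 3.0, "LNC": 3.068,
                  "UNC": 3.533, "UC": 3.6, "UNR": 3.8},
    "12V": {"LNR": 9.0, "LC": 10.0, "LNC": 10.8,
            "UNC": 13.2, "UC": 14.0, "UNR": 15.0},
}

# Alarm numbers copied from the column headers of the table above.
ALARM_FOR_THRESHOLD = {"LNR": 6350, "LC": 6348, "LNC": 6346,
                       "UNC": 6344, "UC": 6342, "UNR": 6340}


def crossed_threshold(sensor: str, volts: float) -> Optional[Tuple[str, int]]:
    """Return (threshold name, alarm number) for the most severe
    threshold crossed, or None if the reading is in the normal range.

    Assumption: a reading equal to a threshold counts as a crossing.
    """
    limits = VOLTAGE_THRESHOLDS[sensor]
    # Lower thresholds, most severe first.
    for name in ("LNR", "LC", "LNC"):
        limit = limits[name]
        if limit is not None and volts <= limit:
            return name, ALARM_FOR_THRESHOLD[name]
    # Upper thresholds, most severe first.
    for name in ("UNR", "UC", "UNC"):
        limit = limits[name]
        if limit is not None and volts >= limit:
            return name, ALARM_FOR_THRESHOLD[name]
    return None
```

For example, `crossed_threshold("12V", 10.5)` returns `("LNC", 6346)` because 10.5V is below the 10.8V Lower Non-Critical threshold but above the 10.0V Lower Critical threshold.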

 

Temperature Sensor Threshold Levels

A threshold sensor triggers an SNMP alarm when a monitored temperature value crosses a pre-defined temperature threshold level.

The following table shows temperature threshold levels for the temperature sensors on AMC348 I/O cards.

AMC348 Temperature Sensor Threshold Levels

 
| Sensor Name | Units | Lower Non-Recoverable Threshold (LNR) (Alarm 6350) | Lower Critical Threshold (LC) (Alarm 6348) | Lower Non-Critical Threshold (LNC) (Alarm 6346) | Upper Non-Critical Threshold (UNC) (Alarm 6344) | Upper Critical Threshold (UC) (Alarm 6342) | Upper Non-Recoverable Threshold (UNR) (Alarm 6340) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Intake Temp | Degrees C | NA | NA | NA | 60C | 80C | NA |
| Mid Board | Degrees C | NA | NA | NA | 60C | 80C | NA |

Caution

Some versions of DSC software generate minor temperature alarms when the monitored temperature value crosses the Upper Non-Critical (UNC) temperature threshold. Minor temperature threshold crossing events are required for stable operation of the cooling sub-system, and the resulting alarms can be ignored.

To reduce the occurrence of minor alarms, a modified temperature monitoring function is introduced in DSC software Release 15.0. Temperature alarms are generated only when the monitored temperature value crosses the Upper Critical (UC) or Upper Non-Recoverable (UNR) temperature threshold, generating major and critical alarms, respectively.
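The difference between the earlier behavior and the Release 15.0 behavior can be illustrated with a short Python sketch. It is a model of the description above, not DSC code: the function name `temperature_alarm_severity` and the `suppress_minor` flag are hypothetical, and a reading equal to a threshold is assumed to count as a crossing. Because the UNR threshold is NA for both sensors in the table, only the UC crossing produces an alarm when minor alarms are suppressed.

```python
from typing import Optional

# Upper thresholds in degrees C, copied from the table above.
# None marks an "NA" entry.
TEMP_THRESHOLDS_C = {
    "Intake Temp": {"UNC": 60.0, "UC": 80.0, "UNR": None},
    "Mid Board": {"UNC": 60.0, "UC": 80.0, "UNR": None},
}

# Severity per threshold, as described in the caution above.
SEVERITY = {"UNC": "minor", "UC": "major", "UNR": "critical"}


def temperature_alarm_severity(sensor: str, temp_c: float,
                               suppress_minor: bool = True) -> Optional[str]:
    """Return the alarm severity for a reading, or None if no alarm.

    suppress_minor=True models the Release 15.0 behavior, in which
    UNC crossings no longer generate a minor alarm.
    """
    limits = TEMP_THRESHOLDS_C[sensor]
    # Check the most severe threshold first.
    for name in ("UNR", "UC", "UNC"):
        limit = limits[name]
        if limit is None or temp_c < limit:
            continue
        if name == "UNC" and suppress_minor:
            return None
        return SEVERITY[name]
    return None
```

With these assumptions, a 65 degree reading on Intake Temp returns `"minor"` when `suppress_minor=False` and `None` under the Release 15.0 behavior, while an 85 degree reading returns `"major"` in both cases.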

SNMP Threshold Sensor Alarms

The following table lists SNMP alarms that are registered when a threshold sensor event occurs on the AMC348 I/O card.

AMC348 SNMP Threshold Sensor Alarms

 

AMC348 Hardware Discrete Sensors and Alarms

Discrete sensors return values of 'on' and 'off' or 'true' and 'false'. Each entity in the system has a 'Version Change' sensor that reports the entity's FRU state. These states are described in the Intelligent Platform Management Interface Specification, Second Generation (v2.0).

The following table describes the discrete hardware sensors on the AMC348 I/O card.

AMC348 Discrete Hardware Sensors available in HWMON

 
| IPMI Sensor Name | Alias in HWMON | Description | SNMP Alarms | Alarm Event |
| --- | --- | --- | --- | --- |
| Watchdog | BMC Watchdog | Internal watchdog timer fired. | N/A | No alarm is generated on this event. |
| Hot-swap | Hot Swap_AMC348 (see note 1) | The hot-swap sensor for the AMC348 I/O card. | 6310 slotExtracted; 6311 slotInserted | On card extraction, an alarm is raised on detection of a transition to state M0 (not-installed), M1 (inactive), or M7 (lost-communication). On card insertion, an alarm is raised (clearing) on detection of a transition to state M4 (active). |
| Power Good | POWER GOOD | A PICMG boolean sensor (false = 1, true = 2) that indicates whether or not the power subsystem on the AMC348 I/O card is healthy. | 6320 powerFaultAssert; 6321 powerFaultDeassert | powerFaultAssert: a transition to 'false' (Power Not Good) is detected. powerFaultDeassert: a transition to 'true' (Power Good) is detected. |
| Version | Version Change (see note 2) | This sensor reports the FRU state on the AMC348 I/O card (MMC/IPMC firmware is changed). The sensor returns a one-bit value assigned to eight possible FRU conditions. For example, if bit 0 is set, the condition defined by the value of 00h is present. | N/A | Informational sensor only. No alarms are generated. |

1 Each entity in the system has a hot-swap sensor that reports the entity's FRU state. These states are described in the PICMG® 3.0 AdvancedTCA® Base Specification. The sensor returns a one-bit value for each of the eight states, M0 - M7, as defined in the specification. For example, if bit 0 is set, the FRU is in state M0. Similarly, if bit 4 is set, the sensor returns a value of 16 (00010000b), which is the Normal (Active) state, M4.

The state values include:

[7] – 1b: FRU Operational State M7 = Communication Lost

[6] – 1b: FRU Operational State M6 = FRU Deactivation In Progress

[5] – 1b: FRU Operational State M5 = FRU Deactivation Request

[4] – 1b: FRU Operational State M4 = FRU Active

[3] – 1b: FRU Operational State M3 = FRU Activation in Progress

[2] – 1b: FRU Operational State M2 = FRU Activation Request

[1] – 1b: FRU Operational State M1 = FRU Inactive

[0] – 1b: FRU Operational State M0 = FRU Not Installed
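The bit-to-state mapping listed above can be decoded as shown in the following Python sketch. The function name `decode_hot_swap` and the dictionary name are hypothetical, and the sketch assumes the sensor value is available as an 8-bit integer; how that value is read from the MMC is outside its scope.

```python
from typing import List

# Bit position -> FRU operational state, copied from the list above.
FRU_STATES = {
    0: "M0 = FRU Not Installed",
    1: "M1 = FRU Inactive",
    2: "M2 = FRU Activation Request",
    3: "M3 = FRU Activation in Progress",
    4: "M4 = FRU Active",
    5: "M5 = FRU Deactivation Request",
    6: "M6 = FRU Deactivation In Progress",
    7: "M7 = Communication Lost",
}


def decode_hot_swap(value: int) -> List[str]:
    """Return the FRU state for every bit set in an 8-bit sensor value,
    lowest bit first."""
    return [FRU_STATES[bit] for bit in range(8) if value & (1 << bit)]
```

For example, `decode_hot_swap(16)` returns `["M4 = FRU Active"]`, because 16 is 00010000b and only bit 4 is set.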

2 Each entity in the system has a 'Version Change' sensor that reports the entity's FRU state. These states are described in the Intelligent Platform Management Interface Specification, Second Generation (v2.0). The sensor returns a one-bit value assigned to eight possible FRU conditions. For example, if bit 0 is set, then the condition defined by the value of 00h is present. The eight conditions include the following:

00h: hardware change detected (informational). This offset does not indicate whether the hardware change was successful or not, only that a change occurred.

01h: firmware or software change detected (informational).

02h: hardware incompatibility detected

03h: firmware or software incompatibility detected

04h: entity has an invalid or unsupported hardware version

05h: entity contains an invalid or unsupported firmware or software version

06h: hardware change detected on entity was successful (de-assertion event = unsuccessful)

07h: software or firmware change detected on entity was successful (de-assertion event = unsuccessful)
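As a rough illustration, the offsets above can be turned into a lookup that lists the conditions present in a sensor value. The names `VERSION_CHANGE_CONDITIONS` and `decode_version_change` are hypothetical, and the sketch assumes that bit n of an 8-bit integer corresponds to offset n, as in the bit 0 / 00h example in the note.

```python
from typing import List, Tuple

# Offset -> condition, copied from the list above.
VERSION_CHANGE_CONDITIONS = {
    0x00: "hardware change detected (informational)",
    0x01: "firmware or software change detected (informational)",
    0x02: "hardware incompatibility detected",
    0x03: "firmware or software incompatibility detected",
    0x04: "entity has an invalid or unsupported hardware version",
    0x05: "entity contains an invalid or unsupported firmware or software version",
    0x06: "hardware change detected on entity was successful",
    0x07: "software or firmware change detected on entity was successful",
}


def decode_version_change(value: int) -> List[Tuple[int, str]]:
    """Return (offset, condition) for every bit set in the sensor value.

    Assumption: bit n of the value corresponds to offset n.
    """
    return [
        (offset, text)
        for offset, text in VERSION_CHANGE_CONDITIONS.items()
        if value & (1 << offset)
    ]
```

Because the Version Change sensor is informational only, a decoder like this is useful for logging or inspection rather than alarm handling.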

SNMP Discrete Sensor Alarms

The following table lists SNMP alarms that are registered when a discrete sensor event occurs on the AMC348 I/O card.

AMC348 SNMP Discrete Sensor Alarms

| SNMP Alarm Number | Alarm Name | Clearing Alarm |
| --- | --- | --- |
| 6310 | slotExtracted | 6311 |
| 6311 | slotInserted | N/A |
| 6320 | powerFaultAssert | 6321 |
| 6321 | powerFaultDeassert | N/A |
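The raise-and-clear pairing in the table can be modeled as follows. This is a minimal sketch of how a monitoring script might track which discrete alarms are still outstanding, not a description of how the DSC handles alarms internally. The function name `active_alarms` is hypothetical, and the input is assumed to be a time-ordered list of SNMP alarm numbers.

```python
from typing import Iterable, List

# Raised alarm -> clearing alarm, copied from the table above.
CLEARED_BY = {6310: 6311, 6320: 6321}


def active_alarms(events: Iterable[int]) -> List[int]:
    """Return the raised alarms that have not yet been cleared,
    given alarm numbers in the order they were received."""
    # Invert the mapping so a clearing alarm points at what it clears.
    clears = {clear: raised for raised, clear in CLEARED_BY.items()}
    active = set()
    for number in events:
        if number in CLEARED_BY:
            active.add(number)
        elif number in clears:
            active.discard(clears[number])
    return sorted(active)
```

For example, the sequence 6310, 6320, 6311 leaves only 6320 (powerFaultAssert) active, because 6311 (slotInserted) clears 6310 (slotExtracted).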

 
