| 1 | Introduction |
| 1.1 | Alarm Description |
| 1.2 | Prerequisites |
| 2 | Compatibility |
| 3 | Procedure |
| 3.1 | Actions for All Causes |
1 Introduction
This instruction concerns alarm handling.
1.1 Alarm Description
The alarm is a primary or secondary alarm. It is issued by the ClusterMonitor component of Core Middleware (Core MW), using the NTF service.
The alarm is issued in any of the following situations:
- The CLM has lost contact with a node and has been unable to reestablish contact within the set node alarm time-out (default is 15 minutes).
- After a cluster start, the CLM has been unable to establish contact with a node within the set node alarm time-out (default is 15 minutes).
The node alarm time-out can be set using the cmw-node-alarm-timeout command.
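The time-out condition described above can be sketched as follows. This is a minimal illustration in Python, not the Core MW implementation: the function name `node_alarm_due` and its parameters are hypothetical, and only the 15-minute default is taken from this instruction.

```python
from datetime import datetime, timedelta

# Default node alarm time-out stated in this instruction.
DEFAULT_NODE_ALARM_TIMEOUT = timedelta(minutes=15)


def node_alarm_due(
    waiting_since: datetime,
    now: datetime,
    timeout: timedelta = DEFAULT_NODE_ALARM_TIMEOUT,
) -> bool:
    """Return True when a node has been out of contact for more than `timeout`.

    `waiting_since` is either the time contact with the node was lost or,
    after a cluster start, the time the cluster started waiting for the node.
    Hypothetical helper: it models the condition only, not Core MW itself.
    """
    return now - waiting_since > timeout
```

For example, with the default time-out, a node that has been out of contact for 16 minutes satisfies the condition, while one that has been out of contact for 10 minutes does not.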
The possible alarm causes and fault locations are explained in Table 1.
| Alarm Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| Failure of communication with the reported node. | A node has lost contact with the remaining cluster members for more than the set node alarm time-out (default is 15 minutes). | Faulty physical Ethernet device. | Physical Ethernet interface. | The capacity or redundancy of the cluster is reduced. |
| Failure of communication with the reported node. | A node has lost contact with the remaining cluster members for more than the set node alarm time-out (default is 15 minutes). | The operating system and middleware layer are incorrectly configured. | Incorrect High Availability (HA) configuration for the cluster. | The capacity or redundancy of the cluster is reduced. |
Note: The alarm can appear as a result of an upgrade.
The alarm attributes are listed and explained in Table 2.
| Attribute Name | Attribute Value |
|---|---|
| Major Type | 193 |
| Minor Type | 849346561 |
| Source | One of the following: |
| Specific Problem | |
| Event Type | processingErrorAlarm (4) |
| Probable Cause | x736UnspecifiedReason (418) |
| Additional Text | |
| Perceived Severity | critical (3) |
(1) The "Additional Text" field can contain additional data.
Note: The UUID of the affected node is included in the alarm if it can be retrieved from the system. Depending on the system configuration, the UUID (if present) is either appended to the "Additional Text" field or can be fetched from the "Additional Info" field.
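As an illustration of how the attributes in Table 2 might be used to recognize this alarm in a list of alarm records, the following Python sketch matches on Major Type and Minor Type. The `AlarmRecord` class, its field names, and the helper functions are hypothetical and simplified; they do not represent the NTF service API or any Core MW interface.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Values taken from Table 2 of this instruction.
MAJOR_TYPE = 193
MINOR_TYPE = 849346561


@dataclass(frozen=True)
class AlarmRecord:
    """Hypothetical, simplified alarm record; real records carry more fields."""

    major_type: int
    minor_type: int
    source: str
    perceived_severity: str
    additional_text: str = ""


def matches_this_alarm(alarm: AlarmRecord) -> bool:
    """Return True when the record carries this alarm's Major and Minor Type."""
    return alarm.major_type == MAJOR_TYPE and alarm.minor_type == MINOR_TYPE


def select_this_alarm(alarms: Iterable[AlarmRecord]) -> List[AlarmRecord]:
    """Keep only the records that match this alarm, preserving their order."""
    return [alarm for alarm in alarms if matches_this_alarm(alarm)]
```

Matching on both type values, rather than on the text fields, avoids depending on the exact wording of "Additional Text", which can vary with the system configuration.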
1.2 Prerequisites
Before starting this procedure, ensure that the following documents have been read:
- System Safety Information
- Personal Health and Safety Information
- COM Management Guide
2 Compatibility
Compatible with Core MW 3.6 and later.
3 Procedure
This section describes the procedure to follow when this alarm is received.
3.1 Actions for All Causes
Do the following:
- Consult the next level of maintenance support to analyze why the node does not join the cluster.
- When the cause has been identified, take the relevant corrective measures. The alarm is then cleared automatically.
- Confirm that the alarm has ceased.
If the alarm remains, consult the next level of maintenance support. Further actions are outside the scope of this instruction.
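The final check, confirming that the alarm has ceased, could be automated along the following lines. This Python sketch assumes a caller-supplied function `fetch_active_alarms` that returns the currently active alarms as (major type, minor type, source) tuples. That function, the tuple layout, and the polling parameters are all hypothetical; this instruction does not define an interface for reading the alarm list.

```python
import time
from typing import Callable, Iterable, Tuple

# (major type, minor type, source); hypothetical layout for illustration.
ActiveAlarm = Tuple[int, int, str]

# Values taken from Table 2 of this instruction.
MAJOR_TYPE = 193
MINOR_TYPE = 849346561


def wait_until_cleared(
    fetch_active_alarms: Callable[[], Iterable[ActiveAlarm]],
    source: str,
    attempts: int = 10,
    interval_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the active alarms until this alarm is gone for `source`.

    Returns True as soon as the alarm is absent, or False if it is still
    present after `attempts` polls.
    """
    wanted = (MAJOR_TYPE, MINOR_TYPE, source)
    for attempt in range(attempts):
        if wanted not in set(fetch_active_alarms()):
            return True
        # Wait between polls, but not after the last one.
        if attempt < attempts - 1:
            sleep(interval_seconds)
    return False
```

A return value of False corresponds to the case where the alarm remains and the next level of maintenance support is to be consulted.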
