1 Introduction
This document provides Fault Management (FM) information for the Ericsson Centralized User Database (CUDB).
1.1 Document Purpose and Scope
The purpose of this document is to list the application alarms and to describe the alarm management and the application alarm model of the CUDB system. The infrastructure alarms of the system are outside the scope of this document, but are briefly summarized in Infrastructure Alarms.
1.3 Typographic Conventions
Typographic Conventions can be found in the following document:
2 Alarms in the CUDB System
An alarm in the CUDB system is a message sent through the CUDB SNMP interface that informs the operator about a problem in the node that requires attention. The CUDB system can raise two types of alarms: application alarms and infrastructure alarms.
For more information on the management and alarm model of the application alarms, see Application Alarms. For a brief summary of the management of infrastructure alarms, see Infrastructure Alarms.
2.1 Application Alarms
This section describes the management and alarm model of the CUDB application alarms.
CUDB application components (including the software component, that is the operating system and Core Middleware) send their alarms through the Ericsson SNMP Agent (ESA). The alarms sent to ESA are formatted according to the ERICSSON-SNF-ALARM-MIB and are sent to the Network Management System (NMS). For more information, refer to ESA Fault Management.
2.1.1 Alarm Format and Description
An alarm model is a logical description of the CUDB system described in a tree structure. The alarm model in Figure 1 illustrates the hierarchy of the CUDB application components which are able to raise alarms.
The alarm format used by the CUDB application components is defined by the ERICSSON-SNF-ALARM-MIB. For more information, refer to ESA Fault Management. The standard location for this file, as well as for other MIB files used by the CUDB FM interface, is defined by ESA in ESA Setup and Configuration.
Table 1 provides relevant information about the alarms. The <Severity>, <Alarm Event Type> and <Probable Cause> values follow the International Telecommunication Union (ITU) X.733 recommendation. For details, refer to Information Technology - Open Systems Interconnection - Systems Management Alarm Reporting Function ITU-T X.733. CCITT Rec. X.733 (1992 E).
| Attribute Name | Attribute Value |
|---|---|
| Auto Cease | |
| Module | The CUDB application component that raises the alarm. See Figure 1 under cudb(169). |
| Error Code | Assigned number identifying the alarm within a certain module (application component). |
| Timestamp First | Date and time when the alarm was raised for the first time. |
| Repeated Counter | Number of times the alarm has been raised. |
| Timestamp Last | Date and time of the most recent alarm raise. |
| Resource ID | An identifier of the alarming resource. The Object Identifier (OID) derived from the alarm model is used as the base for this identifier. |
| Alarm Model Description | A short description of the event. |
| Alarm Active Description | A dynamic text with a detailed description of the event. |
| ITU Alarm Event Type | A text that describes the type of the selected event. For more information, refer to Information Technology - Open Systems Interconnection - Systems Management Alarm Reporting Function ITU-T X.733. CCITT Rec. X.733 (1992 E). |
| ITU Alarm Probable Cause | A text that describes the probable cause of the event. For more information, refer to Information Technology - Open Systems Interconnection - Systems Management Alarm Reporting Function ITU-T X.733. CCITT Rec. X.733 (1992 E). |
| ITU Alarm Perceived Severity | The status of the event. For more information, refer to Information Technology - Open Systems Interconnection - Systems Management Alarm Reporting Function ITU-T X.733. CCITT Rec. X.733 (1992 E). |
| Originating Source IP | IP address of the node where the alarm was raised. |
| Sequence Number | Number that indicates the order in which alarms are raised. |
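The relationship between Timestamp First, Timestamp Last, and Repeated Counter in Table 1 can be illustrated with a minimal Python sketch. The `AlarmRecord` class, its field names, and the `raise_again` method are hypothetical and exist only for this illustration; they are not part of ESA or of the ERICSSON-SNF-ALARM-MIB. The sketch assumes that the counter starts at 1 on the first raise.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlarmRecord:
    """Illustrative model of some Table 1 attributes (not an ESA structure)."""
    module: str
    error_code: int
    resource_id: str
    originating_source_ip: str
    timestamp_first: datetime
    timestamp_last: datetime
    repeated_counter: int = 1  # assumption: the first raise counts as 1

    def raise_again(self, when: datetime) -> None:
        # A repeated raise keeps Timestamp First unchanged, moves
        # Timestamp Last forward, and increments Repeated Counter.
        self.timestamp_last = when
        self.repeated_counter += 1


first = datetime(2015, 9, 24, 13, 41, 48)
alarm = AlarmRecord("STORAGE-ENGINE", 8, ".1.3.6.1.4.1.193.169.1.2.8.100",
                    "10.143.56.132", first, first)
alarm.raise_again(datetime(2015, 9, 24, 13, 45, 0))
print(alarm.repeated_counter, alarm.timestamp_first, alarm.timestamp_last)
```

After the second raise, the counter is 2, Timestamp First still holds the original time, and Timestamp Last holds the time of the repeat.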
2.1.2 Alarm Management
The CUDB application does not provide specific management procedures for the mentioned alarms apart from the manual alarm clearing procedure provided by ESA, as described in Clearing Alarms.
CUDB alarm management may be needed in the NMS in certain cases. For example, if the CUDB System Controllers (SCs) restart simultaneously or a Linux® Distribution Extension (LDE) cluster reboots in the CUDB node, the CUDB alarm table on the NMS might be out of sync with the alarm table on the CUDB node. Depending on the NMS used, alarm synchronization may need to be triggered manually on the NMS.
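As a rough illustration of what "out of sync" means here, the following Python sketch compares two alarm tables represented as sets of alarm keys. The function name, the choice of key fields (module, error code, resource ID, originating source IP), and the second sample entry are assumptions made for this example; how an actual NMS stores and resynchronizes alarms depends on the NMS.

```python
def diff_alarm_tables(node_alarms, nms_alarms):
    """Compare two alarm tables given as collections of alarm keys.

    Returns (missing_on_nms, stale_on_nms):
      missing_on_nms: active on the CUDB node but unknown to the NMS
      stale_on_nms:   shown on the NMS but no longer active on the node
    """
    node_keys = set(node_alarms)
    nms_keys = set(nms_alarms)
    return node_keys - nms_keys, nms_keys - node_keys


# Keys are (module, error code, resource ID, originating source IP).
# The node entry reuses the values of Example 1; the NMS entry is made up.
node_table = {
    ("STORAGE-ENGINE", 8, ".1.3.6.1.4.1.193.169.1.2.8.100", "10.143.56.132"),
}
nms_table = {
    ("STORAGE-ENGINE", 8, ".1.3.6.1.4.1.193.169.1.2.8.101", "10.143.56.132"),
}

missing, stale = diff_alarm_tables(node_table, nms_table)
needs_resync = bool(missing or stale)
print(needs_resync)
```

A non-empty result in either direction suggests that alarm synchronization should be triggered on the NMS.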
2.1.2.1 Clearing Alarms
To clear an alarm, use the fmsendmessage command with the clear parameter:
# fmsendmessage -c <module> <errorcode> <resourceid> [<Alarm Active Description>] <originatingsourceip>
where:
Example 1 Clearing an Alarm
To clear the following non-autocease alarm:
---------------------------------------------------------------
Module : STORAGE-ENGINE
Error Code : 8
Resource ID : .1.3.6.1.4.1.193.169.1.2.8.100
Timestamp First : Thu Sep 24 13:41:48 CEST 2015
Repeated Counter : 1
Timestamp Last : Thu Sep 24 13:41:48 CEST 2015
Alarm Model Description : Memory usage at Warning level, Storage Engine.
Alarm Active Description : Storage Engine (DS-group #100): memory usage at Warning level.
ITU Alarm Event Type : 4
ITU Alarm Probable Cause : 151
ITU Alarm Perceived Severity : warning
Originating source IP : 10.143.56.132
Sequence Number : 554
---------------------------------------------------------------
this command is used:
# fmsendmessage -c STORAGE-ENGINE 8 .1.3.6.1.4.1.193.169.1.2.8.100 "Manually cleared by User" 10.143.56.132
| Note: |
Even though the help message of fmsendmessage suggests that the <sourceIP> parameter is optional, this is not the case in CUDB. If the parameter is not specified, the default value (the IP address of the blade or Virtual Machine (VM) on which the command is executed) is used and the existing alarm is not cleared. For instance, if the fmsendmessage -c STORAGE-ENGINE 8 .1.3.6.1.4.1.193.169.1.2.8.100 command is executed (where the <sourceIP> parameter is missing), the alarm shown in Example 1 is not cleared. |
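The clearing procedure can be scripted so that the originating source IP is never left out. The following Python sketch parses an alarm record in the layout of Example 1 and assembles the corresponding fmsendmessage -c command line. The helper names `parse_alarm_record` and `build_clear_command` are hypothetical, the parsing assumes the "Name : Value" layout shown in Example 1, and the script only builds the command string without executing anything.

```python
import shlex


def parse_alarm_record(text):
    """Parse 'Name : Value' lines into a dict with lower-cased names."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields


def build_clear_command(fields, description="Manually cleared by User"):
    """Return the clear command line, refusing to omit the source IP."""
    source_ip = fields.get("originating source ip")
    if not source_ip:
        # In CUDB the alarm is not cleared if the source IP is left out.
        raise ValueError("originating source IP is required")
    argv = ["fmsendmessage", "-c", fields["module"], fields["error code"],
            fields["resource id"], description, source_ip]
    return shlex.join(argv)  # quotes the description for the shell


record = """\
Module : STORAGE-ENGINE
Error Code : 8
Resource ID : .1.3.6.1.4.1.193.169.1.2.8.100
Originating source IP : 10.143.56.132
"""
command = build_clear_command(parse_alarm_record(record))
print(command)
```

For the record above, the resulting command matches the one in Example 1, except that the description is wrapped in single quotes instead of double quotes, which is equivalent for the shell.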
2.1.3 Alarm List
Alarms are grouped by application component, as shown in Figure 1. They are described in detail in the following subsections. The alarms in the tables are in alphabetical order.
2.1.3.1 Storage Engine
Storage Engine alarms are related to the Database Cluster. The alarm model for PLDB alarms is shown in Figure 2.
The alarm model for DS and general alarms is shown in Figure 3.
Table 2 shows the list of alarms related to Storage Engine.
| Alarm | Operating Instruction |
|---|---|
| Storage Engine, Automatic Handling of Network Isolation not Completed for DS | Refer to Storage Engine, Automatic Handling of Network Isolation not Completed for DS. |
| Storage Engine, Automatic Handling of Network Isolation not Completed for PLDB | Refer to Storage Engine, Automatic Handling of Network Isolation not Completed for PLDB. |
| Storage Engine, Backup Fault In DS | Refer to Storage Engine, Backup Fault In DS. |
| Storage Engine, Backup Fault In PLDB | Refer to Storage Engine, Backup Fault In PLDB. |
| Storage Engine, Backup Notification Failure To Provisioning Gateway | Refer to Storage Engine, Backup Notification Failure To Provisioning Gateway. |
| Storage Engine, Data Inconsistency between Replicas Found in DS, Major | Refer to Storage Engine, Data Inconsistency between Replicas Found in DS, Major. |
| Storage Engine, Data Inconsistency between Replicas Found in DS, Minor | Refer to Storage Engine, Data Inconsistency between Replicas Found in DS, Minor. |
| Storage Engine, Data Inconsistency between Replicas Found in PLDB, Major | Refer to Storage Engine, Data Inconsistency between Replicas Found in PLDB, Major. |
| Storage Engine, Data Inconsistency between Replicas Found in PLDB, Minor | Refer to Storage Engine, Data Inconsistency between Replicas Found in PLDB, Minor. |
| Storage Engine, Data Inconsistency between Replicas Repaired, DS | Refer to Storage Engine, Data Inconsistency between Replicas Repaired, DS. |
| Storage Engine, Data Inconsistency between Replicas Repaired, PLDB | Refer to Storage Engine, Data Inconsistency between Replicas Repaired, PLDB. |
| Storage Engine, Deleted Data Due to Reconciliation | Refer to Storage Engine, Deleted Data Due to Reconciliation. |
| Storage Engine, DS Cluster Down | Refer to Storage Engine, DS Cluster Down. |
| Storage Engine, DS Cluster in Maintenance Mode | Refer to Storage Engine, DS Cluster in Maintenance Mode. |
| Storage Engine, DS Cluster Node Down | Refer to Storage Engine, DS Cluster Node Down. |
| Storage Engine, Execution of Selective Replica Check Failed, DS, Major | Refer to Storage Engine, Execution of Selective Replica Check Failed, DS, Major. |
| Storage Engine, Execution of Selective Replica Check Failed, PLDB, Major | Refer to Storage Engine, Execution of Selective Replica Check Failed, PLDB, Major. |
| Storage Engine, High Load In DS | Refer to Storage Engine, High Load In DS. |
| Storage Engine, High Load In PLDB | Refer to Storage Engine, High Load In PLDB. |
| Storage Engine, Memory Usage Too High In DS, Full Threshold Reached | Refer to Storage Engine, Memory Usage Too High In DS, Full Threshold Reached. |
| Storage Engine, Memory Usage Too High In DS, Warning Threshold Reached | Refer to Storage Engine, Memory Usage Too High In DS, Warning Threshold Reached. |
| Storage Engine, Memory Usage Too High In PLDB, Major | Refer to Storage Engine, Memory Usage Too High In PLDB, Major. |
| Storage Engine, Memory Usage Too High In PLDB, Warning | Refer to Storage Engine, Memory Usage Too High In PLDB, Warning. |
| Storage Engine, No Available Master Replica for DS | Refer to Storage Engine, No Available Master Replica for DS. |
| Storage Engine, No Available Master Replica for PLDB | Refer to Storage Engine, No Available Master Replica for PLDB. |
| Storage Engine, Out Of Memory In DS | Refer to Storage Engine, Out Of Memory In DS. |
| Storage Engine, Out Of Memory In PLDB | Refer to Storage Engine, Out Of Memory In PLDB. |
| Storage Engine, Out Of Tablespace In DS | Refer to Storage Engine, Out Of Tablespace In DS. |
| Storage Engine, Out Of Tablespace In PLDB | Refer to Storage Engine, Out Of Tablespace In PLDB. |
| Storage Engine, PLDB Cluster Down | Refer to Storage Engine, PLDB Cluster Down. |
| Storage Engine, PLDB Cluster In Maintenance Mode | Refer to Storage Engine, PLDB Cluster In Maintenance Mode. |
| Storage Engine, PLDB Cluster Node Down | Refer to Storage Engine, PLDB Cluster Node Down. |
| Storage Engine, Potential Data Inconsistency between Replicas Found in DS | Refer to Storage Engine, Potential Data Inconsistency between Replicas Found in DS. |
| Storage Engine, Potential Data Inconsistency between Replicas Found in PLDB | Refer to Storage Engine, Potential Data Inconsistency between Replicas Found in PLDB. |
| Storage Engine, Replication Channels Down in DS | Refer to Storage Engine, Replication Channels Down in DS. |
| Storage Engine, Replication Channels Down in PLDB | Refer to Storage Engine, Replication Channels Down in PLDB. |
| Storage Engine, Replication Delay Too High In DS | Refer to Storage Engine, Replication Delay Too High In DS. |
| Storage Engine, Replication Delay Too High In PLDB | Refer to Storage Engine, Replication Delay Too High In PLDB. |
| Storage Engine, Replication Stopped Working in DS | Refer to Storage Engine, Replication Stopped Working in DS. |
| Storage Engine, Replication Stopped Working in PLDB | Refer to Storage Engine, Replication Stopped Working in PLDB. |
| Storage Engine, Restore Fault in DS | Refer to Storage Engine, Restore Fault in DS. |
| Storage Engine, Restore Fault in PLDB | Refer to Storage Engine, Restore Fault in PLDB. |
| Storage Engine, Tablespace Usage Too High In DS, Warning | Refer to Storage Engine, Tablespace Usage Too High In DS, Warning. |
| Storage Engine, Tablespace Usage Too High In PLDB, Warning | Refer to Storage Engine, Tablespace Usage Too High In PLDB, Warning. |
| Storage Engine, Temporary Data Inconsistency | Refer to Storage Engine, Temporary Data Inconsistency. |
| Storage Engine, Unable to Synchronize Cluster in DS, Major | Refer to Storage Engine, Unable to Synchronize Cluster in DS, Major. |
| Storage Engine, Unable to Synchronize Cluster in DS, Warning | Refer to Storage Engine, Unable to Synchronize Cluster in DS, Warning. |
| Storage Engine, Unable to Synchronize Cluster in PLDB, Major | Refer to Storage Engine, Unable to Synchronize Cluster in PLDB, Major. |
| Storage Engine, Unable to Synchronize Cluster in PLDB, Warning | Refer to Storage Engine, Unable to Synchronize Cluster in PLDB, Warning. |
| Storage Engine, Unrepaired Data Inconsistency between Replicas, PLDB | Refer to Storage Engine, Unrepaired Data Inconsistency between Replicas, PLDB. |
| Storage Engine, Unrepaired Data Inconsistency between Replicas, DS | Refer to Storage Engine, Unrepaired Data Inconsistency between Replicas, DS. |
2.1.3.2 Lightweight Directory Access Protocol Front End
The alarm model for Lightweight Directory Access Protocol (LDAP) Front End (FE)-related alarms is shown in Figure 4.
Table 3 shows the list of alarms related to LDAP FE.
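| Alarm | Operating Instruction |
|---|---|
| LDAP Front End, Processing Capacity Below Minimum | Refer to LDAP Front End, Processing Capacity Below Minimum. |
| LDAP Front End, Processing Redundancy Lost | Refer to LDAP Front End, Processing Redundancy Lost. |
| LDAP Front End, Server Down | Refer to LDAP Front End, Server Down. |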
2.1.3.3 Server Platform
The alarm model for Server Platform-related alarms is shown in Figure 5.
Table 4 shows the list of alarms related to Server Platform.
| Alarm | Operating Instruction |
|---|---|
| Server Platform, Storage Performance Degradation Detected | Refer to Server Platform, Storage Performance Degradation Detected. |
2.1.3.4 Operating System
The alarm model for Operating System-related alarms is shown in Figure 6.
Table 5 shows the list of alarms related to the Operating System.
| Alarm | Operating Instruction |
|---|---|
| Operating System, Disk Usage Too High | Refer to Operating System, Disk Usage Too High. |
| Operating System, Server Configuration Backup Fault | Refer to Operating System, Server Configuration Backup Fault. |
2.1.3.5 Control
The alarm model for node visibility and global system status related alarms is shown in Figure 7.
Table 6 shows the list of alarms related to Control.
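| Alarm | Operating Instruction |
|---|---|
| Control, Automatic Master Election Locked Down | Refer to Control, Automatic Master Election Locked Down. |
| Control, Blackboard Coordination Cluster Down | Refer to Control, Blackboard Coordination Cluster Down. |
| Control, Blackboard Coordination Server Down | Refer to Control, Blackboard Coordination Server Down. |
| Control, Messaging Service Cluster Down | Refer to Control, Messaging Service Cluster Down. |
| Control, Messaging Service Server Down | Refer to Control, Messaging Service Server Down. |
| Control, Potential Split Brain Detected | Refer to Control, Potential Split Brain Detected. |
| Control, Remote Node Unreachable | Refer to Control, Remote Node Unreachable. |
| Control, Remote Site Unreachable | Refer to Control, Remote Site Unreachable. |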
2.1.3.6 Application Counters
The alarm model for Application Counters-related alarms is shown in Figure 8.
Table 7 shows the list of alarms related to Application Counters.
| Alarm | Operating Instruction |
|---|---|
| Application Counters, Fault In Subscriber Statistic Application | Refer to Application Counters, Fault In Subscriber Statistic Application. |
2.1.3.7 Service Availability Forum
The alarm model for Service Availability Forum (SAF)-related alarms is shown in Figure 9.
Table 8 shows the list of alarms related to SAF.
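| Alarm | Operating Instruction |
|---|---|
| SAF, AMF Component Cleanup Failed | Refer to SAF, AMF Component Cleanup Failed. |
| SAF, AMF SI Unassigned | Refer to SAF, AMF SI Unassigned. |
| SAF, CLM Cluster Node Unavailable | Refer to SAF, CLM Cluster Node Unavailable. |
| SAF, LOTC Ethernet Bonding Failed | Refer to SAF, LOTC Ethernet Bonding Failed. |
| SAF, LOTC Memory Usage Failed | Refer to SAF, LOTC Memory Usage Failed. |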
2.1.3.8 Security
The alarm model for Security-related alarms is shown in Figure 10.
Table 9 shows the list of alarms related to Security.
| Alarm | Operating Instruction |
|---|---|
| Security, OAM User Exceeded Number Of Failed Logins | Refer to Security, OAM User Exceeded Number Of Failed Logins. |
| Security, OAM User Gaining Privilege Failed | Refer to Security, OAM User Gaining Privilege Failed. |
| Security, OAM User Privilege Raise To Root Failed | Refer to Security, OAM User Privilege Raise To Root Failed. |
| Security, Root Login Failed | Refer to Security, Root Login Failed. |
2.1.3.9 Preventive Maintenance
2.1.3.10 Licensing
The alarm model for Licensing-related alarms is shown in Figure 12.
Table 11 shows the list of alarms related to Licensing.
| Alarm | Operating Instruction |
|---|---|
| Licensing, Autonomous Mode Activated | Refer to Licensing, Autonomous Mode Activated. |
| Licensing, Capacity Usage Threshold Reached, Major | Refer to Licensing, Capacity Usage Threshold Reached, Major. |
| Licensing, Capacity Usage Threshold Reached, Warning | Refer to Licensing, Capacity Usage Threshold Reached, Warning. |
| Licensing, Key File Fault | Refer to Licensing, Key File Fault. |
| Licensing, License Key Not Available, Major | Refer to Licensing, License Key Not Available, Major. |
| Licensing, License Key Not Available, Minor | Refer to Licensing, License Key Not Available, Minor. |
| Licensing, License Manager Not Available | Refer to Licensing, License Manager Not Available. |
2.1.3.11 SOAP Notifications
The alarm model for SOAP Notifications-related alarms is shown in Figure 13.
| Alarm | Operating Instruction |
|---|---|
| SOAP Notifications, Discarded Notifications | Refer to SOAP Notifications, Discarded Notifications. |
| SOAP Notifications, Endpoint Unreachable | Refer to SOAP Notifications, Endpoint Unreachable. |
2.1.4 Alarm Relationships
The following alarm relationships are present in the system:
2.2 Infrastructure Alarms
Consider the following regarding infrastructure alarms:
3 Configuration
Some of the alarms are raised when the value of a parameter in CUDB goes above a configured threshold. See the specific alarm OPIs for information on the applicable parameters and thresholds, and how to configure them. It is also possible to configure the NMS IP address to which alarm traps are sent. SNMP version 3 is used.
For more details about how to configure SNMP for CUDB application components, refer to ESA Fault Management. For more information on configuring SNMP for infrastructure components, see Infrastructure Alarms.
| Note: |
In CUDB ESA configuration, the Master Agent Main Port is 60. |
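The threshold behavior described in this section can be pictured with a short Python sketch. It assumes a single monitored value with one threshold per severity and treats "goes above" as a strict comparison. The function name, the severity labels, and the percentage values are hypothetical; the actual parameters and thresholds are defined in the specific alarm OPIs.

```python
def severity_for(value, thresholds):
    """Return the highest severity whose threshold `value` exceeds, else None."""
    raised = None
    # Walk the thresholds from lowest to highest limit.
    for severity, limit in sorted(thresholds.items(), key=lambda item: item[1]):
        if value > limit:
            raised = severity
    return raised


# Hypothetical thresholds, expressed as usage percentages.
thresholds = {"warning": 80.0, "major": 90.0}

for usage in (50.0, 85.0, 95.0):
    print(usage, severity_for(usage, thresholds))
```

With these sample thresholds, a usage of 50.0 raises nothing, 85.0 maps to warning, and 95.0 maps to major.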
Reference List
- Storage Engine, Automatic Handling of Network Isolation not Completed for DS
- Storage Engine, Automatic Handling of Network Isolation not Completed for PLDB
- Storage Engine, Backup Fault In DS
- Storage Engine, Backup Fault In PLDB
- Storage Engine, Backup Notification Failure To Provisioning Gateway
- Storage Engine, Data Inconsistency between Replicas Found in DS, Major
- Storage Engine, Data Inconsistency between Replicas Found in DS, Minor
- Storage Engine, Data Inconsistency between Replicas Found in PLDB, Major
- Storage Engine, Data Inconsistency between Replicas Found in PLDB, Minor
- Storage Engine, Data Inconsistency between Replicas Repaired, DS
- Storage Engine, Data Inconsistency between Replicas Repaired, PLDB
- Storage Engine, Deleted Data Due to Reconciliation
- Storage Engine, DS Cluster Down
- Storage Engine, DS Cluster in Maintenance Mode
- Storage Engine, DS Cluster Node Down
- Storage Engine, Execution of Selective Replica Check Failed, DS, Major
- Storage Engine, Execution of Selective Replica Check Failed, PLDB, Major
- Storage Engine, High Load In DS
- Storage Engine, High Load In PLDB
- Storage Engine, Memory Usage Too High In DS, Full Threshold Reached
- Storage Engine, Memory Usage Too High In DS, Warning Threshold Reached
- Storage Engine, Memory Usage Too High In PLDB, Major
- Storage Engine, Memory Usage Too High In PLDB, Warning
- Storage Engine, No Available Master Replica for DS
- Storage Engine, No Available Master Replica for PLDB
- Storage Engine, Out Of Memory In DS
- Storage Engine, Out Of Memory In PLDB
- Storage Engine, Out Of Tablespace In DS
- Storage Engine, Out Of Tablespace In PLDB
- Storage Engine, PLDB Cluster Down
- Storage Engine, PLDB Cluster In Maintenance Mode
- Storage Engine, PLDB Cluster Node Down
- Storage Engine, Potential Data Inconsistency between Replicas Found in DS
- Storage Engine, Potential Data Inconsistency between Replicas Found in PLDB
- Storage Engine, Replication Channels Down in DS
- Storage Engine, Replication Channels Down in PLDB
- Storage Engine, Replication Delay Too High In DS
- Storage Engine, Replication Delay Too High In PLDB
- Storage Engine, Replication Stopped Working in DS
- Storage Engine, Replication Stopped Working in PLDB
- Storage Engine, Restore Fault in DS
- Storage Engine, Restore Fault in PLDB
- Storage Engine, Tablespace Usage Too High In DS, Warning
- Storage Engine, Tablespace Usage Too High In PLDB, Warning
- Storage Engine, Temporary Data Inconsistency
- Storage Engine, Unable to Synchronize Cluster in DS, Major
- Storage Engine, Unable to Synchronize Cluster in DS, Warning
- Storage Engine, Unable to Synchronize Cluster in PLDB, Major
- Storage Engine, Unable to Synchronize Cluster in PLDB, Warning
- Storage Engine, Unrepaired Data Inconsistency between Replicas, PLDB
- Storage Engine, Unrepaired Data Inconsistency between Replicas, DS
- Server Platform, Storage Performance Degradation Detected
- LDAP Front End, High Load in LDAP Processing Layer
- LDAP Front End, Processing Capacity Below Minimum
- LDAP Front End, Processing Redundancy Lost
- LDAP Front End, Server Down
- Operating System, Disk Usage Too High
- Operating System, Server Configuration Backup Fault
- Control, Automatic Master Election Locked Down
- Control, Blackboard Coordination Cluster Down
- Control, Blackboard Coordination Server Down
- Control, Messaging Service Cluster Down
- Control, Messaging Service Server Down
- Control, Potential Split Brain Detected
- Control, Remote Node Unreachable
- Control, Remote Site Unreachable
- Application Counters, Fault In Subscriber Statistic Application
- SAF, AMF Component Cleanup Failed
- SAF, AMF Component Instantiation Failed
- SAF, AMF SI Unassigned
- SAF, CLM Cluster Node Unavailable
- SAF, LOTC Disk Replication Communication Failed
- SAF, LOTC Disk Replication Consistency Failed
- SAF, LOTC Ethernet Bonding Failed
- SAF, LOTC Memory Usage Failed
- SAF, LOTC Time Synchronization Failed
- Security, OAM User Exceeded Number Of Failed Logins
- Security, OAM User Gaining Privilege Failed
- Security, OAM User Privilege Raise To Root Failed
- Security, Root Login Failed
- Preventive Maintenance, Logchecker Found Error(s)
- Licensing, Autonomous Mode Activated
- Licensing, Capacity Usage Threshold Reached, Major
- Licensing, Capacity Usage Threshold Reached, Warning
- Licensing, Key File Fault
- Licensing, License Key Not Available, Major
- Licensing, License Key Not Available, Minor
- Licensing, License Manager Not Available
- SOAP Notifications, Discarded Notifications
- SOAP Notifications, Endpoint Unreachable
- CUDB Consistency Check
- CUDB Glossary of Terms and Acronyms
Other Documents and Online References
- Information Technology - Open Systems Interconnection - Systems Management Alarm Reporting Function ITU-T X.733. CCITT Rec. X.733 (1992 E) http://www.itu.int/rec/T-REC-X.733/