1 Introduction
This instruction concerns alarm handling for the Storage Engine, Data Inconsistency between Replicas Found in PLDB, Minor alarm.
1.1 Alarm Description
This alarm is raised as a result of issues found when running a Consistency Check on a Processing Layer Database (PLDB) slave replica by means of the cudbConsistencyMgr command. For further information about the command, refer to CUDB Node Commands and Parameters, Reference [1]. For more information about Consistency Check, refer to CUDB Consistency Check, Reference [2].
The alarm is issued in the following situation:
- The number of divergences between the slave replica and its master replica is lower than the threshold (default or selected value).
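The raise condition can be sketched as a simple comparison, assuming the consistency check reports a plain divergence count and that a nonzero count is what makes the inconsistency "found" (the threshold value below is a made-up example, not the product default):

```shell
# Sketch of the Minor alarm raise condition. The values are made-up
# examples; the real threshold is the configured default or selected value.
divergences=7
threshold=100
if [ "$divergences" -gt 0 ] && [ "$divergences" -lt "$threshold" ]; then
  echo "raise: Data Inconsistency between Replicas Found in PLDB, Minor"
fi
```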
The possible alarm causes and the corresponding fault reasons, fault locations, and impacts are described in Table 1.
| Alarm Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| Divergences between the slave replica and its master replica. | The number of divergences between the slave replica and its master replica found when running a consistency check is lower than the threshold. | Wrong backup restored on the slave replica. | Affected PLDB replica. | If this slave replica becomes the master replica, there might be a service impact for the subscribers affected by the data inconsistency. |
The alarm attributes are described in Table 2.

| Attribute Name | Attribute Value |
|---|---|
| Auto Cease | No |
| Module | STORAGE-ENGINE |
| Error Code | 21 |
| Timestamp First | Date and time when the alarm was raised for the first time. |
| Repeated Counter | Number which indicates how many times the alarm was raised. |
| Timestamp Last | Date and time of the most recent alarm raised. |
| Resource ID | .1.3.6.1.4.1.193.169.1.1.21.<NODE1>.<NODE2> |
| Alarm Model Description | Data inconsistency between replicas found, Storage Engine. |
| Alarm Active Description | Storage Engine (PLDB): Data inconsistency between replicas found, minor, nodes #<NODE1> and #<NODE2> (task <TASKID>). |
| ITU Alarm Event Type | processingErrorAlarm (4) |
| ITU Alarm Probable Cause | databaseInconsistency (160) |
| ITU Alarm Perceived Severity | (5) - Minor |
| Originating Source IP | Node ID where the alarm was raised. |
| Sequence Number | Number which indicates the order in which alarms were raised. |
In Table 2, the indicated variables are as follows:
- <NODE1> is the identifier of the CUDB node where the first PLDB replica is located.
- <NODE2> is the identifier of the CUDB node where the second PLDB replica is located.
- <TASKID> is the identifier of the check task.
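The node identifiers can also be read directly from the Resource ID attribute, whose two last OID components are <NODE1> and <NODE2>. A minimal sketch (the node ID values 3 and 5 are made-up examples):

```shell
# Extract <NODE1> and <NODE2> from the Resource ID of the alarm.
# The concrete node IDs (3 and 5) are made-up example values.
RESOURCE_ID=".1.3.6.1.4.1.193.169.1.1.21.3.5"
NODE1=$(echo "$RESOURCE_ID" | awk -F. '{print $(NF-1)}')
NODE2=$(echo "$RESOURCE_ID" | awk -F. '{print $NF}')
echo "NODE1=$NODE1 NODE2=$NODE2"
```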
1.2 Prerequisites
This section provides information on the documents, tools, and conditions that apply to the procedure.
1.2.1 Documents
Before starting this procedure, ensure that you have read the following documents:
- CUDB Node Fault Management Configuration Guide, Reference [3], regarding alarm configuration.
- The section on the cudbConsistencyMgr command in CUDB Node Commands and Parameters, Reference [1].
- CUDB LDAP Interwork Description, Reference [4] and CUDB Consistency Check, Reference [2] regarding the LDAP tree log file and Consistency Check.
- CUDB System Administrator Guide, Reference [6] regarding the location of master replicas.
- CUDB Backup and Restore Procedures, Reference [5] regarding the combined unit data backup and restore procedure.
- System Safety Information, Reference [8].
- Personal Health and Safety Information, Reference [9].
1.2.2 Tools
Not applicable.
1.2.3 Conditions
Not applicable.
2 Procedure
When this alarm is raised, perform the following steps:
- Locate and identify the Lightweight Directory Access Protocol (LDAP) tree log based on the <NODE1>, <NODE2>, and <TASKID> parameters in the alarm.
- Align the slave replica to its master in the PLDB as follows:
- If this node is still a PLDB slave node, and the node with ID <NODE1> is still the master node of the PLDB, perform a combined unit data backup and restore on the PLDB against this CUDB node.
- Note:
- When performing a PLDB restore, the cudbUnitDataBackupAndRestore command does not work; it exits with an error message if any of the DS clusters in the node is the master replica of its data partition. Therefore, move the DSG masterships manually before performing the PLDB restore. Also, if the PLDB has more than one slave, consider ordering a consistency check for the remaining slaves after restoring the backup.
- If this node (which was the checked slave at the time of the check) has become a PLDB master node since the alarm was raised, perform a combined unit data backup and restore on the PLDB against the CUDB node with ID <NODE1>.
- Note:
- Do not perform the combined backup and restore if it has already been done since the mastership change of the PLDB. Also, when performing a PLDB restore, the cudbUnitDataBackupAndRestore command does not work; it exits with an error message if any of the DS clusters in the node is the master replica of its data partition. Therefore, move the DSG masterships manually before performing the PLDB restore. Finally, if the PLDB has more than one slave, consider ordering a consistency check for the remaining slaves after restoring the backup.
- In any other case, ignore the results. Consider that the PLDB may not be consistent and that further checks, involving the new master, are needed to verify it.
- Analyze the LDAP tree log file as described below to find the data impacted by the inconsistency:
- For impacted entries in the ou=identities,<rootDn> branch, validate the subscribers with those public identities, or consider reprovisioning them. Refer to the application Front End (FE) documentation for the procedure to validate or reprovision subscribers.
- For impacted entries in the ou=mscCommonData,<rootDn> or ou=servCommonData,<rootDn> branches, validate or reprovision the impacted entry. Refer to the application FE documentation for the procedure to validate or reprovision common data.
- For impacted entries in the ou=admin,<rootDn> branch, validate that the LDAP user data is correct, or delete and then re-add the impacted LDAP user by following the procedure described in CUDB System Administrator Guide, Reference [6].
Refer to CUDB LDAP Interwork Description, Reference [4] for more information on the CUDB Main Directory Information Tree (DIT).
- If the LDAP tree log contains an internal error, notify the next level of support. To interpret the contents of the LDAP tree log, refer to CUDB Consistency Check, Reference [2].
- Clear the alarm manually as described in CUDB Node Fault Management Configuration Guide, Reference [3].
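During the analysis step, an impacted entry reported in the LDAP tree log can be inspected by assembling its Distinguished Name from the identity, the branch, and the node's <rootDn>. A sketch with made-up identity and rootDn values (the impacted_dn helper is hypothetical):

```shell
# Build the DN of an impacted entry for inspection with a standard LDAP
# client. The identity value and rootDn below are made-up examples.
impacted_dn() {  # usage: impacted_dn RDN BRANCH ROOTDN
  printf '%s,ou=%s,%s\n' "$1" "$2" "$3"
}

DN=$(impacted_dn "msisdn=46700000001" "identities" "dc=operator,dc=com")
echo "$DN"
# The entry can then be read back, for example with OpenLDAP's client:
#   ldapsearch -x -b "$DN" -s base "(objectClass=*)"
```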
To find out where the master replicas are, refer to CUDB System Administrator Guide, Reference [6]. For further information about the combined unit data backup and restore procedure, refer to CUDB Backup and Restore Procedures, Reference [5].
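The alignment decision earlier in this procedure can be sketched as follows. Every helper function is a hypothetical stub standing in for an operator check or the combined backup and restore procedure referenced in this document; only the branching mirrors the procedure text:

```shell
# Hypothetical stubs; in practice these are operator checks and the
# combined unit data backup and restore procedure from Reference [5].
this_node_is_pldb_slave()          { [ "$THIS_ROLE" = "slave" ]; }
this_node_is_pldb_master()         { [ "$THIS_ROLE" = "master" ]; }
current_pldb_master()              { echo "$MASTER_NODE"; }
restored_since_mastership_change() { [ "${RESTORED:-no}" = "yes" ]; }
move_dsg_masterships_off()         { echo "move DSG masterships off node $1"; }
backup_and_restore_against()       { echo "combined backup and restore against node $1"; }

align_replica() {  # usage: align_replica THIS_NODE NODE1
  if this_node_is_pldb_slave && [ "$(current_pldb_master)" = "$2" ]; then
    move_dsg_masterships_off "$1"    # required before the PLDB restore
    backup_and_restore_against "$1"
  elif this_node_is_pldb_master; then
    if restored_since_mastership_change; then
      echo "skip: already restored since the mastership change"
    else
      move_dsg_masterships_off "$2"
      backup_and_restore_against "$2"
    fi
  else
    echo "ignore results; re-check against the new master"
  fi
}

THIS_ROLE=slave
MASTER_NODE=3
align_replica 5 3
```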
Glossary
For the terms, definitions, acronyms, and abbreviations used in this document, refer to CUDB Glossary of Terms and Acronyms, Reference [7].
Reference List
| Ericsson Documents |
|---|
| [1] CUDB Node Commands and Parameters. |
| [2] CUDB Consistency Check. |
| [3] CUDB Node Fault Management Configuration Guide. |
| [4] CUDB LDAP Interwork Description. |
| [5] CUDB Backup and Restore Procedures. |
| [6] CUDB System Administrator Guide. |
| [7] CUDB Glossary of Terms and Acronyms. |

| Other Ericsson Documents |
|---|
| [8] System Safety Information. |
| [9] Personal Health and Safety Information. |
