1 Introduction
This instruction describes alarm handling for the Storage Engine alarm Data Inconsistency between Replicas Found in DS, severity Minor.
1.1 Alarm Description
This alarm is raised as a result of issues found when running a Consistency Check on a Data Store Unit Group (DSG) slave replica by means of the cudbConsistencyMgr command. For further information about the command, refer to CUDB Node Commands and Parameters, Reference [1]. For more information about Consistency Check, refer to CUDB Consistency Check, Reference [2].
The alarm is issued in the following situation:
- The number of divergences between the slave replica and its master replica is lower than the threshold (default or selected value).
The possible alarm causes and the corresponding fault reasons, fault locations, and impacts are described in Table 1.
Table 1    Alarm Causes

| Alarm Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| Divergences between the slave replica and its master replica. | The number of divergences between the slave replica and its master replica found when running a consistency check is lower than the threshold. | Wrong backup restored on the slave replica. | Affected DS cluster. | If this slave replica becomes the master replica, there might be a service impact for the subscribers affected by the data inconsistency. |
The alarm attributes are listed in Table 2.

Table 2    Alarm Attributes

| Attribute Name | Attribute Value |
|---|---|
| Auto Cease | No |
| Module | STORAGE-ENGINE |
| Error Code | 21 |
| Timestamp First | Date and time when the alarm was raised for the first time. |
| Repeated Counter | Number indicating how many times the alarm was raised. |
| Timestamp Last | Date and time of the most recent alarm raised. |
| Resource ID | .1.3.6.1.4.1.193.169.1.2.21.<DG>.<NODE> |
| Alarm Model Description | Data inconsistency between replicas found, Storage Engine. |
| Alarm Active Description | Storage Engine (DS-group #<DG>): Data inconsistency between replicas found, minor, in node #<NODE> (task <TASKID>). |
| ITU Alarm Event Type | processingErrorAlarm (4) |
| ITU Alarm Probable Cause | databaseInconsistency (160) |
| ITU Alarm Perceived Severity | (5) - Minor |
| Originating Source IP | IP address of the node where the alarm was raised. |
| Sequence Number | Number indicating the order in which alarms were raised. |
In Table 2, the indicated variables are as follows:
- <DG> is the DSG the DS cluster belongs to.
- <NODE> is the CUDB Node identifier the inconsistent slave replica is located at.
- <TASKID> is the identifier of the check task.
1.2 Prerequisites
This section provides information on the documents, tools, and conditions that apply to the procedure.
1.2.1 Documents
Before starting this procedure, ensure that you have read the following documents:
- CUDB Node Fault Management Configuration Guide, Reference [3], regarding alarm configuration.
- The section on the cudbConsistencyMgr command in CUDB Node Commands and Parameters, Reference [1].
- CUDB LDAP Interwork Description, Reference [4] and CUDB Consistency Check, Reference [2] regarding the LDAP tree log file and Consistency Check.
- CUDB System Administrator Guide, Reference [5] regarding the location of master replicas.
- CUDB Backup and Restore Procedures, Reference [6] regarding the combined unit data backup and restore procedure.
- System Safety Information, Reference [8].
- Personal Health and Safety Information, Reference [9].
1.2.2 Tools
Not applicable.
1.2.3 Conditions
Not applicable.
2 Procedure
When this alarm is raised, perform the following steps:
- Locate and identify the Lightweight Directory Access Protocol (LDAP) tree log based on the <DG> and <TASKID> parameters in the alarm: search for the /local/cudb_ddci/replica_check/cudbDsuDiff_tree_<TASKID>*.xml file on both System Controllers (SCs) of:
  - Either the CUDB node that contained the master replica of the DSG with ID <DG> at the time the check was executed.
  - Or, if that information is not known, all the CUDB nodes that hold a replica of the DSG with ID <DG>, except the current node.

  Note: In case of 1+1+1 redundancy, this may mean two CUDB nodes to check.
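The search on each SC can be sketched with a standard find command. The example below is self-contained for illustration: it builds a mock directory tree mirroring the /local/cudb_ddci/replica_check path from the step above, and uses 4711 as a made-up example value for <TASKID>.

```shell
# Illustration only: create a mock SC directory tree. On a real SC the
# path is /local/cudb_ddci/replica_check; 4711 is a hypothetical task ID.
mock_root="$(mktemp -d)"
mkdir -p "$mock_root/local/cudb_ddci/replica_check"
touch "$mock_root/local/cudb_ddci/replica_check/cudbDsuDiff_tree_4711_1.xml"

# Search for the LDAP tree log of the check task with ID 4711:
found="$(find "$mock_root/local/cudb_ddci/replica_check" \
  -name 'cudbDsuDiff_tree_4711*.xml')"
echo "$found"
```

On a real node, the same find pattern would be run directly against /local/cudb_ddci/replica_check on both SCs, substituting the <TASKID> value from the alarm.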
- Align the slave replica to its master in the DSG with ID <DG> as follows:
  - If this node is still a slave node of the DSG with ID <DG>, and the node where the LDAP tree log was found is still the master node of the DSG with ID <DG>, perform a combined unit data backup and restore on the DSG against this CUDB node.

    Note: If the DSG has more than one slave, consider ordering a consistency check for the rest of the slaves after restoring the backup.

  - If this node (which was the checked slave at the time of the check) has become a DSG master node since the alarm was raised, perform a combined unit data backup and restore on the DSG against the CUDB node where the LDAP tree log was found.

    Note: Do not perform a combined backup and restore if one has already been done since the mastership change of the DSG. Also, if the DSG has more than one slave, consider ordering a consistency check for the rest of the slaves after restoring the backup.

  - In any other case, ignore the results. Consider that the DSG may not be consistent, and that further checks, involving the new master, are needed to verify it.
- Perform the following steps to find the subscribers impacted by the data inconsistency:
  - Analyze the LDAP tree log file to find the subscribers impacted by the data inconsistency. For User Data Consolidation (UDC) application data (such as HLR or HSS), build a list of the impacted Distribution Entries (DEs: mscId and assocId; refer to CUDB LDAP Interwork Description, Reference [4] for further details). A DE is impacted if any of its child entries are included in the LDAP tree log file.
For example, if an entry with the ImsShDynInfId=ImsShDynInf,IMPU=sip:262280000194171@ims.mnc280.mcc262.3gppnetwork.org,serv=IMS,assocId=194171,ou=associations,<rootDn> DN shows up in the LDAP tree log file, then the assocId=194171,ou=associations,<rootDn> DE is impacted.
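The parent DE can be derived mechanically from any impacted child DN by keeping the suffix that starts at the assocId attribute. A minimal sketch using the DN from the example above, with dc=example standing in for the deployment-specific <rootDn>:

```shell
# DN of an impacted child entry, taken from the example above;
# "dc=example" is a placeholder for the deployment-specific <rootDn>.
child_dn='ImsShDynInfId=ImsShDynInf,IMPU=sip:262280000194171@ims.mnc280.mcc262.3gppnetwork.org,serv=IMS,assocId=194171,ou=associations,dc=example'

# Keep everything from "assocId=" onward: that suffix is the impacted DE.
de_dn="$(printf '%s\n' "$child_dn" \
  | sed 's/.*,\(assocId=[^,]*,ou=associations,.*\)/\1/')"
echo "$de_dn"
```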
  - Once the list of impacted DEs is ready, identify the related public identities for each impacted DE by running the following filtered LDAP query:

    ldapsearch -x -h <cudbNodeVIP> -p 389 -D <bindDn> -w <bindPassword> -b <impactedDistributionEntryDn> -s subtree -a always | grep -E "^MSISDN|^IMSI|^IMPU|^IMPI"
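The grep filter at the end of the query keeps only the identity attributes. As a self-contained illustration, the same pattern applied to a fabricated LDIF-style fragment (all attribute values below are made up) retains only the MSISDN, IMSI, and IMPU lines:

```shell
# Fabricated ldapsearch-style output; only the attribute names matter here.
sample_ldif='dn: assocId=194171,ou=associations,dc=example
objectClass: top
MSISDN: 491701234567
IMSI: 262280000194171
IMPU: sip:262280000194171@ims.mnc280.mcc262.3gppnetwork.org
createTimestamp: 20240101000000Z'

# The same filter as in the ldapsearch command above:
identities="$(printf '%s\n' "$sample_ldif" | grep -E '^MSISDN|^IMSI|^IMPU|^IMPI')"
echo "$identities"
```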
  - Validate the subscribers with those public identities, or consider reprovisioning them. Refer to the application Front End (FE) documentation for the procedure to validate or reprovision subscribers.
- If the LDAP tree log contains internal errors, notify the next level of support. To interpret the contents of the LDAP tree log, refer to CUDB Consistency Check, Reference [2].
- Clear the alarm manually as described in CUDB Node Fault Management Configuration Guide, Reference [3].
To find out where the master replicas are, refer to CUDB System Administrator Guide, Reference [5]. For further information about the combined unit data backup and restore procedure, refer to CUDB Backup and Restore Procedures, Reference [6].
Glossary
For the terms, definitions, acronyms, and abbreviations used in this document, refer to CUDB Glossary of Terms and Acronyms, Reference [7].
Reference List
| Other Ericsson Documents |
|---|
| [8] System Safety Information. |
| [9] Personal Health and Safety Information. |
