Storage Engine, Unable to Synchronize Cluster in PLDB, Warning
Ericsson Centralized User Database

Contents

1   Introduction
1.1   Alarm Description
1.2   Prerequisites

2   Procedure

Glossary

Reference List

1   Introduction

This document describes the Storage Engine, Unable to Synchronize Cluster in PLDB, Warning alarm and the troubleshooting steps for it.

1.1   Alarm Description

The alarm is raised when either of the following processes starts: the Automatic Handling of Network Isolation process, which attempts to repair a Processing Layer Database (PLDB) cluster inconsistency between the former and current master replica servers, or the Self-Ordered Backup and Restore process, which restores replication on a PLDB slave that cannot synchronize with its master replica.

The alarm is issued in the following situations:

If the CUDB system enters a state in which no master replica for the PLDB can be reached from the current node, this alarm is cleared automatically and the Storage Engine, No Available Master Replica for PLDB alarm, Reference [1], is raised.
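The hand-over between the two alarms can be pictured with a minimal sketch. It is not CUDB code: the class `PldbAlarmState` and its method names are hypothetical, and only the clear-and-raise behavior comes from the paragraph above.

```python
from dataclasses import dataclass, field

UNABLE_TO_SYNC = "Storage Engine, Unable to Synchronize Cluster in PLDB"
NO_MASTER = "Storage Engine, No Available Master Replica for PLDB"


@dataclass
class PldbAlarmState:
    """Hypothetical in-memory view of the active PLDB alarms on one node."""
    active: set = field(default_factory=set)

    def on_repair_started(self) -> None:
        # Either repair process (Automatic Handling of Network Isolation or
        # Self-Ordered Backup and Restore) has started: raise the warning.
        self.active.add(UNABLE_TO_SYNC)

    def on_no_master_reachable(self) -> None:
        # No master replica can be reached from this node: the warning is
        # cleared automatically and the other alarm is raised instead.
        self.active.discard(UNABLE_TO_SYNC)
        self.active.add(NO_MASTER)


state = PldbAlarmState()
state.on_repair_started()
state.on_no_master_reachable()
print(sorted(state.active))
```

The sketch models only the transition described in this section; how and when the No Available Master Replica alarm itself ceases is covered by Reference [1].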

The possible alarm causes and the corresponding fault reasons, fault locations and impacts are described in Table 1.

Table 1    Alarm Causes

Alarm Cause: Automatic Handling of Network Isolation process started.

  Description: Automatic Handling of Network Isolation process is starting the Selective Replica Check and the Data Repair tasks.

  Fault Reason: There is non-replicated data on the former master replica (missing on new master replica):

    • Due to unexpected mastership change where replication is not restored, slave (former master) replica and master (former slave) replica are not aligned, and their contents are not consistent.

    • Due to recovery from a System Split situation when there are two system partitions, each one having its own master replica, when the whole system is rejoined and a single master replica must be elected.

  And the Automatic Handling of Network Isolation process is starting rescuing tasks.

  Fault Location: Both replica servers.

  Impact: Not applicable. This alarm is part of the Automatic Handling of Network Isolation process.

Alarm Cause: Self-Ordered Backup and Restore process started.

  Description: The Self-Ordered Backup and Restore process is starting the replication restoration.

  Fault Reason: The cases when the PLDB slave database cluster cannot synchronize with its corresponding master replica are the following:

    • Due to unexpected mastership change where replication is not restored, slave (former master) replica and master (former slave) replica are not aligned, and their contents are not consistent.

    • Due to recovery from a System Split situation when there are two system partitions, each one having its own master replica, when the whole system is rejoined and a single master replica must be elected.

    • Any other situation where CUDB is unable to recover a failed replica.

  And the Self-Ordered Backup and Restore process is starting for the affected slave replica.

  Fault Location: Slave replica servers.

  Impact: Not applicable. This alarm is part of the Self-Ordered Backup and Restore process.

Note:  
An alarm can appear as a result of maintenance activity.

The following are the consequences for the node if the alarm is not solved:

The alarm attributes are listed and explained in Table 2.

Table 2    Alarm Attributes

| Attribute Name | Attribute Value |
|---|---|
| Auto Cease | Yes |
| Module | STORAGE-ENGINE |
| Error Code | 28 |
| Timestamp First | Date and time when the alarm was raised for the first time. |
| Repeated Counter | Number which indicates how many times the alarm was raised. |
| Timestamp Last | Date and time of the most recent alarm raised. |
| Resource ID | .1.3.6.1.4.1.193.169.1.1.1 |
| Alarm Model Description | Unable to synchronize cluster, Storage Engine. |
| Alarm Active Description | Storage Engine (PLDB): Synchronization to current master impossible. <add_info> (task <taskid>, time <Timestamp> - <DateTime>). |
| ITU Alarm Event Type | qualityOfServiceAlarm (3) |
| ITU Alarm Probable Cause | equipmentMalfunction (514) |
| ITU Alarm Perceived Severity | (6) – Warning |
| Originating Source IP | Node ID where the alarm was raised. |
| Sequence Number | Number which indicates the order in which alarms were raised. |
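As an illustration of how the fixed attribute values in Table 2 might be used, the following sketch filters alarm records for this alarm. It assumes that the Module and Error Code values together identify the alarm. The `AlarmRecord` type and its field names are hypothetical and do not reflect any CUDB interface.

```python
from dataclasses import dataclass

# Fixed values taken from Table 2.
MODULE = "STORAGE-ENGINE"
ERROR_CODE = 28


@dataclass(frozen=True)
class AlarmRecord:
    """Hypothetical alarm record; the field names are illustrative only."""
    module: str
    error_code: int
    repeated_counter: int = 1


def is_unable_to_sync_pldb(record: AlarmRecord) -> bool:
    # Assumption: module plus error code is enough to identify the alarm.
    return record.module == MODULE and record.error_code == ERROR_CODE


records = [
    AlarmRecord("STORAGE-ENGINE", 28, repeated_counter=3),
    AlarmRecord("STORAGE-ENGINE", 99),  # made-up error code for contrast
]
matching = [r for r in records if is_unable_to_sync_pldb(r)]
print(len(matching))
```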

In Table 2, the indicated variables are as follows:

Note:  
<taskid>, <Timestamp>, and <DateTime> are not shown when the alarm is raised by the Self-Ordered Backup and Restore process.

For more information about the alarm attributes, refer to CUDB Node Fault Management Configuration Guide, Reference [2].
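The Alarm Active Description template can be split into its variable parts with a regular expression. This is a minimal sketch: the function name `parse_active_description` is hypothetical, and the concrete formats of <add_info>, <taskid>, <Timestamp>, and <DateTime> are not specified in this document, so the pattern accepts arbitrary text for them. The task and time part is treated as optional because it is not shown for Self-Ordered Backup and Restore.

```python
import re
from typing import Optional

# Fixed leading text of the Alarm Active Description (from Table 2).
PREFIX = "Storage Engine (PLDB): Synchronization to current master impossible."

# The "(task ..., time ... - ...)" suffix is optional because it is not
# shown for Self-Ordered Backup and Restore.
_PATTERN = re.compile(
    r"^" + re.escape(PREFIX)
    + r"\s*(?P<add_info>.*?)"
    + r"(?:\s*\(task (?P<taskid>[^,]+), time (?P<timestamp>.+?) - (?P<datetime>.+?)\))?"
    + r"\.?$"
)


def parse_active_description(text: str) -> Optional[dict]:
    """Return the variable parts of the description, or None if no match."""
    match = _PATTERN.match(text.strip())
    return match.groupdict() if match else None


# The sample values below are invented for illustration only.
sample = PREFIX + " Data repair started (task 42, time 1500000000 - 2017-07-14 02:40:00)."
print(parse_active_description(sample))
```

Fields that are absent from the text, such as the task identifier in the Self-Ordered Backup and Restore case, come back as `None`.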

1.2   Prerequisites

This section provides information on the documents, tools, and conditions that apply to the procedure.

1.2.1   Documents

Before starting this procedure, ensure that you have read the following documents:

  • System Safety Information, Reference [4]

  • Personal Health and Safety Information, Reference [5]

1.2.2   Tools

Not applicable.

1.2.3   Conditions

Not applicable.

2   Procedure

Not applicable. Further actions are part of the Automatic Handling of Network Isolation or Self-Ordered Backup and Restore process.


Glossary

For the terms, definitions, acronyms, and abbreviations used in this document, refer to CUDB Glossary of Terms and Acronyms, Reference [3].


Reference List

CUDB Documents
[1] Storage Engine, No Available Master Replica for PLDB.
[2] CUDB Node Fault Management Configuration Guide.
[3] CUDB Glossary of Terms and Acronyms.
Other Ericsson Documents
[4] System Safety Information.
[5] Personal Health and Safety Information.


Copyright

© Ericsson AB 2016, 2017. All rights reserved. No part of this document may be reproduced in any form without the written permission of the copyright owner.

Disclaimer

The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.

Trademark List
All trademarks mentioned herein are the property of their respective owners. These are shown in the document Trademark Information.
