CUDB Troubleshooting Guide

Contents

1   Introduction
1.1   Purpose and Scope
1.2   Revision Information
1.3   Target Groups
1.4   Prerequisites
1.5   Related Information
2   Tools
2.1   CLI Commands
2.2   UDC Cockpit Tool
3   Troubleshooting Procedure
4   Trouble Reporting
Glossary
Reference List

1   Introduction

This document provides troubleshooting information for CUDB nodes. However, since a CUDB system is made up of a set of CUDB nodes, the document is also valid for troubleshooting system-level problems and failures.

1.1   Purpose and Scope

The purpose of this document is to provide the instructions and tools needed to recover a CUDB node in case of abnormal behavior or failures.

The document does not contain information on maintenance tasks and configuration procedures. For more information on these topics, refer to CUDB Node Preventive Maintenance, Reference [1].

For more information on the user names and passwords used in this document, refer to CUDB Users and Passwords, Reference [2].

Note:  
If the procedures described in this document do not fix the experienced fault, contact the next level of Ericsson support.

1.2   Revision Information


Rev. A
Rev. B
Rev. C
Rev. D
Rev. E
Rev. F
Rev. G
Rev. H
Rev. J
Rev. K
Rev. L

Other than editorial changes, this document has been updated as follows:

1.3   Target Groups

This document is intended for personnel working with the CUDB system.

1.4   Prerequisites

This document provides troubleshooting information only for properly installed and configured CUDB nodes.

Before starting the troubleshooting procedure, ensure the following:

Attention!

Do not activate tracing or logging without prior consultation with Ericsson, as it can affect traffic throughput. Certain troubleshooting activities can have an impact on node performance.

1.5   Related Information

Definition and explanation of acronyms and terminology, trademark information, and typographic conventions can be found in the following documents:

2   Tools

This section describes the tools that can be used to troubleshoot the CUDB system.

2.1   CLI Commands

This section describes the tools and resources that can be used to troubleshoot CUDB through the command line interface of CUDB nodes.

2.1.1   Checking the Active System Controller

Several troubleshooting resources (such as the cudbGetLogs and cudbAnalyser scripts, or the CoreMW console) can be executed only on the active System Controller (SC). If needed, use the following command to check which SC is the active one:

# cudbHaState | grep COM | grep ACTIVE

The expected output is similar to the following example:

COM is assigned as ACTIVE in controller SC-1.
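
The check above can be scripted. The following is a minimal sketch (not part of the product CLI) that extracts the active SC name from the cudbHaState output format shown above; the sample line stands in for live output, and on a node you would pipe cudbHaState instead.

```shell
# Sample line mirroring the cudbHaState output shown above; on a live SC,
# replace the echo with the real command:  cudbHaState | grep COM | grep ACTIVE
sample_output="COM is assigned as ACTIVE in controller SC-1."

# Take the last field ("SC-1.") and strip the trailing dot.
active_sc=$(printf '%s\n' "$sample_output" | grep COM | grep ACTIVE \
    | awk '{print $NF}' | tr -d '.')
echo "Active SC: $active_sc"
```

A script can then compare $active_sc against the local hostname before attempting to run cudbGetLogs or cudbAnalyser.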

2.1.2   CUDB Logchecker

CUDB offers a troubleshooting tool called Logchecker: a set of scripts that improve the in-service performance of CUDB and support troubleshooting through log collection and automatic log analysis.

The CUDB Logchecker consists of the following two scripts:

Note:  
The cudbGetLogs and cudbAnalyser scripts can be executed only on active SCs. See Section 2.1.1 for more information on how to check the active SC.

By default, Logchecker performs scheduled log analysis at 00:50 and 12:50, and saves the detailed result in the following location:

/home/cudb/monitoring/preventiveMaintenance/cron_analysis.<SC_NAME>.log

In the above path, the <SC_NAME> variable can be SC_2_1 or SC_2_2.

The automatic log analysis raises or clears alarms according to the analysis result. The severity of the alarms depends on the severity of the detected faults. For further information, refer to Preventive Maintenance, Logchecker Found Error(s), Reference [7].

For more information about the CUDB Logchecker, refer to CUDB Logchecker, Reference [8] and CUDB Node Commands and Parameters, Reference [6].

2.1.2.1   Manual Log Collection

The CUDB Logchecker logs can be collected manually with the cudbGetLogs command. An example output of the command is shown in Example 1.

Example 1   Manual Log Collection

CUDB107 SC_2_1# cudbGetLogs
Starting /opt/ericsson/cudb/OAM/bin/cudbGetLogs ...
Grepping logs and creating /home/cudb/ \
monitoring/preventiveMaintenance/ \
CUDB_107_201304091136.log ...
The log file is saved as : /home/cudb/\
monitoring/preventiveMaintenance/ \
CUDB_107_201304091136.log
CUDB107 SC_2_1#

Use this content to provide fault information to the next level of Ericsson support.
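
When attaching the collected log to a support case, the saved path can be pulled out of the cudbGetLogs output. The following is a minimal sketch; the sample is a single-line version of the "log file is saved as" line in Example 1, and on a live SC you would capture the real command output instead.

```shell
# Simplified single-line version of the cudbGetLogs output in Example 1.
sample="The log file is saved as : /home/cudb/monitoring/preventiveMaintenance/CUDB_107_201304091136.log"

# Print everything after the "saved as : " marker.
log_path=$(printf '%s\n' "$sample" | sed -n 's/.*saved as : //p')
echo "$log_path"
```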

2.1.2.2   Log Analysis

Two options are available to trigger unscheduled log analysis:

2.1.3   CUDB Logs Collection

The cudbCollectInfo command creates a compressed archive from the CUDB logs. Use the output of this command to provide fault information to the next level of Ericsson support.

For further information about this command, execute cudbCollectInfo -h, or refer to CUDB Node Commands and Parameters, Reference [6].

Note:  
cudbCollectInfo is a system-level command, executed on all nodes automatically. Therefore, only one system-wide cudbCollectInfo command can be executed in the CUDB system at a time.

The command can take from 5 to 20 minutes, depending on the amount of information to be collected.
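
Given that runtime, a script can bound the collection with the standard timeout(1) utility. This is a minimal sketch under stated assumptions: "sleep 1" stands in for cudbCollectInfo so the sketch is self-contained, and the 1800-second limit is an assumed, generous upper bound.

```shell
# Stand-in for the real command so the sketch runs anywhere.
collect_cmd="sleep 1"   # on a live SC: collect_cmd="cudbCollectInfo"

# timeout(1) kills the command if it exceeds 30 minutes.
if timeout 1800 sh -c "$collect_cmd"; then
    result="finished"
else
    result="failed or timed out"
fi
echo "collection $result"
```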


2.1.4   LDE Status

The status of the Linux Distribution Extension (LDE) platform can be checked with the cudbHaState command. To use this command, at least one SC must be in ACTIVE state.

Execute the command on an SC as follows (in the example below, SC_2_1 is used):

SC_2_1# cudbHaState

Execute the following command to check the alarm status on all clusters:

SC_2_1# cluster alarm --status --all

For more details about the commands, refer to CUDB Node Commands and Parameters, Reference [6] and Data Collection Guideline for CUDB, Reference [9].

2.1.5   ESA Processes

The status of the ESA processes can be checked by performing the following steps:

  1. Check that the ESA agents are running. Execute the following command on both SCs:

    esa status

    The expected output must look similar to the following:

    [info] ESA Sub Agent is running.
    [info] ESA Master Agent is running.
    [info] ESA PM Agent is running.

  2. Check the ESA FM Agent cluster status with the following command on any of the SC blades:

    esaclusterstatus

    The expected output must look similar to the example below. One SC blade must be in M state, while the other is in (M) state:

    M * OAM1 10.22.0.1
    (M) OAM2 10.22.0.2

  3. Check the ESA Master Agent cluster status with the -v option:

    esaclusterstatus -v

    The expected output must look similar to the example below:

    M * esama esafma OAM1 10.22.0.1
    (M) esama esafma OAM2 10.22.0.2

    The first host listed is the one where the ESA Master Agent is the Active Master.

    The descriptions of the different states are:

    M         ESA Master is located on that SC.
    (M)       ESA Slave is located on that SC.
    *         The SC from where the command was sent.
    esama     ESA Master Agent is running on that SC.
    esafma    ESA FM Agent is running on that SC.
    Inactive  The cluster (ESAFM or ESAMA) is inactive for ESA on that SC.
    Unknown   The cluster (ESAFM or ESAMA) is configured as active for ESA on that SC, but it is not connected to the cluster.

    If a host is not shown, either the node is down or only ESA is down on that host.

    If the ESA cluster mode is inactive, the output simply indicates it as follows:

    Cluster mode inactive.

    For more details about the command, refer to the Command: Cluster Status section of ESA Setup and Configuration, Reference [10].
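
The status fields above can also be parsed by a script. The following is a minimal sketch that identifies the ESA master and slave hosts from the esaclusterstatus output format shown above; the sample variable mirrors that output, and on a live SC you would pipe esaclusterstatus instead.

```shell
# Sample mirroring the esaclusterstatus output shown above.
sample_output='M * OAM1 10.22.0.1
(M) OAM2 10.22.0.2'

# On the master line the first field is "M" and the host is the 3rd field
# (after the "*"); on the slave line "(M)" is followed directly by the host.
master_host=$(printf '%s\n' "$sample_output" | awk '$1 == "M"   {print $3}')
slave_host=$(printf '%s\n' "$sample_output" | awk '$1 == "(M)" {print $2}')
echo "ESA master on: $master_host"
echo "ESA slave on:  $slave_host"
```

Note that the field positions assumed here follow the two-column sample without the -v option; the -v output shown above has additional esama/esafma fields.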

2.1.6   Check Database Consistency between Database Clusters

Use the cudbCheckConsistency command to perform a lightweight database consistency check between database clusters (that is, between the master PLDB or DSG replicas and their slaves) as follows:

SC_2_1# cudbCheckConsistency

For more details about the command, refer to CUDB Node Commands and Parameters, Reference [6].

2.1.7   cudbConsistencyMgr

Use the cudbConsistencyMgr command to check database consistency between database clusters (that is, between the master PLDB or DSG replicas and their slaves) as follows:

SC_2_1# cudbConsistencyMgr --order ms --node <nodeid> {--dsg <dsgid> | --pl}

For more details about the command, refer to CUDB Node Commands and Parameters, Reference [6].

2.1.8   Check Replication Channels Behavior

Replication channels can be checked on CUDB system level by generating dummy transactions, and then verifying that those transactions are replicated properly to each DSG and PLDB slave replica.

Use the cudbCheckReplication command as follows to generate dummy transactions:

SC_2_1# cudbCheckReplication

For more details about the command, refer to CUDB Node Commands and Parameters, Reference [6].

2.2   UDC Cockpit Tool

To follow the present system status of CUDB nodes and recall earlier status and performance information, use the UDC Cockpit. This is a monitoring application that presents the collected data on a single, web-based GUI.

3   Troubleshooting Procedure

A troubleshooting workflow is shown in Figure 1.

Figure 1   Troubleshooting Workflow

4   Trouble Reporting

Problems identified that cannot be solved by using this document must be reported to the next level of maintenance support through a Customer Service Report (CSR).

The details of the trouble reporting process are outside the scope of this document.

When collecting information for further support, ensure that all current logs are recorded, and note the time and date of the logs.

For more information on how to collect information, refer to Data Collection Guideline for CUDB, Reference [9].

When sending crash dumps, ensure that the dump corresponds to the actual scenario, and note the time and date of the dump.


Glossary

For the terms, definitions, acronyms and abbreviations used in this document, refer to CUDB Glossary of Terms and Acronyms, Reference [3].


Reference List

CUDB Documents
[1] CUDB Node Preventive Maintenance.
[2] CUDB Users and Passwords, 3/00651-HDA 104 03/10.
[3] CUDB Glossary of Terms and Acronyms.
[4] Trademark Information.
[5] Typographic Conventions.
[6] CUDB Node Commands and Parameters.
[7] Preventive Maintenance, Logchecker Found Error(s).
[8] CUDB Logchecker.
[9] Data Collection Guideline for CUDB.
Other Ericsson Documents
[10] ESA Setup and Configuration.


Copyright

© Ericsson AB 2016-2018. All rights reserved. No part of this document may be reproduced in any form without the written permission of the copyright owner.

Disclaimer

The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.

Trademark List
All trademarks mentioned herein are the property of their respective owners. These are shown in the document Trademark Information.
