1 Introduction
This document describes the performance management solution and Key Performance Indicators (KPIs) provided by Ericsson Centralized User Database (CUDB).
1.3 Target Groups
1.4 Prerequisites
1.5 Typographic Conventions
Typographic conventions can be found in the following document:
2 Counters in CUDB
2.1 Overview
A set of counter groups is provided for each CUDB node, containing performance data for the following:
More details about the information provided by CUDB counters can be found in CUDB Counters List.
Note: As part of the integration of different application Front Ends (FEs), CUDB also provides the Application Counters Framework. The framework makes it possible for application FEs to have CUDB gather and publish performance management information about their application data stored in CUDB (on behalf of the application FEs). For more information about this framework, refer to CUDB Application Counters.
A special set of counters, the KPIs, is distributed in the "Overall CUDB node performance" and "Database cluster" counter groups. For more information about KPIs and their purpose, see CUDB KPIs.
2.2 Counter Generation and Publishing
CUDB counters are generated and published independently on each CUDB node, and are available only on that node. They are not replicated to the rest of the CUDB system.
The generation of counter value samples and publishing of counter data are independent processes, with different execution periods:
Counters are published in 3GPP XML format and can be found in the following output location:
/home/cudb/oam/performanceMgmt/output/
The file format is described in ESA XML Interface for Performance Management.
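As an illustration of how such a file can be consumed, the sketch below parses a minimal measurement file with the Python standard library. The element names used here (measCollecFile, measInfo, measTypes, measValue, measResults) follow the generic 3GPP measurement file layout and are assumptions for the example; the authoritative schema is the one described in ESA XML Interface for Performance Management.

```python
# Illustrative sketch: extracting counter values from a 3GPP-style
# performance measurement XML file. Element names are assumptions based
# on the generic 3GPP layout, not the exact ESA schema.
import xml.etree.ElementTree as ET

SAMPLE = """<measCollecFile>
  <measData>
    <measInfo>
      <measTypes>processedLDAPReqsRemoteNodes memoryUsage</measTypes>
      <measValue measObjLdn="CUDB=1">
        <measResults>1042 63</measResults>
      </measValue>
    </measInfo>
  </measData>
</measCollecFile>"""

def parse_measurements(xml_text):
    """Return {counterName: value} pairs from the measInfo elements."""
    root = ET.fromstring(xml_text)
    result = {}
    for info in root.iter("measInfo"):
        # Counter names and their values are whitespace-separated,
        # position-aligned lists in measTypes and measResults.
        names = info.findtext("measTypes").split()
        for mv in info.iter("measValue"):
            values = mv.findtext("measResults").split()
            result.update(zip(names, (int(v) for v in values)))
    return result

print(parse_measurements(SAMPLE))
```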
Depending on counter type, the files contain the following information:
For gauge counters:
For accumulated counters:
Counter users collect CUDB counter values by copying the generated files from the output location. It is recommended to retrieve the output files with the cudbadmin user through the SFTP protocol. Refer to CUDB Users and Passwords for more information on user credentials.
2.3 Configuring Counter Output File Names
The filenames of these counter output files are based on the following format:
A<date>.<starttime>-<stoptime>-<jobname>_<networkElementName>.xml
The variables in the above file name are the following:
| Variable | Description |
|---|---|
| <date> | The date of the measurement, in format YYYYMMDD. |
| <starttime> | The start time of the measurement, in format HHMM. |
| <stoptime> | The stop time of the measurement, in format HHMM. |
| <jobname> | The job name of the measurement. |
| <networkElementName> (1) | A string used as a unique identity representing the node that runs the ESA. |

(1) <networkElementName> can be configured.
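As a minimal sketch of how the fields of such a filename can be split out, the regular expression below follows the format documented above. The field semantics are defined authoritatively in ESA Performance Management; this parser is an illustration only.

```python
import re

# Hypothetical helper that splits a counter output filename into its
# fields, following the documented format:
# A<date>.<starttime>-<stoptime>-<jobname>_<networkElementName>.xml
FILENAME_RE = re.compile(
    r"A(?P<date>\d{8})\."          # <date> as YYYYMMDD
    r"(?P<starttime>\d{4})-"       # <starttime> as HHMM
    r"(?P<stoptime>\d{4})-"        # <stoptime> as HHMM
    r"(?P<jobname>[^_]+)_"         # <jobname>
    r"(?P<networkElementName>.+)"  # configured node identity
    r"\.xml$"
)

def parse_counter_filename(name):
    """Return the filename fields as a dict, or None if it does not match."""
    m = FILENAME_RE.match(name)
    return m.groupdict() if m else None

# Example with a hypothetical job name and node identity:
fields = parse_counter_filename("A20240131.1200-1215-job5m_cudb01.xml")
print(fields)
```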
Refer to ESA Performance Management for a complete description of the file names.
The <networkElementName> parameter is set through CUDB configuration CLI, by setting the value of the <networkElementName> configuration attribute. For more information, refer to CUDB Node Configuration Data Model Description.
Refer to the Object Model Modification Procedure for more information on all the steps required to modify the object model.
2.4 CUDB KPIs
CUDB KPIs are a special set of CUDB counters, available for CUDB systems deployed on native BSP 8100 and on vCUDB, that help users evaluate and quantify the usage of the processing and memory capacity of certain CUDB resources.
The different types of KPIs per CUDB node type are as follows:
Like other CUDB counters, KPIs are generated every minute. The counter values can be published every 5 or 15 minutes, but in both cases, for load-related and drop ratio indicators, the counter is a rolling average of the values collected during the previous 15-minute monitoring period, updated each minute. Memory usage KPI counters are not averaged. KPI values are available in 3GPP file format, like the rest of the CUDB counters.
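The rolling-average behavior described above can be sketched as follows: a load KPI keeps the one-minute samples of the previous 15 minutes and reports their mean, recomputed each minute. This mirrors the published semantics only; the actual CUDB implementation is internal.

```python
from collections import deque

# Minimal sketch of a load-related KPI as a 15-minute rolling average,
# updated once per one-minute sample. Illustrative, not CUDB code.
class RollingKpi:
    def __init__(self, window_minutes=15):
        # Old samples fall out automatically once the window is full.
        self.samples = deque(maxlen=window_minutes)

    def add_minute_sample(self, value):
        self.samples.append(value)

    def value(self):
        """Rolling average over up to the last 15 one-minute samples."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

kpi = RollingKpi()
for load in [40, 50, 60]:
    kpi.add_minute_sample(load)
print(kpi.value())  # 50.0
```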
Refer to CUDB Counters List for further information on KPIs.
2.4.1 Guidelines for CUDB KPIs
The KPIs are distinct for resources located in each CUDB node.
Note: Replicas located in different CUDB nodes can have different KPI values.
To see how structure and configuration of a CUDB system can affect different KPIs, see Effects of Structure and Configuration on CUDB Counters.
KPI guidelines are given for normal CUDB operation, which is described as follows:
Note: No KPI-related guideline is given for any scenario other than normal CUDB operation.
While the processing load is an important index of CUDB performance, it may not be the limiting factor in all cases. Specific combinations of traffic from different applications towards a particular CUDB node, or the way the network was deployed, including network latencies, may also limit the overall throughput of a CUDB system while the processing load remains nominal.
Therefore, in addition to processing load, further indicators are necessary to measure CUDB performance. The drop ratio KPIs provide an early warning when limits of the system are close to being exceeded and, as a result, not all received traffic is successfully handled.
The drop ratio KPIs are very fine-grained so that the user is alerted whenever the indicator is non-zero, even if the rejection rate is still very low. Accordingly, one-hundredth of a percent (0.01% = 0.1‰) is used as the unit of measurement for the drop ratio KPIs. For example, a kpiRatioDropped value of 5 indicates a rejection rate of 0.05%.
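The unit conversion can be made explicit with a small helper, shown here as an illustration of the reading given above (the function name is hypothetical):

```python
# The drop ratio KPIs are expressed in units of 0.01% (one-hundredth of
# a percent). This hypothetical helper converts a raw counter value to
# a percentage for human-readable reporting.
def drop_ratio_to_percent(counter_value):
    """Convert a kpiRatioDropped* counter value to a percentage."""
    return counter_value / 100.0

print(drop_ratio_to_percent(5))  # 0.05, i.e. a 0.05% rejection rate
```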
Another indicator of CUDB performance is the level of database cluster occupancy, provided by the memoryUsage for each database cluster counter.
Prior to checking the value of a KPI counter, ensure that there are no active alarms in the system affecting the CUDB node or CUDB resource that the specific KPI counter is associated with. Refer to CUDB Health Check for instructions on how to check the list of active alarms.
An overview of the procedure for following up on the value of a KPI counter associated with a CUDB resource is shown in Figure 1.
2.4.1.1 LDAP KPIs
| KPI counter name | Threshold | Additional information and recommendation |
|---|---|---|
| kpiLdapFeLoad | 70 – 80% | Traffic rejection starts at high CPU usage, above 70% – 80%. If this threshold is reached occasionally, continue monitoring the indicator. |
| kpiRatioDroppedLdap | 0 | If this KPI counter exceeds zero on a regular basis, check kpiRatioDroppedLdap in the other CUDB nodes and check the scale-out policy to see whether scale-out is the appropriate action, or revise the connectivity map between application FEs and CUDB nodes. Refer to Creating New DSG for more information. |
| kpiLdapFesLoadUnbalance | 10% | If this KPI counter exceeds 10%, rebalance the TCP connections on the LDAP FEs with the cudbLdapFeRestart command. Refer to CUDB Node Commands and Parameters for more information on the cudbLdapFeRestart command and its options. A counter value below 10% is also a prerequisite for the scale-out procedure; refer to Creating New DSG for more information. If this KPI does not drop below the threshold value after the connections are rebalanced, contact Ericsson support personnel. |
| kpiDsMemoryUsageUnbalance | 10% | If this KPI counter exceeds 10%, contact Ericsson support to perform defragmentation of each DSG for which there is a DS unit in the CUDB node. Refer to CUDB System Administrator Guide for more information on defragmentation, and to CUDB Technical Product Description for more information on the relationship between DSG and DS. A counter value below 10% is also a prerequisite for the scale-out procedure; refer to Creating New DSG for more information. If this KPI does not drop below the threshold value after defragmentation, perform a reallocation of subscribers to even out memory usage on the DSGs whose data is stored in those DS units, moving the distributed data out of the DSG with higher occupation into the emptier DSGs. Refer to CUDB Node Commands and Parameters for more information on the cudbReallocate command and its options. |
2.4.1.2 PLDB KPIs
| KPI counter name | Threshold | Additional information and recommendation |
|---|---|---|
| kpiClusterLoad | 70 – 80% | Traffic rejection starts at high CPU usage. If the threshold is reached occasionally, continue monitoring the indicator. If the threshold is exceeded on a regular basis, check whether kpiClusterLoad is at similarly high levels in the other PLDB replicas. If that is the case, contact Ericsson support and evaluate the need to expand the system. Otherwise, if kpiClusterLoad is lower in the other PLDB replica(s), revise the connectivity map between application FEs and CUDB nodes and contact Ericsson support if needed. |
| kpiRatioDroppedCluster | 0 | If the value of this KPI counter exceeds zero on a regular basis, check kpiRatioDroppedCluster in the other PLDB replicas and, with Ericsson support, evaluate the need to expand the system or revise the connectivity map between application FEs and CUDB nodes. |
| memoryUsage | 75 | If this threshold has been reached and the "Storage Engine, Memory Usage Too High In PLDB, Warning" alarm has not been addressed yet, follow the instructions specified in Storage Engine, Memory Usage Too High In PLDB, Warning. If the actions described in the OPI do not lower the counter below the threshold, contact Ericsson support and evaluate the need to perform a PLDB expansion. |
2.4.1.3 DS KPIs
| KPI counter name | Threshold | Additional information and recommendation |
|---|---|---|
| kpiClusterLoad | 40 – 50% | Traffic rejection starts at high CPU usage, above 70% – 80%, but due to high availability within a DS, the recommended KPI threshold is 40% – 50%. If one of the database processes fails within a DSG replica, the surviving process keeps providing the database service without noticeable traffic impact; at the same time, under normal operation with no process failure, the DSGs are prepared to cope with high load. If the threshold is reached occasionally, continue monitoring the indicator. If the kpiClusterLoad threshold is exceeded on a regular basis in a DSG, check whether kpiClusterLoad is at similarly high levels in the other master DSG replicas. If that is the case, check the scale-out policy to see whether scale-out is the appropriate action; refer to Creating New DSG for more information. Otherwise, if the threshold is exceeded in just one or a few DSGs, consider reallocating data from the highly occupied DSGs towards DSG(s) with lower occupancy levels. Refer to CUDB Multiple Geographical Areas for additional information. |
| kpiRatioDroppedCluster | 0 | If this KPI counter exceeds zero on a regular basis, check kpiRatioDroppedCluster in the other master DSG replicas and, with Ericsson support, evaluate the need for reallocation or for expansion with additional DSG(s) by checking the scale-out policy. Refer to Creating New DSG for more information. |
| memoryUsage | 75 | If this threshold has been reached and the "Storage Engine, Memory Usage Too High In DS, Warning Threshold Reached" alarm has not been addressed yet, follow the instructions specified in Storage Engine, Memory Usage Too High In DS, Warning Threshold Reached. If the actions described in the OPI do not lower the KPI below the threshold, check the scale-out policy to see whether scale-out (DSG expansion) is the appropriate action. Refer to Creating New DSG for more information. |
2.5 Effects of Structure and Configuration on CUDB Counters
To properly understand and interpret counter values, important aspects of CUDB data access, architecture, and features need to be taken into account. The relationship of these factors with CUDB counters is described in the following sections, together with some general considerations.
2.5.1 Master Distribution
Depending on the configured combinations of readModeInDS and readModeInPL configuration parameters, master DS replicas may receive higher amounts of traffic compared to slave replicas within the same DSG.
This will be reflected in the following counter values:
Master PLDB replicas may receive higher amounts of traffic compared to the slave replicas during provisioning. This will be reflected in the following counter values:
If a node hosts multiple master replicas, the values of the following counters may be higher compared to nodes with fewer master replicas:
For more information on readModeInDS and readModeInPL, refer to CUDB Node Configuration Data Model Description and CUDB LDAP Data Access.
2.5.2 Distribution of Subscriber Profiles
Depending on the reading mode configuration of LDAP users, higher memory occupation in a DSG may result in its master replica receiving more traffic. In terms of CUDB counters, this means that DSG master replicas with higher memoryUsage, Dsn counter values may also have higher values than master replicas of other DSGs for the following counters:
A higher active/inactive subscriber ratio in a DSG may also result in its master replica receiving more traffic. Such master replicas may have higher values of the same counters as listed above, compared to master replicas of other DSGs in the system.
2.5.3 Application FE Connections
The CUDB nodes that are the primary targets for Application FE connections will receive most of the traffic intended for a CUDB System. Depending on the master distribution in the system and the reading mode configuration of LDAP users, such traffic may either end at the primarily affected nodes or be proxied to other nodes in the system.
If the CUDB nodes connected to Application FEs do not host many master replicas, they may have a high number of proxied requests, resulting in a higher value of processedLDAPReqsRemoteNodes than other nodes of the system.
If there are no nodes in the system with a high concentration of master replicas, nodes with Application FE connections will have higher values than other nodes in the system for the following counters:
Otherwise, depending on the reading mode configuration of LDAP users, nodes with a concentration of master replicas may have the highest values for the listed counters.
2.5.4 Network Issues
Increased network latency can result in a higher number of failed proxied requests, reflected in an increased value of nonProcessedLdapReqsRemoteNodes.
Network issues in communication with Notification end points can result in failed SOAP notifications, reflected in increased notificationsFailed counter values.
2.5.5 Overload Protection and Load Regulation
Incidents in the core network or on UDC solution level can cause high traffic and trigger the overload protection and load regulation mechanisms, resulting in an increased value of the dropped requests counters:
2.5.6 General Considerations
CUDB maintenance operations can impact local redundancy of a CUDB node or cause high network, storage, and processing load, resulting in an increase of dropped or failed requests as well as the load related and drop ratio KPI counter values.
Infrastructure problems or maintenance can impact the capacity and availability of network, storage, and processing resources, resulting in an increase of dropped or failed requests as well as the load related and drop ratio KPI counter values.
Reference List
- CUDB Counters List
- CUDB LDAP Data Access
- CUDB Application Counters
- CUDB Node Configuration Data Model Description
- CUDB Health Check
- CUDB Multiple Geographical Areas
- Storage Engine, Memory Usage Too High In PLDB, Warning
- Storage Engine, Memory Usage Too High In DS, Warning Threshold Reached
- CUDB Node Commands and Parameters
- CUDB System Administrator Guide
- CUDB Technical Product Description
- CUDB Users and Passwords
- CUDB Glossary of Terms and Acronyms
