| 1 | Introduction |
| 1.1 | Alert Description |
| 1.2 | Prerequisites |
| 2 | Procedure |
| 2.1 | Actions |
1 Introduction
This instruction concerns alert handling.
1.1 Alert Description
The CM-HA Service Restarted alert is issued when a periodic check finds that the memory consumption of Continuous Monitoring High Availability (CM-HA) has reached the threshold, and CM-HA is restarted as a result. The memory usage threshold is 4 GB, and the check period is one hour.
The severity of the alert is WARNING.
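The check described above can be sketched as follows. This is only an illustration, not the CM-HA implementation: the function name `should_restart` and both constant names are hypothetical, and the sketch assumes that "4 GB" means 4 × 1024³ bytes and that "reaches" means greater than or equal to. Only the 4 GB and one-hour figures come from this instruction.

```python
# Hypothetical sketch of the periodic memory check; all names are illustrative.
MEMORY_THRESHOLD_BYTES = 4 * 1024**3  # 4 GB, assumed here to mean 4 GiB
CHECK_INTERVAL_SECONDS = 60 * 60      # the check period is one hour


def should_restart(memory_usage_bytes: int) -> bool:
    """Return True when the measured usage has reached the threshold."""
    return memory_usage_bytes >= MEMORY_THRESHOLD_BYTES


print(should_restart(3 * 1024**3))  # usage below the threshold
print(should_restart(4 * 1024**3))  # usage exactly at the threshold
```

A caller would evaluate `should_restart` once per `CHECK_INTERVAL_SECONDS` and, on a positive result, restart the service and raise the alert.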
The possible alert causes, corresponding fault reasons, fault locations, and impacts are described in Table 1.
| Alert Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| Memory utilization of CM-HA is high, and this triggers CM-HA restart. | The alert is sent when the memory utilization of CM-HA exceeds the threshold level. To free up memory, CM-HA restarts. | In some cases the Python interpreter does not free up memory used by CM-HA properly, and CM-HA needs to restart. | Controller nodes | System capacity can be degraded if memory utilization exceeds the threshold. |
The alert attributes are listed in Table 2.
| Attribute Name | Attribute Value |
|---|---|
| Major Type | 193 |
| Minor Type | 2031714 |
| Managed Object Class | CM-HA |
| Managed Object Instance | Region=<name_of_the_region>, |
| Specific Problem | CM-HA Service Restarted |
| Event Type | other (1) |
| Probable Cause | m3100Indeterminate (0) |
| Additional Text | CM-HA service is restarted because of its memory consumption reached the threshold value |
| Severity | WARNING (6) |
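The attributes in Table 2 can be used to pick this alert out of a mixed alert stream. The sketch below is a hedged example: the dictionary layout, the key names, and the function `is_cm_ha_restart_alert` are assumptions made for illustration, and the actual alert format depends on the monitoring system in use. Only the attribute values are taken from Table 2.

```python
# Identifying values copied from Table 2; key names and layout are assumptions.
CM_HA_RESTART_ALERT = {
    "major_type": 193,
    "minor_type": 2031714,
    "specific_problem": "CM-HA Service Restarted",
}


def is_cm_ha_restart_alert(alert: dict) -> bool:
    """Return True if every identifying attribute matches Table 2."""
    return all(alert.get(key) == value
               for key, value in CM_HA_RESTART_ALERT.items())


# Hypothetical incoming alert, represented as a plain dictionary.
sample = {
    "major_type": 193,
    "minor_type": 2031714,
    "specific_problem": "CM-HA Service Restarted",
    "severity": "WARNING",
}
print(is_cm_ha_restart_alert(sample))
```

Matching on several attributes together, rather than on the Specific Problem text alone, reduces the risk of a false match if another alert reuses similar wording.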
1.2 Prerequisites
This section provides information on the documents, tools, and conditions that apply to the procedure.
1.2.1 Documents
Not applicable.
1.2.2 Tools
No tools are required.
1.2.3 Conditions
No conditions.
2 Procedure
This section describes the procedure to follow when this alert is received.
2.1 Actions
When the CM-HA Service Restarted alert is issued, CM-HA has already recovered. Normally, no further actions are necessary.
If the alert is observed more than once a week, perform the following:
- Collect troubleshooting data as described in the Data Collection Guideline.
- Consult the next level of maintenance support. Further actions are outside the scope of this instruction.
- The job is completed.
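The "more than once a week" condition above can be checked against a list of alert timestamps. The following is a minimal sketch under stated assumptions: the function name `needs_escalation` is hypothetical, and "more than once a week" is interpreted as two alerts less than seven days apart. The instruction does not define the window more precisely.

```python
from datetime import datetime, timedelta


def needs_escalation(alert_times: list[datetime],
                     window: timedelta = timedelta(days=7)) -> bool:
    """Return True if any two alerts are less than `window` apart."""
    ordered = sorted(alert_times)
    # Comparing consecutive alerts is enough once the list is sorted.
    return any(later - earlier < window
               for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical timestamps taken from an alert log.
first = datetime(2024, 1, 1, 8, 0)
print(needs_escalation([first, first + timedelta(days=3)]))
print(needs_escalation([first, first + timedelta(days=10)]))
```

If the function returns `True`, the data collection and escalation steps apply; otherwise no further action is needed.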
