| 1 | Introduction |
| 1.1 | Alert Description |
| 1.2 | Prerequisites |
| 2 | Procedure |
| 2.1 | Analysis |
| 2.2 | Actions |
1 Introduction
This instruction describes how to handle the Service Stopped alert.
1.1 Alert Description
The Service Stopped alert is issued if a service running on a vCIC or compute node stops.
The possible alert cause and the corresponding fault reasons, fault locations and impacts are described in Table 1.
| Alert Cause | Description | Fault Reason | Fault Location | Impact |
|---|---|---|---|---|
| The service indicated in the Service field of the Managed Object | The service monitoring functionality has detected that the service indicated in the Service field of the Managed Object Instance attribute has stopped. | | The vCIC or compute node indicated in the Node field of the Managed Object Instance attribute | If the service runs in active-active mode on a vCIC (for example, nova-api), the corresponding performance is reduced. |
- Note: The alert can appear as a result of maintenance activity.
The alert attributes are listed in Table 2.
| Attribute Name | Attribute Value |
|---|---|
| Major Type | 193 |
| Minor Type | 2031710 |
| Managed Object Class | Service |
| Managed Object Instance | Region=<name_of_the_region>, |
| Specific Problem | Service Stopped |
| Event Type | Other (1) |
| Probable Cause | m3100Indeterminate (0) |
| Additional Text | On node <hostname_of_the_node> <service_name> has been stopped. |
| Severity | WARNING (6) |
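When alerts are processed by a script rather than read by an operator, the node and service names can be recovered from the Additional Text attribute. The following is a minimal sketch, assuming the text follows the template in Table 2 exactly and that hostnames contain no spaces. The function name `parse_additional_text` and the sample hostname in the usage note are hypothetical and not part of the product.

```python
import re

# Pattern mirrors the Additional Text template from Table 2:
#   "On node <hostname_of_the_node> <service_name> has been stopped."
# Assumption: the hostname contains no whitespace.
ADDITIONAL_TEXT_RE = re.compile(
    r"^On node (?P<node>\S+) (?P<service>.+?) has been stopped\.$"
)


def parse_additional_text(text):
    """Return (node, service) from an alert's Additional Text, or None if it does not match."""
    match = ADDITIONAL_TEXT_RE.match(text.strip())
    if match is None:
        return None
    return match.group("node"), match.group("service")
```

For example, `parse_additional_text("On node compute-0-3 nova-compute has been stopped.")` yields the pair `("compute-0-3", "nova-compute")`, where `compute-0-3` is an invented hostname used only for illustration.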
1.2 Prerequisites
This section provides information on the documents, tools, and conditions that apply to the procedure.
1.2.1 Documents
Not applicable.
1.2.2 Tools
No tools are required.
1.2.3 Conditions
Before starting this procedure, ensure that the following condition is met:
- The alert was not issued due to ongoing planned maintenance. If the alert was issued due to ongoing planned maintenance, no further actions are required.
2 Procedure
This section describes the procedure to follow when this alert is received.
2.1 Analysis
Do the following to analyze the alert:
1. Check if the Service Permanently Stopped alarm is issued for the same service.
   - If the Service Permanently Stopped alarm is issued, refer to Service Permanently Stopped, and exit this procedure.
   - If the Service Permanently Stopped alarm is not issued, continue with Step 2.
2. Count the number of alert occurrences in a 10-minute period and perform the relevant action:
   - If the alert occurs fewer than five times in 10 minutes, no action is needed. The job is completed, as the service has been recovered by service supervision.
   - If the alert occurs five or more times in 10 minutes, continue with Section 2.2.
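The counting rule above can be sketched in a few lines of Python. This is one possible reading of the rule, not the product's own logic: it assumes the alert timestamps have already been collected, treats the 10-minute period as a sliding window with inclusive boundaries, and uses hypothetical function names (`max_occurrences_in_window`, `next_step`).

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # period named in the analysis step
THRESHOLD = 5                   # five or more occurrences lead to Section 2.2


def max_occurrences_in_window(timestamps, window=WINDOW):
    """Return the largest number of alerts that fall within any span of length `window`."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end, ts in enumerate(ordered):
        # Advance the left edge until the span between first and last alert fits the window.
        while ts - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def next_step(timestamps, permanently_stopped=False):
    """Map the analysis outcome to the next action described in Section 2.1."""
    if permanently_stopped:
        return "refer to Service Permanently Stopped"
    if max_occurrences_in_window(timestamps) >= THRESHOLD:
        return "continue with Section 2.2"
    return "no action needed"
```

With five alerts spaced two minutes apart, `next_step` returns `"continue with Section 2.2"`; with four alerts in the same period it returns `"no action needed"`.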
2.2 Actions
Do the following:
1. Depending on the node type, perform the relevant action:
   for VM in $(nova list --host <hostname_of_the_node>); do nova forcemove $VM; done
2. If the alert does not reoccur within 10 minutes after moving the VMs, the job is completed. Otherwise, continue with Step 3.
3. Collect troubleshooting data as described in the Data Collection Guideline. For alarm-specific logs, refer to the table Data Collection for Alarms and Alerts in the Data Collection Guideline.
4. Consult the next level of maintenance support. Further actions are outside the scope of this instruction.
5. The job is completed.
