Service Permanently Stopped
Cloud Execution Environment

Contents

1   Introduction
1.1   Alarm Description
1.2   Prerequisites

2   Procedure

1   Introduction

This instruction concerns alarm handling.

1.1   Alarm Description

The Service Permanently Stopped alarm is issued if a service running on a vCIC or Compute node stops permanently.

The possible alarm cause and the corresponding fault reasons, fault locations and impacts are described in Table 1.

Table 1    Alarm Causes

Alarm Cause:
The service indicated in the Service field of the Managed Object Instance attribute stopped permanently.

Description:
The service monitoring functionality has detected that the service indicated in the Service field of the Managed Object Instance attribute stopped permanently.

Fault Reason:
  • Misconfiguration
  • Other undetermined reasons

Fault Location:
The vCIC or Compute node indicated in the Node field of the Managed Object Instance attribute

Impact:
In case a service is running in active-active mode (for example, nova-api) on vCIC, the corresponding performance is lower and the impacted functions do not operate.

In the case of a local service (for example, nova-compute service), the function does not work at all on the node.

Note: The alarm can appear as a result of maintenance activity.

The alarm attributes are listed in Table 2.

Table 2    Alarm Attributes

Attribute Name            Attribute Value

Major Type                193
Minor Type                2031715
Managed Object Class      Service
Managed Object Instance   Region=<name_of_the_region>,
                          CeeFunction=1,
                          Node=<hostname_of_the_node>,
                          Service=<service_name>
Specific Problem          Service Permanently Stopped
Event Type                processingErrorAlarm (4)
Probable Cause            softwareProgramAbnormallyTerminated (100545)
Additional Text           On node <hostname_of_the_node> <service_name> has been permanently stopped.
Severity                  MAJOR (4)
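
The Node and Service fields of the Managed Object Instance attribute identify where the fault is located and which service stopped. As an illustration only, the following sketch extracts a single field from such a string. It assumes the attribute is available as one comma-separated string of key=value pairs. The function name moi_field and the sample values are hypothetical and are not part of the product.

```shell
#!/bin/sh
# Print the value of one field from a Managed Object Instance string.
# Usage: moi_field "<managed_object_instance>" <FieldName>
# Assumption: fields are comma-separated key=value pairs, optionally
# followed by spaces or line breaks.
moi_field() {
    printf '%s\n' "$1" | tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        awk -F= -v key="$2" '$1 == key { print substr($0, length(key) + 2); exit }'
}

# Hypothetical sample values; real values are taken from the alarm.
moi='Region=RegionOne,CeeFunction=1,Node=compute-0-3,Service=nova-compute'
moi_field "$moi" Node       # prints compute-0-3
moi_field "$moi" Service    # prints nova-compute
```

The value printed for Node corresponds to <hostname_of_the_node> and the value printed for Service to <service_name> in the procedure.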

1.2   Prerequisites

This section provides information on the documents, tools, and conditions that apply to the procedure.

1.2.1   Documents

Not applicable.

1.2.2   Tools

No tools are required.

1.2.3   Conditions

Before starting this procedure, ensure that the following condition is met:

2   Procedure

This section describes the procedure to follow when this alarm is received.

Do the following:

  1. If the affected node is not a Compute node, continue with Step 3.
  2. If the fault is detected at a Compute node, perform the relevant action:
    1. If the alarm is not issued by the nova-compute service, try to move the virtual machines (VMs) by using the following command with the <hostname_of_the_node> reported in the alarm:
      for VM in $(nova list --host <hostname_of_the_node>); do nova forcemove $VM; done
    2. If the alarm is issued by the nova-compute service, log on to the affected Compute node as root and reboot it:

      ssh root@<Compute_node>
      reboot -f

  3. Collect troubleshooting data as described in the Data Collection Guideline. For alarm-specific logs, refer to the Data Collection for Alarms and Alerts table in the Data Collection Guideline.
  4. Consult the next level of maintenance support. Further actions are outside the scope of this instruction.
  5. The job is completed.
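
The branching in Steps 1 and 2 can be summarized by the following sketch. It is not part of the product. The function name next_action, the node-type labels compute and vcic, and the returned action names are hypothetical. The alarm reports only a hostname, so the sketch assumes that the operator already knows the role of the node.

```shell
#!/bin/sh
# Print the first action of the procedure for a received alarm.
# $1: node type, "compute" or anything else (hypothetical labels)
# $2: service name from the Service field of the alarm
next_action() {
    if [ "$1" != "compute" ]; then
        echo "collect-data"     # Step 1: not a Compute node, go to Step 3
    elif [ "$2" = "nova-compute" ]; then
        echo "reboot-node"      # Step 2.2: nova-compute stopped, reboot the node
    else
        echo "move-vms"         # Step 2.1: another service stopped, move the VMs
    fi
}

next_action compute nova-compute     # prints reboot-node
next_action compute other-service    # prints move-vms
next_action vcic nova-api            # prints collect-data
```

In every case the procedure then continues with data collection in Step 3.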


Copyright

© Ericsson AB 2016. All rights reserved. No part of this document may be reproduced in any form without the written permission of the copyright owner.

Disclaimer

The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.

Trademark List
All trademarks mentioned herein are the property of their respective owners. These are shown in the document Trademark Information.
