1 Introduction
This instruction concerns alarm handling.
1.1 Alarm Description
The Fuel Failed alarm is issued by the Managed Object (MO) Fuel. The alarm is issued when the periodic supervision algorithm detects that the vFuel node has failed the availability test three consecutive times and then remains unavailable for at least five minutes.
The severity of the alarm is MAJOR or CLEARED.
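The alarm condition described above can be sketched as a simple predicate. This is an illustrative sketch only; `fuel_alarm_severity` and its arguments are hypothetical names, not part of the actual supervision implementation:

```shell
# Illustrative sketch only: the real supervision algorithm is internal to the
# Fuel MO; the function name and argument handling here are hypothetical.
# Alarm condition: three consecutive failed availability tests, followed by
# at least five minutes of continued unavailability.
fuel_alarm_severity() {
    consecutive_failures=$1   # consecutive failed availability tests
    unavailable_minutes=$2    # minutes unavailable after the failed tests
    if [ "$consecutive_failures" -ge 3 ] && [ "$unavailable_minutes" -ge 5 ]; then
        echo "MAJOR"
    else
        echo "CLEARED"     # no alarm condition, or alarm cleared again
    fi
}
```

For example, `fuel_alarm_severity 3 5` prints `MAJOR`, while `fuel_alarm_severity 2 10` prints `CLEARED`.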
The possible alarm causes and fault locations are explained in Table 1.
| Alarm | Description | Fault Cause | Fault Location | Impact |
|---|---|---|---|---|
| The vFuel node is not available. | The vFuel node has failed the availability test three consecutive times and, after that, remains unavailable for more than five minutes. | vFuel node malfunction | vFuel node | The vFuel node becomes permanently unavailable. |
If the alarm is not resolved, the vFuel node remains unavailable.
The alarm attributes are listed in Table 2.
| Attribute Name | Attribute Value |
|---|---|
| Major Type | 193 |
| Minor Type | 2031706 |
| Managed Object Class | Fuel |
| Managed Object Instance | Region=<name_of_the_region>, |
| Specific Problem | Fuel Failed |
| Event Type | other (1) |
| Probable Cause | m3100Unavailable(14) |
| Additional Text | N/A |
| Severity | MAJOR (4) or CLEARED |
1.2 Prerequisites
This section provides information on the documents, tools, and conditions that apply to the procedure.
1.2.1 Documents
Not applicable.
1.2.2 Tools
Before starting this procedure, ensure that the following tools are available for establishing a console connection to the affected server, as required by the hardware manufacturer's documentation:
- A console cable
- A Linux computer with SSH capability
1.2.3 Conditions
Before starting this procedure, ensure that the following conditions are met:
- You have access to information about how to establish a console connection to the hardware.
- The password for the root user in vFuel is known.
2 Procedure
This section describes the procedure to follow when this alarm is received.
2.1 Actions
Perform the following steps:

1. Access the failed node by SSH:

   ssh <user_id>@<fuel_address>

   If the personal user ID does not work, use the ceeadm user ID.

   The following scenarios are possible:

2. Issue the following command to reboot the node:

   sudo reboot -f

   Note: The user must have sudo privileges to be able to run this command.

   After a successful reboot, proceed to Step 3.

3. Wait at least five minutes to see whether the alarm has ceased.

   - If the reboot solved the problem and the alarm has ceased, exit this procedure.
   - If the reboot did not solve the problem, collect troubleshooting data as described in the Data Collection Guideline and contact the next level of maintenance support.
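The five-minute wait described above can be scripted as a simple poll loop. This is a sketch only; `check_alarm_active` is a hypothetical placeholder for whatever alarm-listing command your deployment provides:

```shell
# Sketch of the wait: poll for up to five minutes, periodically checking
# whether the Fuel Failed alarm is still active. check_alarm_active is a
# hypothetical placeholder; substitute your system's alarm-listing command.
wait_for_alarm_to_cease() {
    elapsed=0
    while [ "$elapsed" -lt 300 ]; do       # five minutes in total
        if ! check_alarm_active "Fuel Failed"; then
            echo "ceased"
            return 0
        fi
        sleep 30                           # re-check every 30 seconds
        elapsed=$((elapsed + 30))
    done
    echo "still-active"
    return 1
}
```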
4. Check whether the compute, where the failed node is running as a virtual machine (VM), is available. Access the compute by SSH:

   ssh <user_id>@<compute_host_where_node_is_running>

   If the personal user ID does not work, use the ceeadm user ID.

   The following scenarios are possible:

5. Check for an active Compute Host Failed alarm for the compute node where the failed vFuel is running.

   - If there is an active Compute Host Failed alarm, continue with Step 6.
   - If there is no active Compute Host Failed alarm, continue with Step 7.

6. Solve the Compute Host Failed alarm following the instructions in Compute Host Failed. If the server needs to be changed to solve the Compute Host Failed alarm, start up the passive Fuel VM as described in section Failover to the Cold Standby Fuel VM (Recover from Failure) in Fuel Synchronization.

   The following scenarios are possible:
   - If the Fuel Failed alarm has ceased, exit this procedure.
   - If the compute can be accessed by SSH, proceed to Step 8.
   - If the compute cannot be accessed, collect troubleshooting data as described in the Data Collection Guideline and contact the next level of maintenance support.
7. If there is no active Compute Host Failed alarm, reboot the failed compute node manually by using the corresponding out-of-band management.

   The following scenarios are possible:
   - If the compute can be accessed by SSH after the reboot, proceed to Step 8.
   - If the compute cannot be accessed, collect troubleshooting data as described in the Data Collection Guideline and contact the next level of maintenance support.
8. Check the state of the VM on the compute with the following command:

   virsh list --all

   The name of the VM must be fuel_master.

   Two scenarios are possible:
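The check above can be scripted by parsing the `virsh list --all` output. This is a sketch; the column layout (Id, Name, State) matches current virsh output but may vary between libvirt versions, and the sample in the test is illustrative:

```shell
# Sketch: extract the state of a named VM from `virsh list --all` output
# read on stdin. The state may be more than one word (e.g. "shut off").
vm_state() {
    awk -v vm="$1" '$2 == vm { s = $3; for (i = 4; i <= NF; i++) s = s " " $i; print s }'
}
```

For example, `virsh list --all | vm_state fuel_master` would print `shut off` for a stopped VM; in that case, start it as described in the next step.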
9. Start the VM with the following command:

   virsh start <name_of_the_vm_displayed_by_virsh_list>

   Two scenarios are possible:

10. Perform a manual failover to the cold standby Fuel VM. For more information, refer to section Failover to the Cold Standby Fuel VM (Recover from Failure) in Fuel Synchronization.

    Two scenarios are possible:
    - If the alarm has ceased, exit this procedure.
    - If the alarm has not ceased, collect troubleshooting data as described in the Data Collection Guideline and contact the next level of maintenance support.
11. Check the networking of the failed VM by checking the traffic through the virtual switch. Start pinging the failed node from a vCIC that is still alive. While sending ping requests, run the following command multiple times on the compute where the failed node is running, and check whether the RX/TX counters are increasing:

    ovs-appctl dpctl/show -s system@ovs-system | grep vfm_eth0 -A4

    - If the counters are increasing, collect troubleshooting data as described in the Data Collection Guideline and contact the next level of maintenance support.
    - If the counters are not increasing, proceed to Step 12.
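Deciding whether the counters are increasing can be scripted by sampling the statistics twice and comparing. This is a sketch; the `RX packets:N` format is an assumption and may differ between Open vSwitch versions, so adjust the pattern to your actual output:

```shell
# Sketch: parse an RX packet count out of interface statistics read on stdin,
# then compare two samples taken a few seconds apart. The "RX packets:N"
# format is an assumption about the statistics output.
rx_packets() {
    sed -n 's/.*RX packets:\([0-9][0-9]*\).*/\1/p' | head -n 1
}
counters_increasing() {
    # $1 = earlier sample, $2 = later sample
    [ "$2" -gt "$1" ]
}
```

For example, capture the statistics output through `rx_packets` twice with a short sleep in between, and pass both values to `counters_increasing`.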
12. Restart the virtual switch on the compute by issuing the following command:

    service openvswitch-switch restart

    Two scenarios are possible:
    - If the alarm has ceased, exit this procedure.
    - If the alarm has not ceased, collect troubleshooting data as described in the Data Collection Guideline and contact the next level of maintenance support.

13. The job is completed.
3 Additional Information
The alarm ceases when the vFuel node passes the availability test again.
