Maintenance Activities due to Faulty Blade
Ericsson Service-Aware Policy Controller

Contents

1     Maintenance Activities due to Faulty Blade Introduction
1.1   Prerequisites

2     Maintenance Activities due to Faulty Blade Procedure
2.1   System Controller Blades
2.2   Payload Blades

1   Maintenance Activities due to Faulty Blade Introduction

This instruction describes the steps to replace a blade after a faulty blade has been detected in a blade system.

1.1   Prerequisites

This section lists the prerequisites that must be fulfilled before using the procedure.

Conditions

The following conditions must apply:

2   Maintenance Activities due to Faulty Blade Procedure

There are two different scenarios depending on the blade to replace.

2.1   System Controller Blades

System Controller blades are the only blades that are virtualized.

2.1.1   Lock CBA Node

  1. Lock the node.

    SC-x:~ # cmw-node-lock SC-x

For further information, see the SAPC Troubleshooting Guide.

2.1.2   Stop DHCP Services

  1. Stop the DHCP service on both SCs.

    SC-1:~ # systemctl stop dhcpd.service

  2. Repeat on the other SC.

    SC-2:~ # systemctl stop dhcpd.service

The SAPC cluster is now ready to proceed with the blade replacement.
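Before moving on to the hardware replacement, it can be worth confirming that the service really is stopped on each SC. The following is a minimal sketch, not part of the official procedure; the service name is taken from the commands above, and on a live SC the state would be read with `systemctl is-active dhcpd.service`.

```shell
# Sketch: confirm dhcpd is stopped on an SC before proceeding.
# On a live SC the state would come from:
#   state=$(systemctl is-active dhcpd.service)
dhcp_stopped() {
  # treat any state other than "active" as stopped
  [ "$1" != "active" ]
}

dhcp_stopped "inactive" && echo "dhcpd is stopped"
```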

2.1.3   Blade Hardware Replacement

  1. Shut down the problematic blade, if it is still running despite the malfunction.
  2. Disconnect all interfaces and power switches from the blade. Remove the blade from the blade system.
  3. Insert a new blade replacing the one removed.
  4. Connect all interfaces and the power switch on the new blade. At this point, the blade is accessible through the ILOM interface.

2.1.4   Host Operating System Installation and Configuration

Follow the SAPC PNF Deployment Instruction to install the SLES12 Operating System and the required updates. Once the updates have been applied, copy the files from the other System Controller. In the following example, SC-1 is the faulty blade.

  1. Access the SC-2 host machine. From there, check that Host_1 is reachable so that the files can be copied.

    InstallationServer:# ssh root@Host_2

    Host_2:# ssh root@Host_1

    Host_1:# exit

  2. Copy the files. If the destination directories do not exist, create them first.

    Host_2:# scp /mnt/images/adapt_cluster.cfg root@Host_1:/mnt/images/

    Host_2:# scp /mnt/images/adapt_cluster.iso root@Host_1:/mnt/images/

    Host_2:# scp /mnt/images/reboot.img root@Host_1:/mnt/images/

    Host_2:# scp -r /mnt/store/SAPC/host-config/ root@Host_1:/mnt/store/SAPC/

  3. Define and boot the Virtual Machine.

    Host_2:# ssh root@Host_1

    Host_1:# virsh define /mnt/store/SAPC/host-config/VM/vms/sc01.xml

    Host_1:# qemu-img create -f qcow2 /mnt/images/originalImage/sapc_sc-1_cxp9030138.qcow2 100G

    Host_1:# cat /mnt/store/SAPC/host-config/VM/vms/sc01.xml | grep "<name>"

    <name>SC-1.Host_1</name>

    Host_1:# virsh start SC-1.Host_1 --console

  4. Wait for SC-1 to synchronize.

    Host_1:# ssh root@192.168.100.126

    SC-1:# drbd-overview

    The output must contain the line Connected Primary/Secondary UpToDate/UpToDate, as in the following example:

    0:drbd0/0 Connected Primary/Secondary UpToDate/UpToDate C r----- lvm-pv: lde-cluster-vg 41.87g 23.09g
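The wait in the last step can be scripted rather than checked by hand. The following is a sketch only, assuming the drbd-overview output format shown in the example above; the status command is passed in as a parameter so the helper can be adapted.

```shell
# Sketch: poll a DRBD status command until it reports the fully
# synchronized state shown in the example above, or give up.
wait_for_sync() {
  # $1: command that prints the DRBD status (e.g. drbd-overview)
  attempts=0
  while [ "$attempts" -lt 60 ]; do
    if "$1" | grep -q 'Connected Primary/Secondary UpToDate/UpToDate'; then
      echo "DRBD synchronized"
      return 0
    fi
    attempts=$((attempts + 1))
    sleep 10
  done
  echo "Timed out waiting for DRBD synchronization" >&2
  return 1
}
```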

2.1.5   Start DHCP Service

  1. Start the DHCP service on both SCs.

    SC-1:~ # systemctl start dhcpd.service

  2. Repeat on the other SC.

    SC-2:~ # systemctl start dhcpd.service

2.1.6   Unlock CBA Node

  1. Unlock the node.

    SC-x:~ # cmw-node-unlock SC-x

For further information, see the SAPC Troubleshooting Guide.

2.2   Payload Blades

2.2.1   Stop SAPC Components

If the faulty blade is powered off, skip this task. If it is running, stop all SAPC processes on it.

  1. Check the status of the SAPC processes on the blade.

    SC-x:~ # sapcPcrfProc status PL-x

  2. If the payload is running, execute the following command.

    SC-x:~ # sapcPcrfProc stop PL-x

    PL-x is the blade that is going to be replaced.
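The check-then-stop logic above can be wrapped in a small helper. This is a sketch only: the exact output of sapcPcrfProc status is not reproduced in this instruction, so matching the word "running" in it is an assumption that must be adapted to the real output.

```shell
# Sketch: stop the payload processes only when they are running.
# $1: status command, $2: stop command (stand-ins for
# "sapcPcrfProc status PL-x" and "sapcPcrfProc stop PL-x").
# Matching "running" in the status output is an assumption.
stop_if_running() {
  if "$1" | grep -qi 'running'; then
    "$2"
  else
    echo "processes already stopped"
  fi
}
```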

2.2.2   Lock CBA Node

  1. Lock the node.

    SC-x:~ # cmw-node-lock PL-x

For further information, see the SAPC Troubleshooting Guide.

2.2.3   Blade Hardware Replacement

  1. Shut down the problematic blade, if it is still running despite the malfunction.
  2. Disconnect all interfaces and power switches from the blade. Remove the blade from the blade system.
  3. Insert a new blade replacing the one removed.
  4. Connect all interfaces, but do not power on the blade. At this point, the blade is accessible through the ILOM interface.

2.2.4   Prepare The Blade Before Power On

Attention!

This task differs depending on the payload number.

PL-3 and PL-4

  1. PL-3 and PL-4 are fixed traffic processors, so the MAC addresses of the new blade must be added to the cluster.conf file. To obtain the MAC addresses, create the PL_interfaces file as described in the SAPC PNF Deployment Instruction. Use the values from that file to edit the /cluster/etc/cluster.conf file, then reload the configuration.

    SC-1:# vi /cluster/etc/cluster.conf

    # PL-x 
    interface x eth0 ethernet 74:c9:9a:4f:65:44 
    interface x eth1 ethernet 74:c9:9a:4f:65:45 
    interface x eth2 ethernet 74:c9:9a:4f:65:40 
    interface x eth3 ethernet 74:c9:9a:4f:65:41

    SC-1:# cluster config -r -a
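Before reloading the configuration, the edited interface lines can be sanity-checked so that a mistyped address is caught early. A minimal sketch, assuming the MAC address format shown in the example above:

```shell
# Sketch: check that a MAC address string is well formed before it is
# written into /cluster/etc/cluster.conf (format as in the example above).
valid_mac() {
  echo "$1" | grep -Eq '^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$'
}

valid_mac '74:c9:9a:4f:65:44' && echo "MAC ok"
```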

PL-5 Onwards

  1. Scale in the payload because it was scaled out during the deployment of the SAPC.

    SC-1:# sapcScaleIn <PL-X>

2.2.5   Power On The Blade

Now it is time to power on the blade.

  1. Follow the SAPC PNF Scale Out procedure.

2.2.6   Unlock CBA Node

  1. Unlock the node.

    SC-x:~ # cmw-node-unlock PL-x

For further information, see the SAPC Troubleshooting Guide.