Server Platform, Blade Replacement

Contents

1      Introduction
1.1    Description
1.2    Revision Information
1.3    Typographic Conventions

2      Node Hardware Description

3      Replacing a Blade
3.1    Identifying the Faulty Blade
3.2    Identifying Blade Rack and Subrack Position
3.3    Preparing the Blade Replacement
3.4    Replacing GEP Boards
3.5    CUDB Node Configuration Changes
3.6    System Controller Replacement Steps
3.7    DSG and PLDB Replacement Steps
3.8    Finalizing Replacement
3.9    Changing the Boot Device Order
3.10   Replacing Multiple Blades in Parallel

Glossary

Reference List

1   Introduction

This document describes how to replace a blade in an Ericsson Centralized User Data Base (CUDB) node deployed on native BSP 8100.

1.1   Description

This Operating Instruction (OPI) describes how to replace a blade in a CUDB node. A blade replacement is performed either because of a blade fault or because of a hardware upgrade.

1.2   Revision Information


Rev. A
Rev. B
Rev. C
Rev. D
Rev. E
Rev. F
Rev. G

Editorial changes only.

1.3   Typographic Conventions

Typographic conventions can be found in the following document:

2   Node Hardware Description

Before replacing any blades, make sure to check the hardware description of the node. Refer to CUDB Node Hardware Description, Reference [1] for more information.

3   Replacing a Blade

This section describes how to identify a faulty blade, how to perform a blade replacement in a CUDB node, and how to prepare the replacement blade for operation.

3.1   Identifying the Faulty Blade

Perform the following steps to identify a faulty blade in a CUDB node:

  1. Establish an SSH session towards the target CUDB node with the following command:

    ssh root@<CUDB_Node_OAM_VIP_Address>

    This session is established to the first or second System Controller (SC), that is either to SC_2_1 or SC_2_2.

    Refer to CUDB Users and Passwords, Reference [4], for more information on the default root password.

Warning!

If the failing blade is a master Data Store Unit Group (DSG) replica, the procedure below results in a cluster mastership change. The mastership change can cause traffic loss, or even data loss; therefore, the blade replacement must be performed in low traffic periods. Make sure that the provisioning traffic is stopped before starting the blade replacement.

  2. Several methods are available to identify an SC, a DSG, or a PLDB in the /cluster/etc/cluster.conf file. An example is provided below.

    For instance, below is a defined node containing two SCs and eight payload blades:

    node 1 control SC_2_1
    node 2 control SC_2_2
    node 3 payload PL_2_3
    node 4 payload PL_2_4
    node 5 payload PL_2_5
    node 6 payload PL_2_6
    node 7 payload PL_2_7
    node 8 payload PL_2_8
    node 9 payload PL_2_9
    node 10 payload PL_2_10
    
    
    host all 10.22.0.1 OAM1
    host all 10.22.0.2 OAM2
    host all 10.22.0.3 PL0
    host all 10.22.0.4 PL1
    host all 10.22.0.5 DS1_0
    host all 10.22.0.6 DS1_1
    host all 10.22.0.7 DS2_0
    host all 10.22.0.8 DS2_1
    ..............
    

    The below list provides some example scenarios for various failing blades:

    • In case the failing blade is Blade 7, with the IP of 10.22.0.7 (note the last octet), then pay attention to the following two lines of the cluster.conf file:

      node 7 payload PL_2_7
      host all 10.22.0.7 DS2_0
      

      In the above lines, PL_2_7 is the name of the payload blade, the blade number is "7", while the identification number of the associated DS is "2" (DS2_0).

    • In case the failing blade is Blade 3 with the IP of 10.22.0.3 (note the last octet), then pay attention to the following two lines of the cluster.conf file:

      node 3 payload PL_2_3
      host all 10.22.0.3 PL0
      

      In the above lines, PL_2_3 is the name of the payload blade, "3" is the blade number, and the Processing Layer ID is PL0. If the blade in the cluster needs a reboot, use the following command:

      cluster reboot --node 3

    • In case the failing blade is Blade 1 with the IP of 10.22.0.1 (note the last octet), then pay attention to the following two lines of the cluster.conf file:

      node 1 control SC_2_1
      host all 10.22.0.1 OAM1
      

      In the above lines, SC_2_1 is the name of the blade, "1" is the blade number, and the SC ID is OAM1.
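The lookup described above can be sketched as a small helper. This is an illustration only, not a product command: the function name blade_lines and the assumption that the last octet of the host IP equals the blade number (as in the excerpt above) are illustrative.

```shell
# Sketch: print the node and host lines of cluster.conf for a blade number.
# Assumes the last octet of the host IP equals the blade number.
blade_lines() {
  blade=$1
  conf=${2:-/cluster/etc/cluster.conf}
  grep -E "^node ${blade} |^host all ([0-9]+\.){3}${blade} " "$conf"
}

# Example (assuming the excerpt above):
# blade_lines 7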

3.2   Identifying Blade Rack and Subrack Position

To identify the physical blade to be replaced, do the following:

  1. Establish a BSP CLI session:

    ssh advanced@<BSP-NBI-SCX> -p2024

  2. Enter the following command:

    show-table ManagedElement=1,DmxcFunction=1,Eqm=1,VirtualEquipment=cudb -m Blade -p bladeId,userLabel

The output should be similar to the following example:

=======================
| bladeId | userLabel |
=======================
| 0-1     | SC-1      |
| 0-11    | PL-6      |
| 0-13    | PL-7      |
| 0-15    | PL-8      |
| 0-17    | PL-9      |
| 0-19    | PL-10     |
| 0-21    | PL-11     |
| 0-23    | PL-12     |
| 0-3     | SC-2      |
| 0-5     | PL-3      |
| 0-7     | PL-4      |
| 0-9     | PL-5      |
| 1-1     | PL-13     |
| 1-11    | PL-18     |
| 1-13    | PL-19     |
| 1-15    | PL-20     |
| 1-17    | PL-21     |
| 1-19    | PL-22     |
| 1-21    | PL-23     |
| 1-23    | PL-24     |
| 1-3     | PL-14     |
| 1-5     | PL-15     |
| 1-7     | PL-16     |
| 1-9     | PL-17     |
| 2-1     | PL-25     |
| 2-11    | PL-30     |
| 2-13    | PL-31     |
| 2-15    | PL-32     |
| 2-17    | PL-33     |
| 2-19    | PL-34     |
| 2-21    | PL-35     |
| 2-23    | PL-36     |
| 2-3     | PL-26     |
| 2-5     | PL-27     |
| 2-7     | PL-28     |
| 2-9     | PL-29     |
=======================

Note:  
LDE and BSP 8100 naming conventions are slightly different: SC_2_1 on LDE level corresponds to SC-1 on BSP 8100, and so on.

The bladeId identifies the blade position in the rack: the first number is the subrack, and the second is the slot within the subrack. For example, PL-14 (bladeId 1-3) is in slot 3 of subrack 1.
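The bladeId format can be split with plain shell parameter expansion; a minimal sketch (the variable names are illustrative):

```shell
# Sketch: split a bladeId of the form <subrack>-<slot> into its parts.
bladeid="1-3"
subrack=${bladeid%-*}   # part before the dash -> subrack
slot=${bladeid#*-}      # part after the dash  -> slot
echo "bladeId $bladeid: subrack $subrack, slot $slot"
```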

3.3   Preparing the Blade Replacement

This section and the following subsections describe the preparations to perform before replacing a blade. The following steps apply to replacing every blade type (SC, DSG, PLDB).

3.3.1   Connecting to the CUDB Node

Establish an SSH session towards the target CUDB node with the following command:

ssh root@<CUDB_Node_OAM_VIP_Address>

This session is established to the first or second SC, that is either to SC_2_1 or SC_2_2.

Refer to CUDB Users and Passwords, Reference [4], for more information on the default root password.

3.3.2   Preparing Payload Blade (PLDB or DSG) Replacement

The following steps apply only to DSG or PLDB replacement. In case the replaced blade is an SC, skip to Section 3.3.3.

3.3.2.1   Preparing DSG Replacement

If the blade belongs to a DSG, execute a planned mastership change procedure when it hosts the DSG master. Perform the following steps to do so:

  1. Identify the DSG blade number in the /cluster/etc/cluster.conf LDE file.
  2. Read the dsGroupId attribute of the specific instance of the CudbLocalDs class. For more information, refer to the Class CudbLocalDs section of CUDB Node Configuration Data Model Description, Reference [5].

    Refer to the Object Model Modification Procedure in CUDB Node Configuration Data Model Description, Reference [5] for more information on how to check the attribute.

  3. Use the following command to check if the master replica of the DSG with the given dsGroupId attribute is hosted on the affected blade:

    cudbSystemStatus -R

  4. If the DSG blade is a master, then execute a planned mastership change procedure starting from Section 3.3.2.3.

3.3.2.2   Preparing PLDB Replacement

If the blade belongs to the PLDB, execute a planned mastership change procedure when it hosts the PLDB master. Perform the following steps to do so:

  1. Identify the PLDB blade number in the /cluster/etc/cluster.conf LDE file.
  2. Use the following command to check if the affected PLDB blade is a master or not:

    cudbSystemStatus -R

  3. If the PLDB blade is a master, then execute a planned mastership change procedure starting from Section 3.3.2.3.

3.3.2.3   Disabling Automatic Mastership Change

This section describes how to disable Automatic Mastership Change (AMC) if it is enabled.

First, check whether AMC is enabled. To do so, check the value of the enabled attribute of the CudbAutomaticMasterChange class. For more information, refer to the Class CudbAutomaticMasterChange section of CUDB Node Configuration Data Model Description, Reference [5].

If AMC is enabled (that is, the value of the enabled attribute is true), then disable it on all CUDB nodes by setting the value of the enabled attribute to false.

Refer to the Object Model Modification Procedure in CUDB Node Configuration Data Model Description, Reference [5] for more information on all the steps required to modify the object model (for example, on using the applyConfig administrative operation to activate the changes).

Refer to the Configuring Automatic Mastership Change section of CUDB System Administrator Guide, Reference [6] for more information.

3.3.2.4   Mastership Change

Follow the below steps to perform a mastership change:

  1. Log in to the SC of the node where the master replica is to be hosted with the following command:

    ssh root@<CUDB_Node_OAM_VIP_Address>

    Refer to CUDB Users and Passwords, Reference [4], for more information on the default root password.

  2. Execute one of the below commands depending on the type of the mastership change:
    1. In case of a DSG mastership change, execute the following command:

      cudbDsgMastershipChange -d <DSG_number>

    2. In case of PLDB mastership change, execute the following command:

      cudbDsgMastershipChange --pl

Refer to CUDB Node Commands and Parameters, Reference [2] for more information on the cudbDsgMastershipChange command, and to CUDB System Administrator Guide, Reference [3] for more information on the manual mastership change procedure.

3.3.2.5   Checking Mastership Status

When finished, check that the system has executed the planned mastership change without faults. Use the following command to do so:

cudbSystemStatus -R

Note:  
If the replication status is not correct, stop the procedure, and contact the next level of maintenance support.

3.3.3   Finishing Blade Replacement Preparation

Perform the following steps to finish blade replacement preparations.

Note:  
In the below commands, <name> and <blade> are used to identify blades, where:
  • <blade> is a numeric identifier, for example in SC_2_1 <blade> is 1, in PL_2_3 <blade> is 3.
  • <name> is the controller name (SC_2_<blade>) or the payload blade name (PL_2_<blade>).

  1. Lock the blade at SAF level with the following command:

    cmw-node-lock <name>

  2. Check if the specific blade is locked at SAF level with the following command:

    cmw-status -v node

    The output must be similar to the following example:

    safAmfNode=PL-7,safAmfCluster=myAmfCluster

    AdminState=LOCKED-INSTANTIATION(3)

    OperState=ENABLED(1)

  3. Make a backup of the rpm.conf file as follows:
    1. Make a copy of the file, and rename it to rpm.conf_FULL with the following command:

      cp /cluster/nodes/<blade>/etc/rpm.conf /cluster/nodes/<blade>/etc/rpm.conf_FULL

      The rpm.conf_FULL file now contains all entries of the original rpm.conf file.

    2. Overwrite the contents of the original rpm.conf file, so that it contains only the linux, cudbKernelTuning, and ldews-control or ldews-payload entries (depending on the type of blade to replace). Use the following command to do so:

      grep -ia 'ldews\|linux\|cudbKernelTuning' /cluster/nodes/<blade>/etc/rpm.conf_FULL > /cluster/nodes/<blade>/etc/rpm.conf

  4. In case an SC is being replaced, create symbolic links for the installation hooks with the following commands:

    cd /cluster/hooks/<blade>

    ln -s /cluster/hooks/pre-installation.tar.gz pre-installation.tar.gz

    ln -s /cluster/hooks/post-installation.tar.gz post-installation.tar.gz

  5. Replace the blade as described in Section 3.4.
    Note:  
    When replacing an SC, the alarm SAF, LOTC Disk Replication Consistency Failed might appear. If the physical replacement takes more than 20 minutes, the alarm SAF, LOTC Disk Replication Communication Failed might also appear. These alarms are expected during the blade replacement procedure on an SC and should clear automatically when all replacement steps have been executed. For more information, follow the corresponding alarm OPI.

3.4   Replacing GEP Boards

Refer to the Manage Blade document in the BSP 8100 CPI for detailed information on the procedure needed to physically replace a blade.

3.5   CUDB Node Configuration Changes

This section describes the configuration changes to perform in a CUDB node in case blade replacement is needed.

3.5.1   Obtaining MAC Addresses for the New Blade

The MAC addresses are used as input to create the cluster.conf file, which is used by LDE. The MAC addresses are also needed to configure the Jumpstart server before installing LDE on the SCs, as well as for the blade replacement procedure.

The MAC addresses are fetched through the BSP CLI. The fetched MAC is the base MAC, used to derive the MAC addresses necessary to complete the cluster.conf file generation.

To obtain the MAC addresses, do the following:

  1. Establish a BSP CLI session:

    ssh advanced@<BSP-NBI-SCX> -p2024

  2. Execute the following command to show the MAC addresses:

    show-table ManagedElement=1,DmxcFunction=1,Eqm=1,VirtualEquipment=cudb -m Blade -p bladeId,firstMacAddr

    The output should be similar to the following example:

    ===============================
    | bladeId | firstMacAddr      |
    ===============================
    | 0-1     | 90:55:AE:3A:CB:1D |
    | 0-11    | 90:55:AE:3A:CA:5D |
    | 0-13    | 90:55:AE:3A:C9:CD |
    | 0-15    | 90:55:AE:3A:CA:75 |
    | 0-17    | 90:55:AE:3A:C9:9D |
    | 0-19    | 90:55:AE:3A:CA:ED |
    | 0-21    | 90:55:AE:3A:CB:AD |
    | 0-23    | 90:55:AE:3A:CD:5D |
    | 0-3     | 90:55:AE:3A:C9:55 |
    | 0-5     | 90:55:AE:3A:CA:15 |
    | 0-7     | 90:55:AE:3A:C9:FD |
    | 0-9     | 90:55:AE:3A:C9:25 |
    | 1-1     | 90:55:AE:3A:B0:7D |
    | 1-11    | 90:55:AE:3A:BF:C5 |
    | 1-3     | 90:55:AE:3A:C1:45 |
    | 1-5     | 90:55:AE:3A:BF:05 |
    | 1-7     | 90:55:AE:3A:BF:1D |
    | 1-9     | 90:55:AE:3A:BF:35 |
    ===============================

The MAC shown for each shelf slot is the base MAC. All the MACs can be obtained by adding a number to the <base mac>, in accordance with the following tables. Table 1 applies to BSP 8100 (GEP3) boards, while Table 2 applies to BSP 8100 (GEP5) boards.

Table 1    MAC Address Relation to GEP3 Boards

======================================================================
| Address        | Resulting MAC(1) | Port                           |
======================================================================
| <BASE MAC> + 1 | eth3             | Left SCX Backplane Port        |
| <BASE MAC> + 2 | eth4             | Right SCX Backplane Port       |
| <BASE MAC> + 3 | eth2             | ETH-Debug Front Port           |
| <BASE MAC> + 5 | eth0             | ETH-0 Front Port               |
| <BASE MAC> + 6 | eth1             | ETH-1 Front Port               |
| <BASE MAC> + 8 | eth5             | Left SCX 10GbE Backplane Port  |
| <BASE MAC> + 9 | eth6             | Right SCX 10GbE Backplane Port |
======================================================================

(1)  The resulting MAC must be in hexadecimal format.


Table 2    MAC Address Relation to GEP5 Boards

======================================================================
| Address        | Resulting MAC(1) | Port                           |
======================================================================
| <BASE MAC> + 1 | eth3             | Left SCX 1GbE Backplane Port   |
| <BASE MAC> + 2 | eth4             | Right SCX 1GbE Backplane Port  |
| <BASE MAC> + 3 | eth2             | ETH-Debug Front Port           |
| <BASE MAC> + 5 | eth5             | Left SCX 10GbE Backplane Port  |
| <BASE MAC> + 6 | eth6             | Right SCX 10GbE Backplane Port |
| <BASE MAC> + 8 | eth0             | ETH-0 Front Port               |
| <BASE MAC> + 9 | eth1             | ETH-1 Front Port               |
======================================================================

(1)  The resulting MAC must be in hexadecimal format.


Note:  
Ports ETH-0 and ETH-1 are enabled only during the initial SW installation phase from the Jumpstart server. After the LDE is installed on the blade, they remain disabled and cannot be used.
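The offset arithmetic in Table 1 and Table 2 can be sketched in shell. This is an illustration only; the helper name mac_add is an assumption, not a product tool. It treats the whole MAC as one hexadecimal number, so carries propagate across octets.

```shell
# Sketch: add a Table 1/Table 2 offset to a base MAC, carrying across octets.
mac_add() {
  base=$1 offset=$2
  dec=$(( 0x$(printf '%s' "$base" | tr -d ':') + offset ))
  printf '%012X\n' "$dec" | sed 's/../&:/g;s/:$//'
}

mac_add 90:55:AE:3A:CB:1D 1   # -> 90:55:AE:3A:CB:1E
```

Note the carry handling: for example, base 90:55:AE:3A:C9:FD plus offset 5 yields 90:55:AE:3A:CA:02.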

3.5.2   Obtaining Board Revision for the New Blade

In case of SC replacement in BSP 8100 systems with GEP3 hardware, perform the procedure below to check the product revision of the new blade:

  1. Establish a BSP NBI CLI session:

    ssh advanced@<BSP-NBI-SCX> -p2024 -t -s cli

  2. Execute the following command to show the blade hardware revisions:

    show ManagedElement=1,SystemFunctions=1,HwInventory=1 -m HwItem -p productIdentity

    The expected output should be similar to the following example:

    productIdentity="ROJ 208 840/3"
    productDesignation="GEP3-HD300"
    productRevision="R4B"
    

3.5.3   Editing the LDE installation.conf File

In case of SC replacement in BSP 8100 systems with GEP3 hardware, perform the procedure below to edit the installation.conf file.

  1. Establish an SSH session towards the target CUDB node with the following command:

    ssh root@<CUDB_Node_OAM_VIP_Address>

    This session is established to the first or second SC, either to SC_2_1 or SC_2_2.

    Refer to CUDB Users and Passwords, Reference [4], for more information on the default root password.

  2. Locate the installation.conf file in the following directory:

    /cluster/etc/installation.conf

  3. Edit the file and set the disk_device_path parameter value, depending on the hardware revision of the new blade obtained in Section 3.5.2:
    • If it is lower than R9A, use the following value:

      disk_device_path=/dev/sdb

    • If it is R9A or higher, use the following value:

      disk_device_path=/dev/sda
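The revision check above can be sketched as follows. This is an illustration only, assuming productRevision strings of the form R<number><letter> (such as "R4B" in Section 3.5.2); the parsing is not a product rule beyond what this section states.

```shell
# Sketch: choose disk_device_path from a GEP3 productRevision string.
rev="R4B"
num=$(printf '%s' "$rev" | sed 's/^R\([0-9][0-9]*\).*/\1/')
if [ "$num" -lt 9 ]; then
  echo "disk_device_path=/dev/sdb"   # revision lower than R9A
else
  echo "disk_device_path=/dev/sda"   # revision R9A or higher
fi
```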

3.5.4   Editing the LDE cluster.conf File

Perform the following steps to edit the cluster.conf file.

  1. Establish an SSH session towards the target CUDB node with the following command:

    ssh root@<CUDB_Node_OAM_VIP_Address>

    This session is established to the first or second SC, that is either to SC_2_1 or SC_2_2.

    Refer to CUDB Users and Passwords, Reference [4], for more information on the default root password.

  2. Locate the cluster.conf file in the following directory:

    /cluster/etc/cluster.conf

  3. Open the file, and replace the old MAC addresses with the new ones. Use Table 1 or Table 2 in Section 3.5.1 to calculate the actual MAC addresses.

    An example of the LDE cluster.conf file is provided below. The interface number corresponds to the blade number: for example, if payload blade PL_2_5 is replaced, then the MAC addresses of interface 5 need adaptation.

Example 1   cluster.conf File Example

# # Example /cluster/etc/cluster.conf
########################
#  
#  Interface definition
#  

interface 1 eth3 ethernet 90:55:ae:3a:b0:7e
interface 1 eth4 ethernet 90:55:ae:3a:b0:7f
interface 1 eth5 ethernet 90:55:ae:3a:b0:82
interface 1 eth6 ethernet 90:55:ae:3a:b0:83

interface 2 eth3 ethernet 90:55:ae:3a:c1:46
interface 2 eth4 ethernet 90:55:ae:3a:c1:47
interface 2 eth5 ethernet 90:55:ae:3a:c1:4a
interface 2 eth6 ethernet 90:55:ae:3a:c1:4b

interface 3 eth3 ethernet 90:55:ae:3a:bf:06
interface 3 eth4 ethernet 90:55:ae:3a:bf:07
interface 3 eth5 ethernet 90:55:ae:3a:bf:0a
interface 3 eth6 ethernet 90:55:ae:3a:bf:0b

interface 4 eth3 ethernet 90:55:ae:3a:c9:fe
interface 4 eth4 ethernet 90:55:ae:3a:c9:ff
interface 4 eth5 ethernet 90:55:ae:3a:ca:02
interface 4 eth6 ethernet 90:55:ae:3a:ca:03

interface 5 eth3 ethernet 90:55:ae:3a:c9:26
interface 5 eth4 ethernet 90:55:ae:3a:c9:27
interface 5 eth5 ethernet 90:55:ae:3a:c9:2a
interface 5 eth6 ethernet 90:55:ae:3a:c9:2b

  4. Verify the syntax of the cluster.conf file with the following command:

    cluster config -v

    In case of any error message, check the command output and correct syntax mistakes. Warning messages can be ignored.

  5. Reload the configuration with the following command:

    cluster config --reload --all

    Note:  
    The command fails for the blade currently being replaced (Node X (<name>) not responding, skipped); this is the expected behavior. Continue with the next step.

  6. The new blade or blades start booting from the network.
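The MAC adaptation described above can be sketched for one interface block. This is an illustration only: the helper name gen_interface_lines is an assumption, and the offsets eth3=+1, eth4=+2, eth5=+5, eth6=+6 are the GEP5 offsets of Table 2, which match the addresses shown in Example 1.

```shell
# Sketch: print the four cluster.conf interface lines for a replaced blade,
# derived from its base MAC (GEP5 offsets assumed: eth3=+1 eth4=+2 eth5=+5 eth6=+6).
gen_interface_lines() {
  iface=$1
  dec=$(( 0x$(printf '%s' "$2" | tr -d ':') ))
  for pair in eth3:1 eth4:2 eth5:5 eth6:6; do
    name=${pair%:*}
    off=${pair#*:}
    mac=$(printf '%012x\n' $((dec + off)) | sed 's/../&:/g;s/:$//')
    echo "interface $iface $name ethernet $mac"
  done
}

gen_interface_lines 1 90:55:AE:3A:B0:7D
```

For base MAC 90:55:AE:3A:B0:7D (blade 1-1 in Section 3.5.1), the output reproduces the interface 1 lines of Example 1.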

3.6   System Controller Replacement Steps

This section describes the procedure to finalize the SC blade replacement.

The new blade is set to boot from the network by default. The following procedure describes how to set it to boot from the hard disk.

During this procedure, the new SC also synchronizes its replicated storage disk partition with another SC. This process can take up to one hour, depending on storage disk partition size and available network bandwidth. Use the following command on another SC to check the synchronization status:

cat /proc/drbd

Perform the following steps to finalize the SC replacement:

  1. Restore the original rpm.conf file with the following command:

    cp /cluster/nodes/<blade>/etc/rpm.conf_FULL /cluster/nodes/<blade>/etc/rpm.conf

    In the above command, <blade> must be replaced with the blade number. For example, in case of SC_2_2, the blade number is 2.

  2. Set the new SC blade to boot from hard disk. See Section 3.9 for details.
  3. Reboot the new SC from the console interface with the following command:

    reboot

3.7   DSG and PLDB Replacement Steps

If the blade to replace is a DSG or PLDB, then perform the following steps:

  1. Log in to one of the SCs, and execute the following commands:

    ssh root@<CUDB_Node_OAM_VIP_Address>

    Refer to CUDB Users and Passwords, Reference [4], for more information on the default root password.

    cd /opt/ericsson/cudb/OAM/support/bin/

    ./cudbPartTool rebuild -n <blade>

    In the above command, <blade> must be replaced with the blade number. For example, in case of PL_2_5, <blade> is 5.

  2. Check if the partition is created with the following command:

    ./cudbPartTool check -n <blade>

    In the above command, <blade> must be replaced with the blade number. For example, in case of PL_2_5, <blade> is 5.

    The output must be similar to the below example:

    CUDB_82 SC_2_2# ./cudbPartTool check -n 5
    
    CUDB partitioning tool
    
    -= Cluster filesystem analysis =-
    
    Payload PL_2_5 report:
     WARNING: local storages not mounted.
    
    
    Done.
    

  3. Restore the original rpm.conf file with the following command:

    cp /cluster/nodes/<blade>/etc/rpm.conf_FULL /cluster/nodes/<blade>/etc/rpm.conf

    In the above command, <blade> must be replaced with the blade number. For example, in case of PL_2_5, <blade> is 5.

3.8   Finalizing Replacement

Perform the following steps to finish blade replacement. The following steps apply to replacing every blade type (SC, DSG, and PLDB).

Note:  
In case of SC replacement, crontab jobs and their definitions, or similar tasks that are not deployed by default in CUDB, are lost. If necessary, redeploy them after the procedure is completed.

  1. Unlock the blade at SAF level with the following command:

    cmw-node-unlock <name>

    <name> is the name of the replaced blade, for example PL_2_5.

  2. Reboot the newly installed blade with the following command:

    cluster reboot -n <blade>

    <blade> is the number of the replaced blade, for example 5 for PL_2_5.

  3. Wait until the blade has rebooted and joined the cluster. Use the following command to list the joined blades, and to check the operational states of the SUs:

    cmw-status -v node

    The expected output must be similar to the below example:

    safAmfNode=PL-7,safAmfCluster=myAmfCluster

    AdminState=UNLOCKED(1)

    OperState=ENABLED(1)

  4. Wait until all the processes are started on the blade, and check with the cudbSystemStatus command whether the system has recovered without faults. In case of a DSG blade, errors related to the DS database can be ignored, because the data is restored later.
    Note:  
    If the status is not correct, stop the procedure, and contact the next level of maintenance support.

  5. Exit the SSH session with the exit command.
  6. Depending on the blade type, do the following:
    • If the replaced blade is an SC, the procedure is finished.
    • If the replaced blade is a DSG blade, to back up and restore a DSG replica, perform the steps described in the Performing Combined Unit Data Backup and Restore section of CUDB Backup and Restore Procedures, Reference [7].
    • If the replaced blade is in the PLDB group, to back up and restore a PLDB replica, perform the steps described in the Performing Combined Unit Data Backup and Restore section of CUDB Backup and Restore Procedures, Reference [7].

      After the NDBs are started and the MySQL server connections are OK, execute the following command:

      cudbPrepareStore --pl

      Note:  
      After finishing the rebuild procedure, the stored procedures are not restored. Recreate them with the following command:
      cudbManageStore -p -o restorestoredprocedures


Warning!

A SW backup created before the blade replacement is not valid afterwards, because it contains an outdated cluster.conf file and the new blade cannot be reached. To create a new SW backup, follow the steps described in the Software and Configuration Backup section of CUDB Backup and Restore Procedures, Reference [7].

  7. In case the replaced blade is a PLDB or DSG blade, and AMC was manually disabled because of the steps performed in Section 3.3.2.3, or because of some other emergency recovery procedure, then re-enable AMC by setting the value of the enabled attribute of the CudbAutomaticMasterChange class to true. For more information, refer to the Class CudbAutomaticMasterChange section of CUDB Node Configuration Data Model Description, Reference [5].

    Refer to the Object Model Modification Procedure in CUDB Node Configuration Data Model Description, Reference [5] for more information on all the steps required to modify the object model (for example, on using the applyConfig administrative operation to activate the changes).

Refer to the Configuring Automatic Mastership Change section of CUDB System Administrator Guide, Reference [6] for more information.

3.9   Changing the Boot Device Order

The procedure required to change boot device order on BSP 8100 (GEP5) or BSP 8100 (GEP3) hardware is described in the Boot Device Setting section of the Manage Blade document in the BSP 8100 CPI (Ericsson BSP R9 or above).

Note:  
Refer to the Set GEP3 Disk Processor Boot Device or Set GEP5 Disk Processor Boot Device documents in the BSP 8100 CPI if an older version of the Ericsson BSP product is used.

The default boot order configuration for new blades in BSP 8100 systems with GEP3 or GEP5 hardware is to boot from network:

boardConfiguration="0.1.255.255.255.255.255.255.255.255.255.255.255.255.255.255.4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0"

The target boot order configuration for PL blades in BSP 8100 systems with GEP3 or GEP5 hardware, after blade replacement is finalized, is the same (boot from network).

The target boot order configuration for SC blades in BSP 8100 systems, after blade replacement is finalized, is to boot from hard disk:

GEP3

boardConfiguration="16.255.255.255.255.255.255.255.255.255.255.255.255.255.255.255.4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0"

GEP5

boardConfiguration="18.17.16.255.255.255.255.255.255.255.255.255.255.255.255.255.4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0"

3.10   Replacing Multiple Blades in Parallel

This section provides instructions needed to replace multiple blades in parallel on CUDB nodes.

3.10.1   Parallel Blade Replacement Procedure

Only blades belonging to the same group can be replaced in parallel. In the CUDB system, blades are grouped into three distinct groups: SC blades, PLDB blades, and DSG blades. These groups can be further divided into groups of even-numbered and odd-numbered blades, resulting in six distinct groups of blades in total:

  1. SC_2_2
  2. SC_2_1
  3. Odd-numbered PLDB blades
  4. Even-numbered PLDB blades
  5. Odd-numbered DSG blades
  6. Even-numbered DSG blades

Warning!

Do not replace blades in parallel if they belong to different blade groups. Replacing blades belonging to different groups at the same time can cause a major node outage.

Perform the following steps to replace multiple blades in parallel:

Note:  
To ensure that there is enough traffic handling capacity during the replacement, it is recommended not to replace more payload blades in parallel than the configured value of the redundancyLevel attribute of the CudbLdapAccess class.

If there are more blades to replace, do it iteratively: in each iteration, replace at most N blades from the same group in parallel, where N is the value of the redundancyLevel attribute. However, if the replacement is done in a low traffic period or in a maintenance window, when the degraded traffic handling capacity can still be sufficient, more than N blades can be replaced in parallel.


  1. Check the value of the redundancyLevel attribute of the CudbLdapAccess class and take special note of it. For more information, refer to the Class CudbLdapAccess section of CUDB Node Configuration Data Model Description, Reference [5].
  2. Identify all faulty blades inside the node, as described in Section 3.1, to be able to group them.
  3. Identify the position of all faulty blades, as described in Section 3.2.
  4. Prepare for the replacement of all faulty blades, as described in Section 3.3.
    Note:  
    Disabling AMC is a system-wide change that must be performed only once.

  5. In case of replacing SC or PLDB groups, force the external applications to move their primary connections to another CUDB node. This applies if primary connections are established and SC or PLDB blades are affected.
  6. Execute the blade replacement exactly in the following order, skipping any group which has no faulty blades:
    1. SC_2_2
    2. SC_2_1
    3. Odd-numbered PLDB blades
    4. Even-numbered PLDB blades
    5. Odd-numbered DSG blades
    6. Even-numbered DSG blades
    Warning!

    Always follow the order of groups exactly.

  7. If the number of blades replaced in parallel was greater than the value of the redundancyLevel attribute, also execute the cudbLdapFeRestart command. For more information, refer to the cudbLdapFeRestart section of CUDB Node Commands and Parameters, Reference [2].
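The iterative batching described in the note above can be sketched as follows. This is an illustration only; batch_blades is an assumed name, and the blade numbers in the example are hypothetical odd-numbered PLDB blades.

```shell
# Sketch: split the blades of one group into batches of at most N,
# where N is the configured redundancyLevel value.
batch_blades() {
  n=$1; shift
  i=0; batch=""
  for b in "$@"; do
    batch="$batch $b"
    i=$((i + 1))
    if [ "$i" -eq "$n" ]; then
      echo "replace in parallel:$batch"
      batch=""; i=0
    fi
  done
  if [ -n "$batch" ]; then
    echo "replace in parallel:$batch"
  fi
}

# Example: redundancyLevel 2, odd-numbered PLDB blades 3, 5, and 7
batch_blades 2 3 5 7
```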

Glossary

For the terms, definitions, acronyms and abbreviations used in this document, refer to CUDB Glossary of Terms and Acronyms, Reference [8].


Reference List

CUDB Documents
[1] CUDB Node Hardware Description.
[2] CUDB Node Commands and Parameters.
[3] CUDB System Administrator Guide.
[4] CUDB Users and Passwords, 3/00651-HDA 104 03/10.
[5] CUDB Node Configuration Data Model Description.
[6] CUDB System Administrator Guide.
[7] CUDB Backup and Restore Procedures.
[8] CUDB Glossary of Terms and Acronyms.


Copyright

© Ericsson AB 2016-2018. All rights reserved. No part of this document may be reproduced in any form without the written permission of the copyright owner.

Disclaimer

The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.

Trademark List
All trademarks mentioned herein are the property of their respective owners. These are shown in the document Trademark Information.
