1 Introduction
This document provides instructions on how to replace a server in the Cloud Execution Environment (CEE). Throughout this document, compute host hardware is referred to as a server.
1.1 Scope
This document describes how to replace a server in the CEE.
The document is applicable for the replacement of the following servers:
- Host not containing the CEE virtual Cloud Infrastructure Controller (vCIC) node
- Host containing the vCIC node
1.2 Prerequisites
This section provides information on the documents, tools, and conditions that apply to the procedure.
1.2.1 Documents
Before starting this procedure, ensure that the following documents, which are referred to and used in this procedure, have been read and understood:
- Depending on the hardware environment used, the relevant document from those referred to in Section 2.2. These documents contain further prerequisites.
- Data Collection Guideline
1.2.2 Tools
The following tools are needed:
- An Electrostatic Discharge (ESD) wrist strap (part number LYB 250 01/14)
- A computer with the ability to do a Secure Shell (SSH) logon to the vCIC
1.2.3 Data
A site-specific IP and VLAN plan is required.
The address variables from the IP and VLAN plan are used throughout this document and are summarized in the following table.
| VLAN | Variable Name | Factory Default IP Address Allocation |
|---|---|---|
| fuel_ctrl_sp | <vFuel (static)> | 192.168.0.11 |
Other site-specific data is listed in the following table:
| Resource | Variable Name | Additional Information |
|---|---|---|
| External vCIC IP address | <vcic_address> | |
| Personal user name to the vCIC | <personal-user> | |
| Password for the personal user name to the vCIC | | |
| Password for the root user on vFuel | | |
| Name of the host to be replaced | <hostname> | Host names are specified by the following scheme: compute-<shelf_number>-<blade_number> |
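The host name scheme above can be checked before the name is passed to any command. The following is a minimal sketch, assuming that the shelf and blade numbers consist of decimal digits only (the scheme does not state their format). The function name is_valid_hostname is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: checks that a name follows the scheme
# compute-<shelf_number>-<blade_number>.
# Assumption: both numbers consist of decimal digits only.
is_valid_hostname() {
    printf '%s\n' "$1" | grep -Eq '^compute-[0-9]+-[0-9]+$'
}

# Example: prints "ok" for a well-formed name.
if is_valid_hostname "compute-0-3"; then
    echo "ok"
fi
```

A check of this kind only catches typing mistakes. It does not confirm that the host exists in the CEE region.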
1.2.4 Conditions
Before starting this procedure, ensure that the following conditions are met:
- A work order for the replacement is received or the document is referred from another procedure.
- If a faulty board is to be replaced, a compute host failed alarm is active.
- The new server is available and it has been verified visually that it is undamaged.
- The IP addresses and credentials for SSH connections to the vCIC and vFuel are known. See also Section 1.2.3.
- The name of the host to be replaced is known. See also Section 1.2.3.
- There is no active Fuel Failed alarm.
- All keys to the site are available and site access is granted.
2 Procedure
This procedure describes how to replace a server.
The procedure contains the following activities:
- Preparations for server removal, see Section 2.1.
- Hardware replacement of the server and server BIOS configuration, see Section 2.2.
- Executing the installation command, see Section 2.3.
- Fuel synchronization, see Section 2.4.
- Concluding Routine, see Section 2.5.
Start the procedure with Section 2.1.
2.1 Preparations for Server Removal
This section describes how to prepare for changing a server.
- Inform the Operation and Maintenance Center (OMC) that work is in progress on the node with possible disturbance to the service.
- Check the current alarms to have a baseline for the same checks after the compute host has been changed.
- Log in to an active vCIC with the logon credentials given in the site documentation.
ssh <personal-user>@<vcic_address>
Example:
ssh <personal-user>@10.0.22.10
- If prompted, provide the user password.
- Log on to vFuel by using SSH:
ssh root@<vFuel (static)>
Example:
ssh root@192.168.0.11
- If prompted, provide the user password.
- Remove the node by issuing the following command on the vFuel node:
removeceenode --name <hostname>
If the removal fails, stop the process and consult the next level of support. Further actions are outside the scope of this instruction.
- Log out from vFuel:
exit
- Log out from vCIC:
exit
- Continue with Section 2.2.
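The logon and removal steps above can be combined into a single command line that is reviewed before it is run. This is only a sketch, assuming that the vCIC is used as an SSH hop to vFuel in the same way as in the manual steps and that removeceenode is available to a non-interactive root shell on vFuel. The function name print_remove_cmd and the user name operator are hypothetical. The -t option is the standard OpenSSH option that allocates a terminal, so that password prompts still work.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the nested SSH command that
# would issue removeceenode on vFuel through the vCIC.
# Arguments: <personal-user> <vcic_address> <vFuel (static)> <hostname>
print_remove_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: print_remove_cmd USER VCIC VFUEL HOSTNAME" >&2
        return 2
    fi
    printf 'ssh -t %s@%s ssh -t root@%s removeceenode --name %s\n' \
        "$1" "$2" "$3" "$4"
}

# Example with the sample addresses used in this document.
print_remove_cmd operator 10.0.22.10 192.168.0.11 compute-0-3
```

Printing the command instead of executing it keeps the operator in control of a destructive step. If the removal fails, the instruction above still applies: stop and consult the next level of support.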
2.2 Hardware Replacement and Configuration of the Server
Perform the following steps:
- Depending on the HW infrastructure used, refer to the relevant instruction indicated below and perform the steps provided for hardware installation of the server, including BIOS configuration:
- HP-based systems: HP c7000 Server HW Replacement and HP c7000 Server BIOS Configuration.
- BSP-based systems: CPI documentation of the Blade Server Platform (BSP). Use section Replace Device Board in the instruction Manage Blade, Reference [1].
- Dell-based systems:
- HDS-based systems: Hyperscale Datacenter System 8000 Customer Documentation, Reference [2]
- Other HW: documentation provided by the manufacturer.
- Note:
- In the case of unmanaged servers, the new server needs to be discovered in vFuel before the installation command is executed. After the new server has been configured, do the following:
- Set the boot device to PXE.
- Force restart the server.
- Verify in vFuel that the new server has been discovered.
- Continue with Section 2.3.
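For unmanaged servers, the two steps "set the boot device to PXE" and "force restart the server" can be done from a management host when the server has a baseboard management controller (BMC). The sketch below is not part of the referenced hardware instructions. It assumes that the BMC is reachable with IPMI over LAN and that ipmitool is installed. The function name, the BMC address, and the user name are hypothetical. The commands are only printed, so that they can be checked against the hardware documentation first.

```shell
#!/bin/sh
# Hypothetical helper: prints the ipmitool commands for the two steps
# "set the boot device to PXE" and "force restart the server".
# Assumptions: the BMC supports IPMI over LAN (lanplus interface), and the
# BMC password is supplied in the IPMI_PASSWORD environment variable (-E).
print_pxe_restart_cmds() {
    bmc_address=$1
    bmc_user=$2
    printf 'ipmitool -I lanplus -H %s -U %s -E chassis bootdev pxe\n' \
        "$bmc_address" "$bmc_user"
    printf 'ipmitool -I lanplus -H %s -U %s -E chassis power reset\n' \
        "$bmc_address" "$bmc_user"
}

# Example with a documentation-range address and a placeholder user.
print_pxe_restart_cmds 192.0.2.10 admin
```

Whether the new server has been discovered must still be verified in vFuel as described in the note above.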
2.3 Executing the Installation Command
Perform the following steps:
- Log on to a vCIC by using SSH:
ssh <personal-user>@<vcic_address>
Example:
ssh <personal-user>@10.0.22.10
- If prompted, provide the user password.
- Log on to vFuel using SSH:
ssh root@<vFuel (static)>
Example:
ssh root@192.168.0.11
- If prompted, provide the user password.
- If the replacement server has different characteristics from those of the old server (for example, if a GEP5 blade is exchanged for a GEP7 blade), update the server parameter in the /mnt/cee_config/config.yaml file. For more information, refer to the platform-related System Dimensioning Guide and the Configuration File Guide.
- Issue the following command in order to start the installation:
expandcee --repair
- Note:
- Verification for the newly replaced server is also performed by the command.
- If the command fails, do the following:
- Collect the console printout.
- Collect all logs by referring to Data Collection Guideline.
- Consult the next level of maintenance support. Stop this workflow. Further actions are outside the scope of this instruction.
- Log out from vFuel:
exit
- Continue with Section 2.4.
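Collecting the console printout is easier if the output of the installation command is saved while it runs. The following is a minimal sketch of one way to do this, assuming a POSIX shell with tee and mktemp available on the node where the command is issued. The function name run_logged and the log file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: runs a command, shows its output live, saves a copy
# to a log file, and returns the exit status of the command itself.
# Usage: run_logged <log_file> <command> [args...]
run_logged() {
    log_file=$1
    shift
    status_file=$(mktemp) || return 1
    # The brace group records the exit status of the command, because the
    # status of the pipeline as a whole would be that of tee.
    { rc=0; "$@" 2>&1 || rc=$?; echo "$rc" > "$status_file"; } | tee "$log_file"
    status=$(cat "$status_file")
    rm -f "$status_file"
    return "$status"
}

# Example (on vFuel):
#   run_logged "repair-$(date +%Y%m%d-%H%M%S).log" expandcee --repair
```

A non-zero return value indicates that the wrapped command failed. The saved log file can then be handed over together with the logs collected according to the Data Collection Guideline.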
2.4 Fuel Synchronization
To synchronize the Fuel VM, follow the instructions in Fuel Synchronization.
Continue with Section 2.5.
2.5 Concluding Routine
Perform the following steps:
- Check that there are no additional active alarms. If new active alarms are found, act on them according to the relevant Alarm OPI.
- Check that there are no alerts created during the repair process. If any such alerts are present, act on them according to the relevant OPI.
- Log out from the vCIC.
exit
- Collect all tools and equipment.
- Report that the server has been replaced.
- Handle the removed unit according to company procedures regarding repair and data security.
- Note:
- Sensitive data may be present on the disk.
- Carry out any remaining actions according to the work order, if applicable.
- The job is completed.
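The alarm check in the concluding routine compares the current alarms with the baseline taken in Section 2.1. The sketch below shows one way to make that comparison, assuming that both alarm lists have been saved as text files with one alarm per line. How the lists are exported is not covered here, and the function name new_alarms is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints alarms that are present in the "after" list
# but not in the "before" baseline.
# Assumption: each file holds one alarm per line.
# Usage: new_alarms <before_file> <after_file>
new_alarms() {
    before_sorted=$(mktemp) || return 1
    after_sorted=$(mktemp) || { rm -f "$before_sorted"; return 1; }
    sort -u "$1" > "$before_sorted"
    sort -u "$2" > "$after_sorted"
    # comm -13 suppresses lines found only in the first file and lines
    # common to both, leaving the lines found only in the second file.
    comm -13 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}
```

An empty result means that no additional alarms appeared. Any line printed is a new alarm to be handled according to the relevant Alarm OPI.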
Reference List
[1] Manage Blade, 53/1543-APR 901 0549/1
[2] Hyperscale Datacenter System 8000 Customer Documentation, 2/1551-LZN 901 5032
