CEE SW Update and Rollback
Cloud Execution Environment

Contents

1   Introduction
1.1   Prerequisites
1.2   Tools and Equipment
1.3   Conditions and Limitations

2   Update
2.1   Fuel Update
2.2   Updating vCIC and Compute Hosts
2.3   Finalization Steps

3   Rollback
3.1   Select or Restore Old Repository
3.2   vCIC and Compute Hosts Rollback

4   Update State

5   Error Handling

Reference List

1   Introduction

Note:  
CEE software rollback is not supported in this CEE release.

This document is used for performing a Cloud Execution Environment (CEE) software update or rollback between CEE R6 releases.

For information about Atlas upgrade, refer to Atlas SW Upgrade.

1.1   Prerequisites

In case of single server environment, vFuel has to be enabled before the update and disabled after the update. Refer to vFuel On Demand Use for detailed instructions.

1.2   Tools and Equipment

No tools or equipment are needed.

1.3   Conditions and Limitations

There must be no active alarms in the system when starting the update or rollback process.

The environment must be healthy. Perform a health check as described in Health Check Procedure.

1.3.1   CEE Update with Deployed SR-IOV VMs

If the version of the deployed CEE system is R6.2 or earlier and it runs VMs based on SR-IOV networking, an additional step is required before starting the update process.

An additional configuration parameter, physical_network, must be added to the config.yaml file for every SR-IOV PF on each compute host, as shown in the example below:

Example 1   physical_network SR-IOV Configuration Parameter

  sriov_configs:
  - &DELL_620_sriov_info
    - pci_address: "0000:41:00.0"
      physical_network: "pool_0000_41_00_0"
    - pci_address: "0000:41:00.1"
      physical_network: "pool_0000_41_00_1"
  shelf:
    ...
      blade:
        -
          id: 5
          blade_mgmt:
            name: subrack_ctrl_sp
            ip: 10.0.3.24
            username: admin
            passwd: ericsson
            nic_assignment: *DELL_620_nic_assignment
            reservedHugepages: *DELL_620_reservedHugepages
            reservedCPUs: *auto_reservedCPUs
            reservedDisk: *reservedDisk
            sriov:
              vf: 8
              devices: *DELL_620_sriov_info

The name of the physical_network must be derived from the PCI address of the respective NIC device, by prepending "pool" to the address and changing colons ":" and dots "." to underscores "_", as seen in the example above. If the physical_network names do not follow the R6.2 algorithm mentioned above, the SR-IOV VMs must be stopped before the update and reinitialized with the new pool names. For more information, refer to the CEE Architecture Description.
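As a sketch, the naming rule above can be expressed in shell (the PCI address is taken from Example 1; the snippet uses no CEE tooling):

```shell
# Derive the physical_network name from a PCI address:
# prepend "pool_" and replace ':' and '.' with '_'.
pci_address="0000:41:00.0"
physical_network="pool_$(printf '%s' "$pci_address" | tr ':.' '__')"
echo "$physical_network"    # pool_0000_41_00_0
```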

The already assigned physical networks can be checked in the Neutron configuration file /etc/neutron/plugins/ml2/sriov_agent.ini on the affected compute hosts, as shown in the example below:

Example 2   Example Assigned Physical Networks

cat /etc/neutron/plugins/ml2/sriov_agent.ini
...
[sriov_nic]
physical_device_mappings = pool_0000_41_00_0:eth8,pool_0000_41_00_1:eth9

2   Update

This section describes the procedure for updating the CEE software.

The node names in the examples below follow the CEE structure used in this document.

Note:  
The same examples are used for the rollback procedure, at the relevant steps.

2.1   Fuel Update

Perform the following steps:

  1. In case of single server deployment, proceed to Step 4.
  2. Log on to the vCIC using SSH:

    ssh -X <personal-user>@<vcic_address>

  3. If prompted, provide the user password.
  4. Log on to the vFuel using SSH:

    ssh -X root@<Fuel (static)>

    Example:

    ssh -X root@192.168.0.11

    Note:  
    Connectivity to the vCICs will be lost during the update.

  5. If prompted, provide the user password.
  6. It is recommended to perform a Fuel synchronization. For detailed instructions, refer to Fuel Synchronization.
  7. Check the amount of free space on /root with the command:

    df -h

    Note:  
    If there is not enough free space for the ISO image, free up some space before moving on to the next step.

  8. Transfer the ISO image to the home of the root user on vFuel.
  9. To ensure that the update process is not interrupted, start a screen session and run the commands in it:

    screen -L

    Later during the update, if a node is rebooted and the connection to Fuel is lost, log back in to Fuel using the steps above, and reattach the screen session with the command:

    screen -r

    Note:  
    The screen session can only be reattached after the node has rebooted and is back online.

    After exiting the screen session, a screenlog.X log file will be available in the current working directory.


  10. Create a list of the repositories:

    /usr/share/ericsson-orchestration/scripts/ericsson_repo.py list

    Make a note of the repositories used, including the old ones.

  11. Mount the transferred ISO image using the following commands:

    mkdir -p /mnt/update
    mount -o loop <ISO> /mnt/update

  12. Update Fuel repository by running the following commands:

    rsync -ar --delete /mnt/update/isolinux/ /var/www/nailgun/centos/x86_64/isolinux/

    rsync -ar --delete /mnt/update/Packages/ /var/www/nailgun/centos/x86_64/Packages/

    rsync -ar --delete /mnt/update/images/ /var/www/nailgun/centos/x86_64/images/

    rsync -ar --delete /mnt/update/repodata/ /var/www/nailgun/centos/x86_64/repodata/

  13. Copy the repository:

    /mnt/update/ericsson_repo.py copy /mnt/update/

  14. Select the new repository from the list:

    /mnt/update/ericsson_repo.py list
    /mnt/update/ericsson_repo.py select <repository>

  15. Check the Limitations and Workarounds for Cloud Execution Environment (CEE), Reference [1], for any workarounds and apply them.
  16. Continue with Section 2.2.

2.2   Updating vCIC and Compute Hosts

Perform the update following the steps in the given order:

  1. Create backup for the configuration YAML files:

    mkdir -p /mnt/cee_config/backup-<date>

    cp /mnt/cee_config/*.yaml /mnt/cee_config/backup-<date>

  2. Check if there are any changes in the config.yaml between the releases. If necessary, update the config.yaml using the new templates bundled with the ISO image.
  3. Update the install framework:

    /opt/ecs-fuel-utils/install_ceescripts.sh fuelrestore

  4. If not updating from R6.0, continue with Step 5. If updating from R6.0, follow these steps:
    1. Download the attached files from https://cc-jira.rnd.ki.sw.ericsson.se/browse/CLD-2334.
    2. Copy the downloaded files to the /usr/share/ericsson-orchestration/playbooks/ folder.
    3. Open the directory containing the playbooks:

      cd /usr/share/ericsson-orchestration/playbooks/

    4. Run the following playbook:

      openstack-ansible R6_0-to-R6_2-update.yml

    5. Verify the result according to Step 7.
  5. Create a text file containing the vCIC node names to be updated:

    Example:

    cic-1
    cic-2
    cic-3

    Note:  
    CICs can be updated one by one, instead of in a single step, by entering a single name into the text file.

  6. Update the vCICs with the following command:

    /usr/share/ericsson-orchestration/scripts/update-serial.sh <cic_nodes.txt>

    Note:  
    The vCIC nodes are updated one after the other in alphabetical order (serial method).

  7. Verify the result by checking the last row in the displayed output. It must be similar to the following example:

    # cic-1 : ok=38 changed=6 unreachable=0 failed=0

    Ensure that the values fulfill the following rules:

    ok=           must not be zero
    changed=      any number is acceptable
    unreachable=  must be 0
    failed=       must be 0
  8. In case of a multi-server deployment, repeat the previous steps, starting from Step 5, if there are any vCICs that are not updated.

    In case of single server deployment, proceed to Step 10.

  9. In case of a multi-server deployment, update the host where vFuel is running:
    1. Prepare a text file with the compute host name where vFuel is running, for example:

      compute-0-1

    2. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-serial.sh <compute_host.txt>

    3. Verify the result by checking the output as described in Step 7.
    Note:  
    During the update the vFuel host is restarted, and consequently the vCIC is also restarted. The connection to vFuel is lost, and a new connection has to be established (including reattaching the screen session) as described in Step 9 in Section 2.1.

  10. Update the hosts where vCICs are running:
    1. Prepare a text file with the compute host names where vCICs are running, for example:

      compute-0-2
      compute-0-3

      Note:  
      The Compute hosts are updated in alphabetical order.

      Compute hosts can be updated one by one, instead of in a single step, by entering a single name into the text file.

      VMs are handled according to Nova migration policy, refer to OpenStack Compute API in CEE for details.


    2. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-serial.sh <compute_hosts.txt>

    3. Verify the progress and result by checking the output as described in Step 7.
    4. Perform a health check for the updated host; refer to Health Check Procedure for details.
    5. In case of a multi-server deployment, repeat the previous steps, starting from Step 10, until all hosts hosting vCICs are updated.
  11. In case of single server deployment, proceed to Step 14.
  12. Update the Compute hosts either one after the other (serial method) or concurrently (parallel method).

    Serial method

    1. Create a text file containing the compute host names, for example:

      compute-0-4
      compute-0-5

    2. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-serial.sh <compute_hosts.txt>

      The Compute hosts are updated in alphabetical order.

    3. Verify the progress and result by checking the output as described in Step 7.

    Parallel method

    1. Determine the number of hosts that can be updated at the same time. The size of the group is determined by the available free capacity in the region. For example, if the free capacity is enough to host VMs currently located in two compute nodes, then the maximum usable size is two.
    2. Create text files containing the compute host names for each group to be updated together, for example:

      compute-0-6
      compute-0-7

    3. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-parallel.sh <compute_host_group_n.txt>

    4. Verify the result by checking the output as described in Step 7.
  13. Repeat the previous steps, starting from Step 12, until all Compute hosts are updated.
  14. Continue with Section 2.3.
Note:  
In case of any errors during the process, see Section 5.
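The recap check described in Step 7 can also be scripted. The following is a minimal sketch, where check_recap is a hypothetical helper (not part of the CEE tooling) that applies the rules from Step 7 to one recap line:

```shell
# Hypothetical helper: check one Ansible recap line against the rules in
# Step 7 (ok must not be zero, unreachable and failed must be 0).
check_recap() {
  line="$1"
  ok=$(printf '%s' "$line" | sed -n 's/.*ok=\([0-9]*\).*/\1/p')
  unreachable=$(printf '%s' "$line" | sed -n 's/.*unreachable=\([0-9]*\).*/\1/p')
  failed=$(printf '%s' "$line" | sed -n 's/.*failed=\([0-9]*\).*/\1/p')
  if [ "$ok" -gt 0 ] && [ "$unreachable" -eq 0 ] && [ "$failed" -eq 0 ]; then
    echo "PASS"
  else
    echo "FAIL"
  fi
}

check_recap "cic-1 : ok=38 changed=6 unreachable=0 failed=0"    # PASS
```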

2.3   Finalization Steps

  1. Check that the update is performed successfully. Run the health check according to the Health Check Procedure.
  2. After an update, the system has changed, and new vCIC backups must be created for possible future vCIC repair actions. This means that the vCICs are brought down to maintenance mode one at a time. Issue the command:

    backup_vcic_images

  3. If the update is performed successfully, the old repository or repositories can be deleted by using the following command:

    /mnt/update/ericsson_repo.py delete <old_version>

  4. Exit the screen session using the following command:

    exit

3   Rollback

Note:  
CEE software rollback is not supported in this CEE release.

Note:  
Rollback to CEE R6.0 is not supported.

This section describes the procedure for rolling back the CEE software.

3.1   Select or Restore Old Repository

Perform the following steps:

  1. Connect to vFuel if not already connected:
    1. Log on to the vCIC by using SSH:

      ssh -X <personal-user>@<vcic_address>

    2. If prompted, provide the user password.
    3. Log on to the vFuel by using SSH:

      ssh -X root@<Fuel (static)>

      Example:

      ssh -X root@192.168.0.11

      Note:  
      Connectivity to the vCICs will be lost during rollback.

  2. If not in a screen session, then start one:

    screen -L

  3. Collect logs according to the Data Collection Guideline.
  4. Select the old repository by using the following commands:

    /usr/share/ericsson-orchestration/scripts/ericsson_repo.py list

    /usr/share/ericsson-orchestration/scripts/ericsson_repo.py select <old_repo_name>

  5. Check the Limitations and Workarounds for Cloud Execution Environment (CEE) of the old version for any installation or update related workarounds, and the Limitations and Workarounds for Cloud Execution Environment (CEE), Reference [1], of the new version for any rollback related workarounds, and apply them.
  6. Continue with Section 3.2.

3.2   vCIC and Compute Hosts Rollback

Attention!

Perform the rollback process only for the nodes or hosts that have been updated in Section 2.

Rollback to the previous CEE software level by performing the following steps:

  1. Review and restore the configuration YAML files from the backup created during the update process in Section 2.2:

    cp /mnt/cee_config/backup-<date>/*.yaml /mnt/cee_config

  2. Update the install framework:

    /opt/ecs-fuel-utils/install_ceescripts.sh

  3. Create a text file containing the vCIC node name to roll back:

    Example:

    cic-1

  4. Roll back the vCICs with the following command:

    /usr/share/ericsson-orchestration/scripts/update-serial.sh <cic_nodes.txt>

    Note:  
    Rollback of the vCIC nodes happens one after the other in alphabetical order (serial method).

  5. Verify the result by checking the last row in the displayed output. It must be similar to the following example:

    # cic-1 : ok=38 changed=6 unreachable=0 failed=0

    Ensure that the values fulfill the following rules:

    ok=           must not be zero
    changed=      any number is acceptable
    unreachable=  must be 0
    failed=       must be 0
  6. In case of a multi-server deployment, repeat the previous steps, starting from Step 3, until all vCICs are rolled back.

    In case of a single server deployment, continue with Step 8.

  7. Roll back the host where vFuel is running:
    1. Prepare a text file with the compute host name where vFuel is running, for example:

      compute-0-1

    2. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-serial.sh <compute_host.txt>

    3. Verify the result by checking the output as described in Step 5.
    Note:  
    During the rollback the vFuel host is restarted, and consequently the vCIC is also restarted. The connection to vFuel is lost, and a new connection needs to be established (including reattaching the screen session) as described in Step 9 in Section 2.1.

  8. Roll back the hosts where vCICs are running:
    1. Prepare a text file with the compute host names where vCICs are running, for example:

      compute-0-2
      compute-0-3

      Note:  
      The Compute hosts are rolled back in alphabetical order.

      Compute hosts can be rolled back one by one, instead of in a single step, by entering a single name into the text file.


    2. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-serial.sh <compute_hosts.txt>

    3. Verify the progress and result by checking the output as described in Step 5.
    4. In case of a multi-server deployment, repeat the previous steps, starting from Step 8, until all hosts hosting vCICs are rolled back.
    5. In case of a single server deployment, continue with Step 12.
  9. Roll back the Compute hosts either one after the other (serial method) or concurrently (parallel method).

    Serial method

    1. Create a text file containing the compute host names, for example:

      compute-0-4
      compute-0-5

    2. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-serial.sh <compute_hosts.txt>

      The Compute hosts are rolled back in alphabetical order.

    3. Verify the progress and result by checking the output as described in Step 5.

    Parallel method

    1. Determine the number of hosts that can be rolled back at the same time. The size of the group is determined by the available free capacity in the region. For example, if the free capacity is enough to host VMs currently located in two compute nodes, then the maximum usable size is two.
    2. Create text files containing the compute host names for each group to be rolled back together, for example:

      compute-0-6
      compute-0-7

    3. Issue the command:

      /usr/share/ericsson-orchestration/scripts/update-parallel.sh <compute_host_group_n.txt>

    4. Verify the result by checking the output as described in Step 5.
  10. After a rollback, the system has changed, and new vCIC backups must be created for possible future vCIC repair actions. This means that the vCICs are brought down to maintenance mode one at a time. Issue the command:

    backup_vcic_images

  11. If the rollback is performed successfully, the repository or repositories that are no longer needed can be deleted by using the following command:

    /mnt/update/ericsson_repo.py delete <old_version>

  12. Exit the screen session:

    exit

4   Update State

During the update or rollback process, the state of the update can be checked at any time from Fuel. Run the following command, optionally specifying one or more node names:

update_state [node_name]

This gives a short state report of the nodes. The following is an example of the update state report:

Example 3   Update State Report

[root@fuel ~]# update_state
+--------------+----------+----------------------+----------------------+
|     Node     |  State   |       Current        |        Target        |
+--------------+----------+----------------------+----------------------+
| compute-0-1  | started  | 16-R2A23-5cc008f-7.0 | 16-R3A24-386cd1b-7.0 |
| compute-0-2  |  queued  | 16-R2A23-5cc008f-7.0 |         None         |
| compute-0-3  |  queued  | 16-R2A23-5cc008f-7.0 |         None         |
| compute-0-4  |  queued  | 16-R2A23-5cc008f-7.0 |         None         |
| compute-0-5  |  queued  | 16-R2A23-5cc008f-7.0 |         None         |
| compute-0-6  |  queued  | 16-R2A23-5cc008f-7.0 |         None         |
|    cic-1     | finished | 16-R3A24-386cd1b-7.0 |         None         |
|    cic-2     | finished | 16-R3A24-386cd1b-7.0 |         None         |
|    cic-3     | finished | 16-R3A24-386cd1b-7.0 |         None         |
+--------------+----------+----------------------+----------------------+
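On systems with many nodes, it can help to filter the table for nodes that are not yet finished. The following is a minimal sketch on captured sample rows; on vFuel, the output of update_state can be piped into the same awk filter:

```shell
# Hypothetical filter: list nodes whose State column is not 'finished'.
# Shown here on captured sample rows; on vFuel, pipe update_state directly.
sample_rows='| compute-0-1  | started  | 16-R2A23-5cc008f-7.0 | 16-R3A24-386cd1b-7.0 |
|    cic-1     | finished | 16-R3A24-386cd1b-7.0 |         None         |'

printf '%s\n' "$sample_rows" | awk -F'|' '$3 !~ /finished|State/ { gsub(/ /, "", $2); if ($2 != "") print $2 }'
```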

5   Error Handling

If any error occurs during the update, follow these steps to recover:

  1. Check the following logs:
    1. /var/log/ansible.log
    2. The logs of the failed systems according to ansible.log.
  2. Perform data collection according to the Data Collection Guideline.
  3. Fix any problems found, and rerun the update on the failing node.

    If the error symptoms do not suggest any possible solution, follow these steps:

    1. Restore the vFuel by following the procedure in the document Fuel Synchronization.
    2. Roll back the nodes that failed during the update, and also roll back the other, already updated nodes.
  4. Contact the next level of support.

Reference List

[1] Limitations and Workarounds for Cloud Execution Environment (CEE) AZE 102 01/5 R5A, 5/109 21-AZE 102 01/5-4 Uen


Copyright

© Ericsson AB 2016. All rights reserved. No part of this document may be reproduced in any form without the written permission of the copyright owner.

Disclaimer

The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.

Trademark List
All trademarks mentioned herein are the property of their respective owners. These are shown in the document Trademark Information.
