OpenStack Compute API in CEE
Cloud Execution Environment

Contents

1   Introduction
1.1   API Version
1.2   Document References
1.2.1   API Design Base Reference

2   Supported Operations
2.1   Basic OpenStack Operations
2.2   OpenStack Extensions

3   Ericsson Extended Functions
3.1   forcemove
3.1.1   API Operation
3.1.2   VM Migration and Evacuation
3.1.3   Scheduling
3.1.4   Hints for VM Affinity and Antiaffinity
3.2   redefine
3.2.1   API Operation
3.3   PCI Passthrough and SR-IOV Physical Function Passthrough Use in OpenStack
3.3.1   Nova Flavor Creation for PCI Passthrough Ports and SR-IOV Physical Function Passthrough Ports
3.4   Bandwidth Based Scheduling
3.4.1   API Operation
3.4.2   VMs with and without Bandwidth Requirements

4   Limitations
4.1   Limitations for Multi-Server Deployment
4.2   Limitations for Single Server Deployment

5   Additional Information
5.1   Host Aggregates in CEE
5.1.1   Removal of Host from Host Aggregate
5.2   Libvirt Real Time Instances

1   Introduction

This document serves as an introduction to the use of the Application Programming Interface (API) of the OpenStack component "compute" in the Cloud Execution Environment (CEE).

While the main aim of the document is to present the compute API in CEE, it also contains descriptive information about the features of CEE compute.

Note:  
Unless otherwise indicated, values in this document are given in KB, MB, and GB, according to JESD100B.01. For more information, refer to Units of measurement in the document Glossary of Terms and Acronyms.

1.1   API Version

By default, the external Virtualized Network Function Manager (VNFM) and Network Function Virtualization Orchestrator (NFVO) use the CEE compute API microversion based on the OpenStack compute API v2.1.

Note:  
OpenStack compute API v2.0 is supported but deprecated in this release and is to be removed in the next major release.

1.2   Document References

This section contains the official OpenStack API reference.

1.2.1   API Design Base Reference

For the description of the API operations, refer to the OpenStack Compute API and section "Compute" in the OpenStack Administrator Guide.

These are stored copies of the OpenStack document versions that were the base for the development of this version of CEE.

2   Supported Operations

The following sections contain the API operations and API extensions that are supported in CEE.

2.1   Basic OpenStack Operations

For the detailed description of basic compute API operations, refer to the OpenStack Compute API.

CEE compute supports basic compute API operations, with the limitations listed in Section 4.

2.2   OpenStack Extensions

Not applicable for OpenStack compute API v2.1.

3   Ericsson Extended Functions

This section presents the extended API functions that are specific to CEE.

3.1   forcemove

forcemove is an API function that supports the migration and evacuation of Virtual Machines (VMs) from a compute host.

forcemove uses Nova rescheduling, honoring the same_host and different_host hints, the server_groups affinity filter, and the High Availability (HA) policy.

Note:  
The HA policy is configured in the metadata field of the VM.

3.1.1   API Operation

In CEE, the standard OpenStack API is extended with a call for forcemove:

POST v2/<tenant_id>/servers/<server_id>/action

Request body:

{
  "forcemove": {
      "ignore_hints": false,
      "ignore_broken_dependencies": false,
      "block_migrate": false,
      "disk_over_commit": false
  }
}

Note:  
All four forcemove parameters must be passed, even though in most cases they are false (the OpenStack default).

If the request succeeds, the response body is the following:

{
  "needs_start": false
}

In case of an error, the response body is the following:

{
  "badRequest": {
       "message": "No valid host was found.", 
       "code": 507
  }
}

Note:  
The message and the code can vary.
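
For illustration, the complete forcemove call can be issued with curl; a minimal sketch, assuming a valid token and compute endpoint (all placeholders are illustrative):

curl -s -X POST "https://<compute_endpoint>/v2/<tenant_id>/servers/<server_id>/action" \
  -H "X-Auth-Token: <token>" \
  -H "Content-Type: application/json" \
  -d '{"forcemove": {"ignore_hints": false, "ignore_broken_dependencies": false, "block_migrate": false, "disk_over_commit": false}}'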

3.1.2   VM Migration and Evacuation

CEE allows the configuration of HA policy for each VM. This policy is honored by the forcemove function.

The CEE 15B policies VM Migration and VM Evacuation were removed in CEE 6 and replaced by HA policies. For more information, see Section 3.1.2.1.

3.1.2.1   HA Policy

The user can define the level of HA needed for a specific VM. Three levels of HA can be achieved by defining an HA policy on the VM.

The possible configuration values for ha-policy are the following:

unmanaged "Unmanaged" means that CEE does not try to manage the VM after it was started. No action is performed by CEE on this VM. (This is the default, if no HA-policy is provided.)
managed-on-host "Managed on host" means that the VM starts up with the host and shuts down with it. In case of failure, the VM is not moved to another host, but it is restarted when the node is restarted. On forcemove, the VM is shut down.
ha-offline "High Availability with offline migration" means that the VM is evacuated in case of failure, moved to another host on forcemove.
Note:  
ha-policy is case-sensitive.

3.1.2.2   Policy Configuration

During the creation of a VM, configure policies in the metadata field in the following way:

{ 
  "server":{ 
     "flavorRef":"http://openstack.example.com/openstack/flavors/1", 
     "imageRef":"http://openstack.example.com/openstack/images/70a599e0-31e7-49b7-b260-868f441e862b", 
     "metadata":{ 
        "ha-policy":"managed-on-host" 
     }, 
     "name":"new-server-test" 
  } 
} 

Here, the ha-policy is set to managed-on-host.
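
The same metadata key can also be set on an existing server with the standard CLI; a sketch, assuming CEE evaluates the updated policy the next time it acts on the VM:

openstack server set --property ha-policy=managed-on-host new-server-test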

3.1.3   Scheduling

The default OpenStack migration call contains a parameter for the destination compute host to which the VM is moved.

For evacuation, <host> is optional. Syntax:

nova evacuate [--password <password>] [--on-shared-storage] <server> [<host>]

In CEE, a VM can simply be moved from one compute host to another in the cluster, using compute rescheduling to choose the destination compute host.

This process is supported by the forcemove operation.

3.1.4   Hints for VM Affinity and Antiaffinity

In order to have high availability and redundancy, the application layer needs to guarantee that some VMs are on different compute hosts.

To achieve this, OpenStack provides a feature called "scheduler hints". For example, if vm1 is booted, and after that vm2 is booted using --hint different_host=<vm1_uuid>, then OpenStack guarantees that vm2 runs on a different node than vm1.

The forcemove functionality of CEE patches OpenStack to save the hints used during boot.

3.1.4.1   Supported Hints

In CEE, the server_group hint is supported and recommended; refer to the section "Server groups (os-server-groups)" in the OpenStack Compute API.

The hints same_host and different_host are also supported, but deprecated.
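
As a sketch of the recommended approach, an anti-affinity server group can be created and referenced at boot time (names and placeholders are illustrative):

openstack server group create --policy anti-affinity ha-group
nova boot vm1 --image <image_id> --flavor <flavor_name> --hint group=<server_group_uuid>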

3.1.4.2   Prerequisites

In order to use scheduler hints, the corresponding filters must be enabled through the scheduler_default_filters option in nova.conf.

Note:  
This prerequisite is automatically met during installation and does not require further action.

3.1.4.3   Recommended Setup

The following scheduling filters are configured with CEE:

3.1.4.4   Hint Configuration

During the boot of a VM, configure scheduler hints in the following way:

{
  "os:scheduler_hints": {
    "different_host": "f2e31dcd-927b-4231-a652-3ceb42c9182e"
  },
  "server": {
    "name": "test-server",
    "imageRef": "e5bb056f-af7e-4d10-9b85-fca2519a74a0",
    "flavorRef": "1",
    "max_count": 1,
    "min_count": 1,
    "networks": [{"uuid": "d67ccfaf-0de5-4ae6-9cbb-765882d1c895"}]
  }
}
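
An equivalent CLI sketch, passing the same hint at boot time:

nova boot test-server --image e5bb056f-af7e-4d10-9b85-fca2519a74a0 --flavor 1 \
  --nic net-id=d67ccfaf-0de5-4ae6-9cbb-765882d1c895 \
  --hint different_host=f2e31dcd-927b-4231-a652-3ceb42c9182e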

3.1.4.5   Broken Dependencies

forcemove always tries to honor the saved scheduler hints, but in certain cases it cannot follow them.

Example 1   Broken Dependencies 1

$ nova boot vm1 --image ... --flavor ... # this vm got id 8088e8b6-fd1a-4bf2-bff5-e2debb668a3a
$ nova boot vm2 --image ... --flavor ... --hint same_host=8088e8b6-fd1a-4bf2-bff5-e2debb668a3a
$ nova delete vm1
$ nova forcemove vm2
+--------------------------------------+---------------+---------------------------+-------------+
| Server UUID                          | Move accepted | Error Message             | Needs Start |
+--------------------------------------+---------------+---------------------------+-------------+
| 59961962-fc06-4d90-82e4-366aaa3a9b38 | False         | No valid host was found.  | False       |
+--------------------------------------+---------------+---------------------------+-------------+


Here, forcemove tries to use the saved hint, but cannot succeed, since vm1 has been deleted.

However, deleted VMs can be ignored during the handling of hints:

Example 2   Broken Dependencies 2

$ nova forcemove vm2 --ignore-broken-dependencies
+--------------------------------------+---------------+---------------+-------------+
| Server UUID                          | Move accepted | Error Message | Needs Start |
+--------------------------------------+---------------+---------------+-------------+
| 59961962-fc06-4d90-82e4-366aaa3a9b38 | True          |               | False       |
+--------------------------------------+---------------+---------------+-------------+


Here, the hint same_host=8088e8b6-fd1a-4bf2-bff5-e2debb668a3a is ignored.

Note:  
If the server_groups affinity filter is used, hints can be ignored, but forcemove does not check for deleted VMs, so broken dependencies have no effect on forcemove.

3.1.4.6   Ignore Hints

forcemove can be configured to ignore all hints:

Example 3   Ignore Hints

$ nova forcemove vm1 --ignore-hints
+--------------------------------------+---------------+---------------+-------------+
| Server UUID                          | Move accepted | Error Message | Needs Start |
+--------------------------------------+---------------+---------------+-------------+
| 59961962-fc06-4d90-82e4-366aaa3a9b38 | True          |               | False       |
+--------------------------------------+---------------+---------------+-------------+


3.2   redefine

redefine is an API function that recreates the definition of a VM on a compute host from the Nova database. It is used when the libvirt XML or the file system image is missing.

nova redefine creates a new libvirt XML based on the previously stored content in the Nova database and defines the instance for libvirt.

If the file system image is missing (for example, in case of board replacement), nova redefine downloads the base image from Glance.

The functionality checks if the following criteria apply:

If the above criteria are not met, redefine raises an exception.

Note:  
If the VM was in ACTIVE state before the redefine operation, it starts automatically once the operation is completed. If the VM was in SHUTOFF state, it has to be started manually.

3.2.1   API Operation

In CEE, the standard OpenStack API is extended with a call for redefine:

POST v2.1/<tenant_id>/servers/<server_id>/action

Request body:

{
  "redefine": {}
}

An example response body for a successful request is:

{"adminPass": "x8C3iYRUnJ86"}

Note:  
The content of the response can vary.
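
The call can be issued, for example, with curl; a minimal sketch (placeholders are illustrative):

curl -s -X POST "https://<compute_endpoint>/v2.1/<tenant_id>/servers/<server_id>/action" \
  -H "X-Auth-Token: <token>" \
  -H "Content-Type: application/json" \
  -d '{"redefine": {}}'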

3.3   PCI Passthrough and SR-IOV Physical Function Passthrough Use in OpenStack

This section describes Nova flavor creation for PCI passthrough and SR-IOV Physical Function passthrough ports and VM boot using PCI passthrough or SR-IOV Physical Function passthrough devices.

Note:  
Interface configuration and driver installation on the tenant VM is not performed by CEE and is the responsibility of the operator.

3.3.1   Nova Flavor Creation for PCI Passthrough Ports and SR-IOV Physical Function Passthrough Ports

VMs can be provisioned with PCI passthrough ports using the Nova extra spec pci_passthrough:alias with the aliases defined before deployment in the configuration file, or using the default aliases. The default aliases are the following:

For more information about PCI passthrough alias configuration, refer to the Configuration File Guide.

An example of a flavor configuration that passes two PCI passthrough devices, predefined in the configuration file under the alias my_pci_dev, to the VM:

openstack flavor create --ram 4096 --disk 100 --vcpus 2 m1.medium.pci_passthrough
openstack flavor set --property "pci_passthrough:alias"="my_pci_dev:2" m1.medium.pci_passthrough
nova flavor-key m1.medium.pci_passthrough set hw:mem_page_size=1048576
nova flavor-key m1.medium.pci_passthrough set hw:cpu_policy=dedicated

An example of a flavor configuration that passes two SR-IOV Physical Function passthrough devices, available under the default alias PF_devs, to the VM:

openstack flavor create --ram 4096 --disk 100 --vcpus 2 m1.medium.sr-iov_pf_pt
openstack flavor set --property "pci_passthrough:alias"="PF_devs:2" m1.medium.sr-iov_pf_pt
nova flavor-key m1.medium.sr-iov_pf_pt set hw:mem_page_size=1048576
nova flavor-key m1.medium.sr-iov_pf_pt set hw:cpu_policy=dedicated
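
A VM can then be booted with one of these flavors; a hedged sketch, where the image and network IDs are placeholders:

nova boot --flavor m1.medium.pci_passthrough --image <image_id> \
  --nic net-id=<network_id> pci-passthrough-vm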

3.4   Bandwidth Based Scheduling

Bandwidth based scheduling is an extension that enables VM scheduling based on free network bandwidth. The user requests the required bandwidth using attributes in flavor. Nova scheduler uses these attributes when scheduling VMs with that specific flavor. Nova keeps track of the reserved bandwidth on each network interface controller (NIC) on every host. Nova ensures that no NIC is overprovisioned. Both bit rate and packet rate capacity are taken into consideration.

3.4.1   API Operation

This extension extends the extra_specs attribute in the flavor. Two new attributes are added to extra_specs: bandwidth:vif_inbound_average and bandwidth:vif_outbound_average.

Note:  
The unit of measurement for the requested bandwidth is kBps (kilobytes per second, 10^3 bytes per second).

The values of the new attributes are in JSON format and contain the requested bandwidth and the average packet size. rate is the requested bandwidth. size is the average packet size, in bytes, of the VM frames on the interface. The size is used together with the byte rate to calculate the requested frames per second.
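
For example, a rate of 10000 (10^7 bytes per second) combined with an average packet size of 512 bytes corresponds to a requested packet rate of roughly 19500 frames per second.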

Format:

'{ "<device name of sriov hwnic>" : [<minimum-bandwidth-vf1>, <minimum-bandwidth-vf2>,...],
"<device name of neutron network hwnic>" : { "rate": [<rate-vnic1>, <rate-vnic2>, ...], "size": [<avg-packet-size-vnic1> , <avg-packet-size-vnic2>,...]}...}'
'{ "<device name of sriov hwnic>" : [<minimum-bandwidth-vf1>, ⇒
<minimum-bandwidth-vf2>,...],
"<device name of neutron network hwnic>" : { "rate": ⇒
[<rate-vnic1>, <rate-vnic2>, ...], "size": ⇒
[<avg-packet-size-vnic1> , <avg-packet-size-vnic2>,...]}...}'

In config.yaml, the SR-IOV section contains the physical_network parameter, which is used in the bandwidth based scheduling extension as an SR-IOV interface name. For physical_network configuration, refer to the "SR-IOV" section of the Configuration File Guide.

An example of the configuration:

Example 4   SR-IOV Configuration

  sriov_configs:
  - &DELL_620_sriov_info
    - pci_address: "0000:41:00.0"
      bandwidth: 10000000
      physical_network: "physnet0"
    - pci_address: "0000:41:00.1"
      bandwidth: 10000000
      physical_network: "physnet1"

Example 5   bandwidth:vif_*bound_average Attribute

nova flavor-key bw-sriov-flavor set \
  bandwidth:vif_inbound_average='{ "physnet0": [20000, 10000], "physnet1": [10000], "default": { "rate": [10000, 20000, 30000], "size": [512, 1024, 1024] }}' \
  bandwidth:vif_outbound_average='{ "physnet0": [20000, 10000], "physnet1": [10000], "default": { "rate": [10000, 20000, 30000], "size": [512, 1024, 1024] }}'

The pci_passthrough:alias is not necessary.

The vNICs specified in the flavor have to match the vNICs specified on nova boot. To use this feature, the user has to specify bandwidth for all NICs on the VM, including SR-IOV interfaces. For example, it is not allowed to specify bandwidth for one vNIC in the flavor and then boot the VM with two vNICs. The extension can be left unused by not specifying any "bandwidth:*" attributes in the flavor.
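
As an illustrative sketch of this matching rule, a flavor that declares bandwidth for exactly two neutron vNICs must be booted with exactly two networks (the flavor name and IDs are examples):

nova flavor-key bw-flavor set bandwidth:vif_inbound_average='{ "default": { "rate": [10000, 20000], "size": [512, 512] }}'
nova flavor-key bw-flavor set bandwidth:vif_outbound_average='{ "default": { "rate": [10000, 20000], "size": [512, 512] }}'
nova boot --flavor bw-flavor --image <image_id> --nic net-id=<net1_id> --nic net-id=<net2_id> bw-vm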

3.4.2   VMs with and without Bandwidth Requirements

This section provides information about handling VMs with and without bandwidth requirements in the same CEE region.

VMs with unspecified bandwidth cannot be scheduled on the same hosts as VMs with specified bandwidth. If VMs with unspecified bandwidth consume all bandwidth, fragmentation issues can occur. In CEE, the default ram_weight_multiplier is set to 1, meaning the scheduler spreads VMs across all available hosts. This can result in all hosts having VMs without specified bandwidth, which makes it impossible to schedule VMs with specified bandwidth. This worst-case scenario is shown in Figure 1.

Figure 1   Bandwidth Based Scheduling Fails

To avoid this issue, use host aggregates to divide the compute hosts into two groups, as shown in Figure 2: hosts for VMs with bandwidth requirements and hosts for VMs without bandwidth requirements. When a flavor is created, the host aggregate must be set accordingly.

Figure 2   Using Host Aggregates to Manage Bandwidth Requirements

For the use of host aggregates in CEE, see Section 5.1.

4   Limitations

Note:  
In addition to the limitations listed in this section, also refer to Section 4.1 for limitations specific to CEE in multi-server deployment and Section 4.2 for limitations specific to CEE in single server deployment.

The following limitations exist in CEE compute:

4.1   Limitations for Multi-Server Deployment

In addition to the limitations described in Section 4, the following limitations apply to CEE in multi-server deployment:

4.2   Limitations for Single Server Deployment

In addition to the limitations described in Section 4, the following limitations apply to CEE in single server deployments:

Single server installation requires special flavor metadata (extra specs) settings.

5   Additional Information

5.1   Host Aggregates in CEE

Follow the procedure below to use OpenStack host aggregates:

  1. To configure the scheduler to support host aggregates, add the AggregateInstanceExtraSpecsFilter filter to the scheduler_default_filters configuration option in /etc/nova/nova.conf.
  2. Create the host aggregate in the nova availability zone, add the required compute nodes, and set the aggregate metadata:

    openstack aggregate create --zone nova <host_aggregate_name>

    openstack aggregate add host <host_aggregate_name> <host_name>

    openstack aggregate set --property <key=value> <aggregate>

  3. Map the host aggregate metadata to the flavor:

    openstack flavor set --property aggregate_instance_extra_specs:<key=value> <flavor>

  4. Start an instance using the flavor:

    nova boot --flavor <flavor> (--image <image> | --volume <volume>) <server_name> 
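
A concrete end-to-end sketch of the procedure, with illustrative names (the aggregate, property, flavor, and host names are examples, not defaults):

openstack aggregate create --zone nova bw-aggregate
openstack aggregate add host bw-aggregate compute-0-1
openstack aggregate set --property bandwidth=true bw-aggregate
openstack flavor set --property aggregate_instance_extra_specs:bandwidth=true m1.medium
nova boot --flavor m1.medium --image <image_id> bw-vm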

5.1.1   Removal of Host from Host Aggregate

Note:  
Do not remove a host from an aggregate if there are any running instances on the host, as it causes a mismatch between the Nova database record and the nova show output.

To remove a host from the aggregate, use the following command:

openstack aggregate remove host <host_aggregate_name> <host_name>

5.2   Libvirt Real Time Instances

The CPU pinning feature adds the ability to assign guest virtual CPUs to dedicated host CPUs, providing guarantees for CPU time, and improving the worst-case latency for CPU scheduling. The addition of the real time scheduling policy further improves the worst-case scheduler latency for vCPUs.

Note:  
Enabling the real time feature compromises the overall throughput of the system. As such, it should only be enabled when the guest workload demands it.

To enable the real time feature for an instance:

  1. Set the cpu_realtime properties on a flavor:

    openstack flavor set --property hw:cpu_policy=dedicated <flavor_name>
    openstack flavor set --property hw:mem_page_size=1048576 <flavor_name>
    openstack flavor set --property hw:cpu_realtime=yes <flavor_name>
    openstack flavor set --property hw:cpu_realtime_mask=^1 <flavor_name>

    Note:  
    hw:cpu_realtime_mask=^1 indicates that vCPU1 remains non real time, while all other vCPUs have a real time policy. The vCPU(s) with non real time policy will run the emulator thread(s).

  2. Boot an instance using the flavor:
    nova boot --flavor <flavor_name> --image <image_id> --nic net-id=<network_id> <server_name>