Large Scale SIG/ScaleOut

The fourth stage in the Scaling Journey is Scale Out.

No matter how far you manage to scale up a single cluster, at some point you will have to scale out to multiple clusters, zones, regions or cells. It can be difficult to navigate the various choices and pick the best architecture. This page aims to help answer those questions.

Once you are past that stage, you are ready to proceed to the final stage of the Scaling Journey: Upgrade and Maintain.

FAQ

When should I scale out my OpenStack Infrastructure?

There is no special formula that tells you when to scale out your OpenStack infrastructure. Like almost everything else, it depends on your requirements and expectations. There are, however, some factors that can help you decide. As you scale up your infrastructure, you increase your exposure if something goes wrong with the control plane. To mitigate that risk, complex setups and procedures are usually designed for the message broker and databases. However, in some cases these complex solutions are themselves a headache to maintain and keep available. If you are reaching this point, you should probably consider scaling out your deployment. There are also other motivations to scale out, such as exposing users to the different deployment partitions, managing particular resources or workloads, or keeping each deployment simple.

What are the different options to scale out an OpenStack Infrastructure?

There are two main options to scale out your OpenStack infrastructure: Regions and Cells.

Regions: A region is a separate OpenStack deployment that typically shares the same Keystone and Horizon components with the other regions. Regions are useful when the deployer wants to explicitly expose this deployment partitioning to users, usually to help them achieve greater fault tolerance for their applications. Because regions are independent OpenStack deployments, user applications that are distributed between them should not be affected if one region suffers an outage.
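As a rough illustration of what "exposed to users" means in practice, the following Python sketch models a shared service catalog in which Keystone is common to all regions while compute and network have one endpoint per region. It is a toy model, not Keystone code: the region names, the example.org URLs and the helper functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    service: str  # service type, e.g. "compute"
    region: str   # region name (hypothetical)
    url: str      # placeholder URL, not a real deployment


# Toy catalog: identity is shared, so both regions point at the same URL;
# compute and network have a separate endpoint in each region.
CATALOG = [
    Endpoint("identity", "RegionOne", "https://keystone.example.org/v3"),
    Endpoint("identity", "RegionTwo", "https://keystone.example.org/v3"),
    Endpoint("compute", "RegionOne", "https://nova.region-one.example.org/v2.1"),
    Endpoint("compute", "RegionTwo", "https://nova.region-two.example.org/v2.1"),
    Endpoint("network", "RegionOne", "https://neutron.region-one.example.org"),
    Endpoint("network", "RegionTwo", "https://neutron.region-two.example.org"),
]


def find_endpoint(catalog, service, region):
    """Return the URL of `service` in `region`, or raise LookupError."""
    for ep in catalog:
        if ep.service == service and ep.region == region:
            return ep.url
    raise LookupError(f"no {service!r} endpoint in region {region!r}")


def surviving_regions(catalog, service, failed_region):
    """List the regions that still offer `service` if `failed_region` is down."""
    return sorted(
        {ep.region for ep in catalog
         if ep.service == service and ep.region != failed_region}
    )


if __name__ == "__main__":
    print(find_endpoint(CATALOG, "compute", "RegionTwo"))
    print(surviving_regions(CATALOG, "compute", "RegionOne"))
```

The user picks the region explicitly, which is what lets an application spread itself over several regions. Note that the sketch does not model the shared components themselves: anything shared between regions remains a common dependency.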

Cells: Cells are a Nova concept that allows you to partition your Nova deployment. This lets you shard the load between different message brokers, databases and nova-conductors. As a consequence, you can massively increase the number of compute nodes in the infrastructure without needing central, complex message broker and database setups. Cells are not exposed to users, but they can be leveraged to define different Availability Zones, to isolate particular workloads or to help with resource management.
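The sharding idea can be sketched with a small Python model in which every cell has its own message broker and database, and each compute host belongs to exactly one cell. This is only an illustration under stated assumptions: the hostnames and URLs are placeholders, and the round-robin assignment is used purely to make the arithmetic visible. In a real deployment, a host belongs to the cell whose broker and database its nova-compute service is configured to use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    transport_url: str  # message broker used by this cell only (placeholder)
    database_url: str   # database used by this cell only (placeholder)


def make_cells(count):
    """Build `count` hypothetical cells, each with its own broker and database."""
    return [
        Cell(
            name=f"cell{i}",
            transport_url=f"rabbit://mq-cell{i}.example.org:5672/",
            database_url=f"mysql+pymysql://db-cell{i}.example.org/nova_cell{i}",
        )
        for i in range(1, count + 1)
    ]


def assign_hosts(hosts, cells):
    """Spread hosts over cells round-robin; returns {host: cell_name}."""
    return {host: cells[i % len(cells)].name for i, host in enumerate(hosts)}


def hosts_per_cell(mapping):
    """Count how many hosts each cell's broker and database have to serve."""
    return Counter(mapping.values())


def hosts_affected(mapping, failed_cell):
    """Hosts whose control plane is lost if `failed_cell` goes down."""
    return sorted(h for h, c in mapping.items() if c == failed_cell)


if __name__ == "__main__":
    hosts = [f"compute-{n:03d}" for n in range(12)]
    mapping = assign_hosts(hosts, make_cells(3))
    print(hosts_per_cell(mapping))
    print(hosts_affected(mapping, "cell2"))
```

With twelve hosts in three cells, each broker and database serves four hosts, and a control plane problem in one cell touches only those four rather than all twelve.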

What are the advantages of Regions?

Some advantages of regions:
- Regions are independent OpenStack deployments
- Regions are exposed to users
- Regions help to scale services that don't support sharding (for example, Neutron)
- Typically Keystone and Horizon are shared between regions, but other services can also be shared (Glance, Magnum, …)

What are the advantages of Nova Cells?

Some advantages of deploying cells:
- Single endpoint
- Scale transparently between different Data Centres
- Availability and Resilience
- Isolate failure domains
- Dedicate resources (Cells) to projects
- Have a hardware type per Cell
- Easy to have different compute node configurations per Cell

Are Availability Zones an option to scale out my OpenStack deployment?

It depends on how Availability Zones are deployed. By default, Availability Zones are host aggregates, and they are used to expose different groups of resources to users.

However, Availability Zones can also be used to expose the resources of the different Cells to users. With this strategy, you increase the availability and isolation of your Availability Zones, because each Cell has its own control plane.
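One way to reason about this strategy is to check that every Availability Zone is backed by a single Cell, so that a control plane problem in one Cell cannot reach into another zone. The Python sketch below does that check on a hand-written inventory. The host, cell and zone names are hypothetical, and the inventory format is an assumption made for the example, not something Nova provides in this form.

```python
from collections import defaultdict

# Hypothetical inventory: (host, cell, availability zone).
HOSTS = [
    ("compute-001", "cell1", "az-a"),
    ("compute-002", "cell1", "az-a"),
    ("compute-003", "cell2", "az-b"),
    ("compute-004", "cell2", "az-b"),
]


def cells_per_zone(hosts):
    """Map each availability zone to the set of cells backing it."""
    zones = defaultdict(set)
    for _host, cell, zone in hosts:
        zones[zone].add(cell)
    return dict(zones)


def zones_spanning_cells(hosts):
    """Zones backed by more than one cell, i.e. not aligned with a single cell."""
    return sorted(
        zone for zone, cells in cells_per_zone(hosts).items() if len(cells) > 1
    )


if __name__ == "__main__":
    print(cells_per_zone(HOSTS))
    print(zones_spanning_cells(HOSTS))
```

An empty result from `zones_spanning_cells` means each zone maps onto one cell in this inventory. Adding a host from `cell2` to `az-a` would make `az-a` show up in the result.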

Should I deploy Cells or Regions?

Again, it depends on your use case. For example, some large infrastructures actually deploy both.

Cells are useful if you want to shard your deployment, allowing you to manage thousands of compute nodes without complex message broker or database setups. As a consequence, they improve the resilience of the infrastructure in case of control plane issues. They can also be used to isolate resources and workloads, and to provide a logical organisation of all available resources. Cells are not exposed to users, though, which can be an advantage or a disadvantage depending on your use case. Also, Cells are a Nova concept and don't help in scaling other services (for example, Neutron).

Regions, because they are completely independent deployments, allow you to scale all OpenStack components. However, you have the overhead of setting up and managing several deployments. Regions are also exposed to users, helping them to achieve better availability for their applications.

Who is running Cells and Regions?

As an example, the CERN deployment uses both Cells and Regions.