NovaInstanceReferencing

Launchpad Entry: NovaSpec: https://blueprints.launchpad.net/nova/+spec/nova-instance-referencing

Created: 29 March 2011

Updated: 14 April 2011

Drafter: Ed Leafe

Drafter's email: ed@leafe.com

Summary
With the addition of zones to OpenStack, addressing instances becomes more complicated. Instances are currently referenced by their primary key (PK), which is an auto-incremented integer, and this ID is part of the current Rackspace API. Because each zone has its own database, there is no guarantee of uniqueness among these PKs across zones; in fact, duplicates are practically guaranteed to occur. We need to discuss alternatives, weighing how well they work and how they would affect both the current Nova code and the APIs.
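
To make the collision concrete, here is a minimal sketch that uses two independent in-memory SQLite databases as stand-ins for two zones. The table layout and the helper names (`make_zone_db`, `create_instance`) are assumptions for illustration only and are not taken from the Nova schema.

```python
import sqlite3


def make_zone_db():
    """Create an independent in-memory database standing in for one zone."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
    )
    return conn


def create_instance(conn, name):
    """Insert an instance row and return its auto-incremented primary key."""
    cur = conn.execute("INSERT INTO instances (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


# Two zones that share nothing, each with its own database.
zone_a = make_zone_db()
zone_b = make_zone_db()

id_a = create_instance(zone_a, "web-1")
id_b = create_instance(zone_b, "db-1")

# Both zones hand out the same first key, so the bare ID is ambiguous.
print(id_a, id_b)  # prints: 1 1
```

The first instance created in each zone gets the same ID, which is why the bare integer can no longer identify an instance across a deployment.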

Primary Solution Approaches

 * 1) Change to a globally-unique PK, with UUIDs as the most obvious candidate.
 * 2) Retain the current ID field, but change the API to use a compound key approach that involves adding the zone path to the instance, since the ID will be unique within its zone.
 * 3) Require an installation to manually partition the range of possible PK values so that the keys generated in different zones cannot overlap.

Rationale
Assumptions about unique identifiers are no longer valid once you introduce zones and the "shared-nothing" approach to scalability. There are two separate considerations: database PK uniqueness and API addressability uniqueness. All three of the above approaches would satisfy the database considerations; what this discussion will focus on is which approach will yield the best solution for the API (both internal and external).

User stories
As a Nova customer, I need to be able to uniquely address my instances so that I can interact with them programmatically.

As a Nova developer, I need to be able to address an instance in a particular zone without having to constantly check every zone so that I can reduce bandwidth and latency.

As a Nova administrator, I need to allow my customers to access their instances without revealing internal structures so that I can reduce attack vectors based on such internal knowledge.

Assumptions

 * 1) Every zone has its own database. Data is not shared among zones.
 * 2) Zones, being logical structures, can be changed as needed, and should not change the URI for accessing instances.

UUIDs

 * Pros
   * Easily generated
   * Used extensively; proven technology
 * Cons
   * Difficult for humans to work with
   * Requires extensive changes to the Nova codebase, which assumes IDs are integers. Yes, I know that UUIDs are actually 128-bit integers, but they are most commonly used in their string representation.
   * Does not provide information about locating the host for the instance. To operate on an instance, the zone hierarchy will have to be searched to find the one in question.
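
As a rough illustration of the integer-versus-string point, the sketch below uses Python's standard `uuid` module. Random (version 4) UUIDs are an assumption here; the spec does not say which UUID version would be used.

```python
import uuid

# Generate a random (version 4) UUID; no coordination between zones is needed.
instance_id = uuid.uuid4()

# The common string form: 32 hex digits plus 4 hyphens.
as_string = str(instance_id)

# The same value as a 128-bit integer.
as_int = instance_id.int

# Both representations round-trip to the same UUID.
from_string = uuid.UUID(as_string)
from_int = uuid.UUID(int=as_int)

# The integer form does not fit in a signed 64-bit column, so code and
# schemas that assume ordinary integer IDs would still need to change.
fits_in_bigint = as_int <= 2**63 - 1

print(len(as_string), fits_in_bigint)  # prints: 36 False
```

Even when treated as an integer, a version 4 UUID always exceeds the signed 64-bit range because its version bits sit above bit 63, so keeping IDs numeric does not avoid the code changes.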

Compound Key

 * Pros
   * Identifies an instance's location in a nested zone configuration
   * Requires no local changes to the database
 * Cons
   * Greatly limits flexibility - changing the logical zone layout will break all current URIs
   * Requires substantial change to Nova APIs, which do not currently have the concept of zone location.
   * Including information about internal structures (i.e., the zone layout) in public URIs is considered a security weakness.
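
The following sketch shows one way a compound key could be encoded, and why a zone re-layout breaks existing URIs. The path format (`zones/<zone>/.../servers/<id>`) and the names `CompoundRef`, `format_ref`, and `parse_ref` are hypothetical; no such format is defined by this spec or by the current API.

```python
from typing import NamedTuple, Tuple


class CompoundRef(NamedTuple):
    """Hypothetical compound key: the zone path plus the zone-local integer ID."""
    zone_path: Tuple[str, ...]
    local_id: int


def format_ref(ref):
    """Render the compound key as a path such as zones/east/rack1/servers/42."""
    return "/".join(("zones",) + ref.zone_path + ("servers", str(ref.local_id)))


def parse_ref(path):
    """Parse a path produced by format_ref back into a CompoundRef."""
    parts = path.strip("/").split("/")
    if len(parts) < 4 or parts[0] != "zones" or parts[-2] != "servers":
        raise ValueError("not a compound instance reference: %r" % path)
    return CompoundRef(tuple(parts[1:-2]), int(parts[-1]))


# The same instance before and after an operator inserts a new zone level.
before = format_ref(CompoundRef(("east", "rack1"), 42))
after = format_ref(CompoundRef(("east", "pod3", "rack1"), 42))

print(before)  # prints: zones/east/rack1/servers/42
print(after)   # prints: zones/east/pod3/rack1/servers/42
```

The local ID is unchanged, but the public reference differs after the re-layout, and both forms expose the internal zone structure to the caller.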

Partition PK Ranges

 * Pros
   * Requires no changes to the current APIs
   * Requires no changes to the codebase
   * Requires no changes to the database
   * Would aid in addressing a given instance, since its ID would be tied to a particular zone.
 * Cons
   * Manually intensive - each zone would have to be configured individually
   * Would require key generation code that would eventually have to re-use old keys once the range limit was reached
   * Would require knowing in advance the possible number of instances a given zone would ever have to handle.
   * Ranges would reveal some information about a node's physical location.
   * Incredibly hackish and ugly, with scaling issues to boot as a deployment grows.
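
A small sketch of range partitioning, under the assumption that each zone is configured with an inclusive block of integer keys. The class `ZoneKeyRange`, the function `zone_for_key`, and the tiny range sizes are hypothetical and chosen only to show the exhaustion problem quickly.

```python
class ZoneKeyRange:
    """Hands out integer keys from a fixed, manually assigned inclusive range."""

    def __init__(self, start, end):
        if start > end:
            raise ValueError("start must not exceed end")
        self.start = start
        self.end = end
        self._next = start

    def allocate(self):
        """Return the next unused key, or fail once the range is used up."""
        if self._next > self.end:
            raise RuntimeError("key range exhausted")
        key = self._next
        self._next += 1
        return key

    def owns(self, key):
        """Report whether a key falls inside this zone's range."""
        return self.start <= key <= self.end


def zone_for_key(ranges, key):
    """Map a key back to the zone whose range contains it, if any."""
    for name, key_range in ranges.items():
        if key_range.owns(key):
            return name
    return None


# Each zone must be configured by hand with a non-overlapping range.
ranges = {"east": ZoneKeyRange(1, 3), "west": ZoneKeyRange(4, 6)}

east_keys = [ranges["east"].allocate() for _ in range(3)]
west_key = ranges["west"].allocate()

print(east_keys, west_key)       # prints: [1, 2, 3] 4
print(zone_for_key(ranges, 4))   # prints: west

# A fourth instance in "east" has nowhere to go without re-using old keys.
try:
    ranges["east"].allocate()
    exhausted = False
except RuntimeError:
    exhausted = True
```

The lookup is cheap, which is the approach's one real advantage, but the same mapping lets anyone who learns the ranges infer which zone an instance lives in.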

Proposed Solution
Based on the above, I propose adopting the UUID approach. While it may require more code changes to implement, over the long term it provides the best combination of scalability and security. The difficulty of locating an instance in a complex zone configuration is its biggest long-term drawback, but it can be addressed with a caching solution similar to the current capabilities approach, in which zones "know" what they can handle in their branch of the zone tree.
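
One possible shape for such a cache is sketched below: a parent zone remembers which child answered for a given UUID and only searches all children on a miss or a stale entry. The classes `ChildZone` and `ParentZone` and their methods are hypothetical stand-ins, not existing Nova code, and the in-memory sets replace what would really be calls to child zones.

```python
import uuid


class ChildZone:
    """Stand-in for a child zone; counts how often it is queried."""

    def __init__(self, name, instance_ids):
        self.name = name
        self._ids = set(instance_ids)
        self.queries = 0

    def has_instance(self, instance_id):
        self.queries += 1
        return instance_id in self._ids


class ParentZone:
    """Locates instances in child zones, caching where each one was found."""

    def __init__(self, children):
        self.children = list(children)
        self._location_cache = {}

    def locate(self, instance_id):
        """Return the name of the child zone holding the instance, or None."""
        cached = self._location_cache.get(instance_id)
        if cached is not None and cached.has_instance(instance_id):
            return cached.name
        # Cache miss or stale entry: drop it and search every child.
        self._location_cache.pop(instance_id, None)
        for child in self.children:
            if child.has_instance(instance_id):
                self._location_cache[instance_id] = child
                return child.name
        return None


id_east, id_west = uuid.uuid4(), uuid.uuid4()
east = ChildZone("east", [id_east])
west = ChildZone("west", [id_west])
parent = ParentZone([east, west])

first = parent.locate(id_west)   # searches east, then west
second = parent.locate(id_west)  # asks only west, via the cache

print(first, second, east.queries, west.queries)  # prints: west west 1 2
```

The second lookup skips the zone that does not hold the instance, which is the bandwidth and latency saving the developer user story asks for. Cache invalidation when instances move or zones are rearranged is left open here.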

Of course, this solution is not set in stone, or else why have a summit discussion?

Discussion Agenda
Please add any additional pros/cons that you can think of to the sections above. At the Summit I'd like to discuss these aspects of each approach, and if the issues are documented ahead of time, we can all be prepared to discuss them in more depth.