Launchpad Entry: NovaSpec: https://blueprints.launchpad.net/nova/+spec/nova-instance-referencing

Created: 29 March 2011

Updated: 14 April 2011

Drafter: Ed Leafe

Drafter's email: ed@leafe.com

Summary

With the addition of zones to OpenStack, addressing instances becomes more complicated. Instances are currently referenced by their primary key (PK), an auto-incremented integer, and this form of reference is part of the current Rackspace API. Because each zone has its own database, there is no guarantee that these PKs are unique across zones; in fact, duplicates are practically guaranteed to occur. We need to discuss alternatives, weighing how well they work and how they would affect both the current nova code and the APIs.
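To make the collision concrete, here is a minimal sketch. It assumes a hypothetical ZoneDatabase class standing in for each zone's private database; the class, zone names, and hostnames are illustrative only and are not part of nova.

```python
import itertools


class ZoneDatabase:
    """Hypothetical stand-in for one zone's private database."""

    def __init__(self, name):
        self.name = name
        # Every zone's auto-increment counter starts at 1, independently.
        self._next_id = itertools.count(1)
        self.instances = {}

    def create_instance(self, hostname):
        pk = next(self._next_id)
        self.instances[pk] = hostname
        return pk


east = ZoneDatabase("east")
west = ZoneDatabase("west")

id_a = east.create_instance("web-1")
id_b = west.create_instance("db-1")

# Two different instances in two different zones share the same PK.
collision = id_a == id_b
```

Both calls hand out the first value of their own counter, so an API request that carries only the integer ID cannot tell the two instances apart.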

Primary Solution Approaches

  1. Change to a globally-unique PK, with UUIDs as the most obvious candidate.
  2. Retain the current ID field, but change the API to use a compound key approach that involves adding the zone path to the instance, since the ID will be unique within its zone.
  3. Require an installation to manually partition the range of possible PK values so that the keys generated in different zones cannot overlap.
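The three approaches can be sketched side by side. This is only an illustration under stated assumptions: the function names, the "/" separator in the compound key, and the fixed block size for partitioning are all hypothetical choices, not anything defined by nova or the API.

```python
import uuid


def new_uuid_key():
    """Approach 1: a globally unique key, here a random (version 4) UUID."""
    return str(uuid.uuid4())


def compound_key(zone_path, local_id):
    """Approach 2: the zone path plus the zone-local integer ID.

    The "/" separator is an assumption made for this sketch.
    """
    return "%s/%d" % ("/".join(zone_path), local_id)


def partition_ranges(zone_names, block_size):
    """Approach 3: give each zone a disjoint block of integer PK values."""
    return {
        name: range(i * block_size + 1, (i + 1) * block_size + 1)
        for i, name in enumerate(zone_names)
    }


uuid_key = new_uuid_key()
compound = compound_key(["north", "rack1"], 42)
ranges = partition_ranges(["east", "west"], 1000)
```

Note that the compound key embeds the zone path in the identifier, and the partitioned ranges have to be assigned by hand and sized in advance; the UUID needs neither.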

Rationale

Assumptions about unique identifiers are no longer valid once you introduce zones and the "shared-nothing" approach to scalability. There are two separate considerations: database PK uniqueness and API addressability uniqueness. All three of the above approaches would satisfy the database considerations; what this discussion will focus on is which approach will yield the best solution for the API (both internal and external).

User stories

As a Nova customer, I need to be able to uniquely address my instances so that I can interact with them programmatically.

As a Nova developer, I need to be able to address an instance in a particular zone without having to constantly check every zone so that I can reduce bandwidth and latency.

As a Nova administrator, I need to allow my customers to access their instances without revealing internal structures so that I can reduce attack vectors based on such internal knowledge.

Assumptions

  1. Every zone has its own database. Data is not shared among zones.
  2. Zones, being logical structures, can be changed as needed, and should not change the URI for accessing instances.

Pros / Cons for each approach

UUIDs

Compound Key

Partition PK Ranges

Proposed Solution

Based on the above, I propose adopting the UUID approach. While it may require more code changes to implement, over the long term it provides the best combination of scalability and security. The difficulty of locating an instance in a complex zone configuration is its biggest long-term drawback, but it can be addressed with a caching solution similar to the current capabilities approach, in which zones "know" what they can handle in their branch of the zone tree.
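One way such a cache might work is sketched below. This is not the proposed implementation: the Zone class, its find method, and the lookups counter are hypothetical, and a real deployment would query child zones over the network rather than through in-process calls.

```python
import uuid


class Zone:
    """Hypothetical zone node that remembers which child leads to an instance."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children or [])
        self.local_instances = set()
        self._route_cache = {}  # instance UUID -> child zone that found it
        self.lookups = 0        # how many times this zone was asked

    def find(self, instance_uuid):
        """Return the name of the zone holding the instance, or None."""
        self.lookups += 1
        if instance_uuid in self.local_instances:
            return self.name
        cached_child = self._route_cache.get(instance_uuid)
        if cached_child is not None:
            found = cached_child.find(instance_uuid)
            if found is not None:
                return found
            # The instance moved or was deleted: drop the stale entry.
            del self._route_cache[instance_uuid]
        for child in self.children:
            found = child.find(instance_uuid)
            if found is not None:
                self._route_cache[instance_uuid] = child
                return found
        return None


leaf_a = Zone("leaf_a")
leaf_b = Zone("leaf_b")
root = Zone("root", [leaf_a, leaf_b])

instance_id = str(uuid.uuid4())
leaf_b.local_instances.add(instance_id)

first = root.find(instance_id)   # searches leaf_a, then leaf_b
second = root.find(instance_id)  # goes straight to leaf_b via the cache
```

The first lookup has to ask every child in turn; the second skips leaf_a entirely. Because the UUID itself says nothing about zone placement, the cache is purely an optimization and can be discarded or rebuilt when zones are reorganized.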

Of course, this solution is not set in stone, or else why have a summit discussion?

Discussion Agenda

Please add any additional pros/cons to the above section that you can think of. At the Summit I'd like to have a discussion on these different aspects of each, and if the issues are documented ahead of time, we can all be prepared to discuss them in more depth.


CategoryProposal

Wiki: NovaInstanceReferencing (last edited 2011-04-14 14:24:35 by EdLeafe)