Difference between revisions of "NoDowntimeDBMigrations"

Revision as of 16:27, 23 August 2013

Problem Description

Database migrations in OpenStack currently require that services be stopped before the migrations are run, because the code assumes a fixed schema.

In the past, some migrations have been shown to take a significant amount of time to run on large installations, causing unacceptable amounts of downtime.

There are generally two causes of downtime during the migrations:

  1. Database engine limitations during schema changes
  2. Data needs to be migrated from one format to another


As OpenStack installations continue to grow, these long downtimes will only get worse. If an OpenStack provider guarantees 99.99% uptime, that leaves only about 4.3 minutes per 30-day month for all downtime (planned or unplanned). Some database migrations in the past have taken hours to run on large installations.
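The downtime budget quoted above follows from simple arithmetic. The short sketch below reproduces it; the function name `downtime_budget_minutes` and the 30-day month are assumptions made for illustration only.

```python
def downtime_budget_minutes(uptime_percent, days=30):
    """Return the minutes of downtime allowed in a period of `days` days."""
    total_minutes = days * 24 * 60            # 43200 minutes in a 30-day month
    allowed_fraction = 1 - uptime_percent / 100
    return total_minutes * allowed_fraction


# A 99.99% guarantee leaves roughly 4.3 minutes per 30-day month.
print(round(downtime_budget_minutes(99.99), 2))
```

A single migration that takes hours therefore uses up many months of budget in one maintenance window.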

Proposed Solution

Embrace the Expand/Contract pattern (http://exortech.com/blog/2009/02/01/weekly-release-blog-11-zero-downtime-database-deployment/) for database migrations. This pattern splits database migrations into three parts:

  1. Expand schema (adding new columns/tables/indexes)
  2. Migrate data
  3. Contract schema (removing unused columns/tables/indexes)

Code would migrate data on load while the service is running. Optionally, a background task can migrate the remaining data at a configurable rate. Once all data that needs migrating has been migrated, the contraction can be run to remove the unused columns, tables, and indexes.

This decouples the database schema from the code, so the two can be updated independently and the service can keep running transparently while the data is migrated.
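The three phases described above can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions, not OpenStack's actual migration code: the `instances` table, the old text column `memory` (values such as "512MB"), the new integer column `memory_mb`, and all function names are hypothetical.

```python
import sqlite3


def expand(conn):
    # Expand: add the new column next to the old one. Code that only
    # knows about the old column keeps working unchanged.
    conn.execute("ALTER TABLE instances ADD COLUMN memory_mb INTEGER")


def load_instance(conn, instance_id):
    # Migrate on load: the first time a row is read after the expand,
    # convert it to the new format and persist the result.
    old, new = conn.execute(
        "SELECT memory, memory_mb FROM instances WHERE id = ?",
        (instance_id,),
    ).fetchone()
    if new is None:
        new = int(old[:-2])  # assumes the hypothetical "<n>MB" text format
        with conn:  # commits the update on success
            conn.execute(
                "UPDATE instances SET memory_mb = ? WHERE id = ?",
                (new, instance_id),
            )
    return new


def background_migrate(conn, batch_size):
    # Optional background task: migrate up to batch_size unmigrated rows.
    # The caller decides how often to call this, which sets the speed.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM instances WHERE memory_mb IS NULL LIMIT ?",
        (batch_size,),
    )]
    for instance_id in ids:
        load_instance(conn, instance_id)
    return len(ids)


def contract(conn):
    # Contract: only safe once no row still depends on the old column.
    remaining = conn.execute(
        "SELECT COUNT(*) FROM instances WHERE memory_mb IS NULL"
    ).fetchone()[0]
    if remaining:
        raise RuntimeError("%d rows are not migrated yet" % remaining)
    # Rebuild the table without the old column. This avoids relying on
    # DROP COLUMN, which older SQLite versions do not support.
    conn.executescript("""
        CREATE TABLE instances_new (
            id INTEGER PRIMARY KEY,
            memory_mb INTEGER NOT NULL
        );
        INSERT INTO instances_new SELECT id, memory_mb FROM instances;
        DROP TABLE instances;
        ALTER TABLE instances_new RENAME TO instances;
    """)
```

A possible sequence on a running system would be: call `expand` once, let `load_instance` convert rows as they are read, call `background_migrate` periodically until it returns 0, and then call `contract`. In a real deployment the contract step would also have to wait until no running code still reads the old column.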