Jump to: navigation, search

Difference between revisions of "Db-string-type-cleanup"

(Design)
(Rationale)
Line 4: Line 4:
  
 
== Rationale ==
 
== Rationale ==
 
  
 
We use string columns to store a variety of logical datatypes: UUIDs, IP addresses, generic strings, among others.
 
We use string columns to store a variety of logical datatypes: UUIDs, IP addresses, generic strings, among others.
Line 17: Line 16:
  
 
* ''No Physical Storage/Logical Datatype Abstraction'' - Some databases support certain logical datatypes natively, like UUID.  By declaring the columns as strings, instead of say 'UUID', we're tying that column down to a particular underlying storage type. (In fact for IP Addresses and CIDR addresses, we're already using this abstraction.)
 
* ''No Physical Storage/Logical Datatype Abstraction'' - Some databases support certain logical datatypes natively, like UUID.  By declaring the columns as strings, instead of say 'UUID', we're tying that column down to a particular underlying storage type. (In fact for IP Addresses and CIDR addresses, we're already using this abstraction.)
 
{| class="wikitable"
 
|-
 
! Count !! Type !! Notes
 
|-
 
| 2    || String(12)  || PCI Address
 
|-
 
| 2    || String(5)    || Protocol
 
|-
 
| 2    || String(8)    || PCI Device Type
 
|-
 
| 4    || String(4)    || PCI Vendor/Product IDs
 
|-
 
| 5    || String(39)  || IPAddress
 
|-
 
| 7    || String(256) || Typos?
 
|-
 
| 8    || String(43)  || CIDR
 
|-
 
| 56  || String(36)  || UUID
 
|-
 
| 388 || String(255) || Generic String
 
|}
 
  
 
== Design ==
 
== Design ==
  
 
The proposal here is to extend the work we did for `IPAddress` and `CIDR` to other logical datatypes that are stored as underlying Strings.
 
The proposal here is to extend the work we did for `IPAddress` and `CIDR` to other logical datatypes that are stored as underlying Strings.

Revision as of 16:25, 23 October 2013

Blueprint Name: db-string-type-cleanup

Summary

Rationale

We use string columns to store a variety of logical datatypes: UUIDs, IP addresses, generic strings, among others.

The current practice is for developers to use SQLAlchemy's `String` datatype directly and specify the desired length each time, e.g. String(255), or String(36).

This has a few problems:

  • Typos - For example, https://review.openstack.org/#/c/39296 accidentally changed '255' to '25'. Since these are runtime and not parse-time errors, they are usually difficult to recover from and require further cleanup migrations.
  • Unecessary Variance - For example, we have 'String(256)' in places where String(255) was preferable (since that's the de facto standard for generic strings)
  • No Physical Storage/Logical Datatype Abstraction - Some databases support certain logical datatypes natively, like UUID. By declaring the columns as strings, instead of say 'UUID', we're tying that column down to a particular underlying storage type. (In fact for IP Addresses and CIDR addresses, we're already using this abstraction.)

Design

The proposal here is to extend the work we did for `IPAddress` and `CIDR` to other logical datatypes that are stored as underlying Strings.