Difference between revisions of "Db-string-type-cleanup"
Rick Harris (talk | contribs) (→Rationale) |
Rick Harris (talk | contribs) (→Design) |
||
Line 20: | Line 20: | ||
The proposal here is to extend the work we did for `IPAddress` and `CIDR` to other logical datatypes that are stored as underlying Strings. | The proposal here is to extend the work we did for `IPAddress` and `CIDR` to other logical datatypes that are stored as underlying Strings. | ||
+ | |||
+ | |||
+ | == Appendix == | ||
+ | |||
+ | === Prevalence of String Datatypes In Nova's Data Model === | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! Count !! Physical Datatype !! Logical Datatype | ||
+ | |- | ||
+ | | 2 || String(12) || PCI Address | ||
+ | |- | ||
+ | | 2 || String(5) || Protocol | ||
+ | |- | ||
+ | | 2 || String(8) || PCI Device Type | ||
+ | |- | ||
+ | | 4 || String(4) || PCI Vendor/Product IDs | ||
+ | |- | ||
+ | | 5 || String(39) || IPAddress | ||
+ | |- | ||
+ | | 7 || String(256) || Typos? | ||
+ | |- | ||
+ | | 8 || String(43) || CIDR | ||
+ | |- | ||
+ | | 56 || String(36) || UUID | ||
+ | |- | ||
+ | | 388 || String(255) || Generic String | ||
+ | |} |
Revision as of 16:27, 23 October 2013
Blueprint Name: db-string-type-cleanup
Contents
Summary
Rationale
We use string columns to store a variety of logical datatypes: UUIDs, IP addresses, generic strings, among others.
The current practice is for developers to use SQLAlchemy's `String` datatype directly and specify the desired length each time, e.g. String(255), or String(36).
This has a few problems:
- Typos - For example, https://review.openstack.org/#/c/39296 accidentally changed '255' to '25'. Since these are runtime and not parse-time errors, they are usually difficult to recover from and require further cleanup migrations.
- Unecessary Variance - For example, we have 'String(256)' in places where String(255) was preferable (since that's the de facto standard for generic strings)
- No Physical Storage/Logical Datatype Abstraction - Some databases support certain logical datatypes natively, like UUID. By declaring the columns as strings, instead of say 'UUID', we're tying that column down to a particular underlying storage type. (In fact for IP Addresses and CIDR addresses, we're already using this abstraction.)
Design
The proposal here is to extend the work we did for `IPAddress` and `CIDR` to other logical datatypes that are stored as underlying Strings.
Appendix
Prevalence of String Datatypes In Nova's Data Model
Count | Physical Datatype | Logical Datatype |
---|---|---|
2 | String(12) | PCI Address |
2 | String(5) | Protocol |
2 | String(8) | PCI Device Type |
4 | String(4) | PCI Vendor/Product IDs |
5 | String(39) | IPAddress |
7 | String(256) | Typos? |
8 | String(43) | CIDR |
56 | String(36) | UUID |
388 | String(255) | Generic String |