Difference between revisions of "Db-string-type-cleanup"
Rick Harris (talk | contribs) (Adding table to db-string-type-cleanup) |
Rick Harris (talk | contribs) (→Rationale) |
||
Line 6: | Line 6: | ||
We use String columns as the physical storage for many kinds of logical datatypes, whether it's a UUID as String(36), IPAddress as String(39), or generic strings which are usually String(255). | We use String columns as the physical storage for many kinds of logical datatypes, whether it's a UUID as String(36), IPAddress as String(39), or generic strings which are usually String(255). | ||
− | |||
− | |||
− | |||
Line 33: | Line 30: | ||
| 388 || String(255) || Generic String | | 388 || String(255) || Generic String | ||
|} | |} | ||
+ | |||
+ | Right now, the standard is for developers to use SQLAlchemy's String type directly, where they pass in the desired length. | ||
There are a few problems with this approach: | There are a few problems with this approach: | ||
− | 1. | + | 1. UNSAFE: Each time a developer types '255', there is an opportunity to introduce a bug. Two examples of this are the 7 256-length columns we have and https://review.openstack.org/#/c/39296/ where '255' was accidentally changed to '25' without reviewers noticing. |
+ | |||
+ | What makes this issue particularly nefarious, is that these kinds of column-length bugs usually do not surface until after a deployment when we start using real data, with all of its natural variance. This makes recovering from these bugs much harder than normal, since additional, carefully-crafted migrations need to be written to fix the existing bad data. As such, it would be nice to turn these runtime bugs into things the interpreter can catch at parse-time. | ||
== Design == | == Design == |
Revision as of 00:28, 23 October 2013
Blueprint Name: db-string-type-cleanup
Summary
Rationale
We use String columns as the physical storage for many kinds of logical datatypes, whether it's a UUID as String(36), IPAddress as String(39), or generic strings which are usually String(255).
Count | Type | Notes |
---|---|---|
2 | String(12) | PCI Address |
2 | String(5) | Protocol |
2 | String(8) | PCI Device Type |
4 | String(4) | PCI Vendor/Product IDs |
5 | String(39) | IPAddress |
7 | String(256) | Typos? |
8 | String(43) | CIDR |
56 | String(36) | UUID |
388 | String(255) | Generic String |
Right now, the standard is for developers to use SQLAlchemy's String type directly, where they pass in the desired length.
There are a few problems with this approach:
1. UNSAFE: Each time a developer types '255', there is an opportunity to introduce a bug. Two examples of this are the 7 256-length columns we have and https://review.openstack.org/#/c/39296/ where '255' was accidentally changed to '25' without reviewers noticing.
What makes this issue particularly nefarious, is that these kinds of column-length bugs usually do not surface until after a deployment when we start using real data, with all of its natural variance. This makes recovering from these bugs much harder than normal, since additional, carefully-crafted migrations need to be written to fix the existing bad data. As such, it would be nice to turn these runtime bugs into things the interpreter can catch at parse-time.