AdaptiveVersioning

Versioning is a tricky problem for both oslo.rpc and oslo.notify. Whenever the schema changes, the version number has to be bumped. If we forget, we hope the omission gets caught in code review, but stale version numbers can still creep in. Perhaps we can solve this problem using adaptive versioning?

Note: I have no idea if "adaptive versioning" is a thing, but the expression made the most sense to me here.

With rpc and notifications we are essentially creating a dict payload and throwing it over the fence. Hopefully, the people on the other side know how to deal with it. This is where versioning comes in. "Oh, that's a version 7 Foo, I know how to deal with that."

But the secret is in the dict payload. The nested dict can be flattened to form what is essentially a runtime schema.

Consider this notification:

{
    u'_context_auth_token': u'3d8b13de1b7d499587dfc69b77dc09c2',
    u'_context_is_admin': True,
    u'_context_project_id': u'7c150a59fe714e6f9263774af9688f0e',
    u'_context_quota_class': None,
    u'_context_read_deleted': u'no',
    u'_context_remote_address': u'10.0.2.15',
    u'_context_request_id': u'req-d68b36e0-9233-467f-9afb-d81435d64d66',
    u'_context_roles': [u'admin'],
    u'_context_timestamp': u'2012-05-08T20:23:41.425105',
    u'_context_user_id': u'1e3ce043029547f1a61c1996d1a531a2',
    u'event_type': u'compute.instance.create.end',
    u'message_id': u'dae6f69c-00e0-41c0-b371-41ec3b7f4451',
    u'payload': {
        u'created_at': u'2012-05-08 20:23:41',
        u'deleted_at': u'',
        u'disk_gb': 0,
        u'display_name': u'testme',
        u'fixed_ips': [{u'address': u'10.0.0.2', u'floating_ips': [], u'meta': {}, u'type': u'fixed', u'version': 4}],
        u'image_ref_url': u'http://10.0.2.15:9292/images/UUID',
        u'instance_id': u'9f9d01b9-4a58-4271-9e27-398b21ab20d1',
        u'instance_type': u'm1.tiny',
        u'instance_type_id': 2,
        u'launched_at': u'2012-05-08 20:23:47.985999',
        u'memory_mb': 512,
        u'state': u'active',
        u'state_description': u'',
        u'tenant_id': u'7c150a59fe714e6f9263774af9688f0e',
        u'user_id': u'1e3ce043029547f1a61c1996d1a531a2',
        u'reservation_id': u'1e3ce043029547f1a61c1996d1a531a3',
        u'vcpus': 1,
        u'root_gb': 0,
        u'ephemeral_gb': 0,
        u'host': u'compute-host-name',
        u'availability_zone': u'1e3ce043029547f1a61c1996d1a531a4',
        u'os_type': u'linux?',
        u'architecture': u'x86',
        u'image_ref': u'UUID',
        u'kernel_id': u'1e3ce043029547f1a61c1996d1a531a5',
        u'ramdisk_id': u'1e3ce043029547f1a61c1996d1a531a6',
    },
    u'priority': u'INFO',
    u'publisher_id': u'compute.vagrant-precise',
    u'timestamp': u'2012-05-08 20:23:48.028195',
}

If we strip the values and flatten the keys we can get:

[
    u'_context_auth_token'[str]
    u'_context_is_admin'[bool]
    u'_context_project_id'[str]
    u'_context_quota_class'[NoneType]
    u'_context_read_deleted'[str]
    u'_context_remote_address'[str]
    u'_context_request_id'[str]
    u'_context_roles'[list]
    u'_context_timestamp'[str]
    u'_context_user_id'[str]
    u'event_type'[str]
    u'message_id'[str]
    u'payload.created_at'[str]
    u'payload.deleted_at'[str]
    u'payload.disk_gb'[int]
    u'payload.display_name'[str]
    u'payload.fixed_ips'[list]
    u'payload.fixed_ips.address'[str]
    u'payload.fixed_ips.floating_ips'[list]
    u'payload.fixed_ips.meta'[dict]
    u'payload.fixed_ips.type'[str]
    u'payload.fixed_ips.version'[int]
    u'payload.image_ref_url'[str]
    u'payload.instance_id'[str]
    u'payload.instance_type'[str]
    u'payload.instance_type_id'[int]
    u'payload.launched_at'[str]
    u'payload.memory_mb'[int]
    u'payload.state'[str]
    u'payload.state_description'[str]
    u'payload.tenant_id'[str]
    u'payload.user_id'[str]
    u'payload.reservation_id'[str]
    u'payload.vcpus'[int]
    u'payload.root_gb'[int]
    u'payload.ephemeral_gb'[int]
    u'payload.host'[str]
    u'payload.availability_zone'[str]
    u'payload.os_type'[str]
    u'payload.architecture'[str]
    u'payload.image_ref'[str]
    u'payload.kernel_id'[str]
    u'payload.ramdisk_id'[str]
    u'priority'[str]
    u'publisher_id'[str]
    u'timestamp'[str]
]

Note: Arrays within the nested dicts take a little extra care if they're not homogeneous, but we can work that out.
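
As a rough illustration of the flattening step, here is a minimal Python sketch. The function name flatten_schema and the 'dotted.key[type]' string format are assumptions made for this example, not anything oslo provides. Lists are handled by merging the keys of every element under the list's own path, so a non-homogeneous list simply yields the union of what its elements contain. Unlike the listing above, this sketch also records an entry for container dicts such as payload.

```python
def flatten_schema(message):
    """Return a sorted list of 'dotted.key[type]' strings for a nested dict.

    Hypothetical helper: values are dropped, only key paths and the
    Python type name of each value are kept.
    """
    entries = set()

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                child_path = '%s.%s' % (path, key) if path else key
                entries.add('%s[%s]' % (child_path, type(child).__name__))
                walk(child, child_path)
        elif isinstance(value, (list, tuple)):
            # List elements share the list's path; scalars add nothing,
            # dict elements contribute their keys (union across elements).
            for item in value:
                walk(item, path)

    walk(message, '')
    return sorted(entries)


if __name__ == '__main__':
    sample = {
        u'event_type': u'compute.instance.create.end',
        u'payload': {
            u'disk_gb': 0,
            u'fixed_ips': [{u'address': u'10.0.0.2', u'version': 4}],
        },
    }
    for entry in flatten_schema(sample):
        print(entry)
```

Sorting the entries matters: two messages with the same keys in a different order should produce the same flattened schema.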

After a service restart, we check each type of message or notification the first time we see it.



We would compare the flattened schema (or perhaps a hash of it) to the database entry recorded the last time we saw a message of this type. If the hashes match, we keep the same version number. If they differ, we bump the version number and store the new version number (along with the flattened schema and its hash) in the database.
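
The compare-and-bump step might look something like the sketch below. Everything here is an assumption for illustration: the names schema_hash and resolve_version, the choice of SHA-256, and the plain dict that stands in for the database table. It takes the flattened schema as a list of strings.

```python
import hashlib
import json


def schema_hash(flat_schema):
    """Hash a flattened schema (a list of strings) in a stable way."""
    canonical = json.dumps(sorted(flat_schema)).encode('utf-8')
    return hashlib.sha256(canonical).hexdigest()


def resolve_version(store, message_type, flat_schema):
    """Return the version for this schema, bumping it if the schema changed.

    `store` is a plain dict standing in for the database: it maps a
    message type to a list of {'version', 'hash', 'schema'} records.
    """
    digest = schema_hash(flat_schema)
    history = store.setdefault(message_type, [])
    if history and history[-1]['hash'] == digest:
        # Same schema as last time: keep the current version number.
        return history[-1]['version']
    # First sighting or changed schema: record a new version.
    version = history[-1]['version'] + 1 if history else 1
    history.append({'version': version,
                    'hash': digest,
                    'schema': sorted(flat_schema)})
    return version


if __name__ == '__main__':
    db = {}
    key = 'compute.instance.create.end'
    print(resolve_version(db, key, ['payload.vcpus[int]']))
    print(resolve_version(db, key, ['payload.vcpus[int]']))
    print(resolve_version(db, key, ['payload.vcpus[str]']))
```

Note that this only compares against the most recent record, so a schema that reverts to an older shape still gets a new version number.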

We only have to do this once per message type after each restart, the first time that type shows up.

For notifications we would key off the event_type. For rpc, the method name.
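
The once-per-restart check and the keying rule could be combined roughly as follows. This is a sketch under assumptions: the class name FirstSeenChecker is made up, and the rpc message is assumed to carry its method name under a 'method' field. The in-memory set is rebuilt on every restart, which is what triggers a fresh check.

```python
def message_key(message):
    """Pick the key a message is tracked under.

    Notifications key off event_type; anything else is treated as an
    rpc message and keyed off an assumed 'method' field.
    """
    if 'event_type' in message:
        return 'notification:' + message['event_type']
    return 'rpc:' + message['method']


class FirstSeenChecker(object):
    """Run a check only the first time each message key is seen."""

    def __init__(self, check):
        self._check = check   # callable(key, message), e.g. the schema compare
        self._seen = set()    # lives in memory, so it is empty after a restart

    def observe(self, message):
        key = message_key(message)
        if key in self._seen:
            return False      # already checked since the last restart
        self._seen.add(key)
        self._check(key, message)
        return True


if __name__ == '__main__':
    checked = []
    checker = FirstSeenChecker(lambda key, msg: checked.append(key))
    checker.observe({'event_type': 'compute.instance.create.end'})
    checker.observe({'event_type': 'compute.instance.create.end'})
    checker.observe({'method': 'run_instance', 'args': {}})
    print(checked)
```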

This also means we have a database history of changes in the schema over time.

If we want to get fancy, we can store just the diffs from one version to the next, which would make changes easier to find.
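
A diff between two flattened schemas can be as simple as a pair of set differences. The sketch below assumes each schema is a list of 'dotted.key[type]' strings; the function name schema_diff is hypothetical.

```python
def schema_diff(old_schema, new_schema):
    """Return the entries added and removed between two flattened schemas."""
    old, new = set(old_schema), set(new_schema)
    return {'added': sorted(new - old), 'removed': sorted(old - new)}


if __name__ == '__main__':
    v1 = ['payload.root_gb[int]', 'payload.vcpus[int]']
    v2 = ['payload.root_gb[str]', 'payload.vcpus[int]', 'payload.host[str]']
    print(schema_diff(v1, v2))
```

With this representation a type change shows up as one removed entry and one added entry for the same key path.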