ObjectProposal

tl;dr

This is a proposal for rich objects that behave the same whether they can access the database directly or are required to go over RPC. The goals are to bundle data with the methods that operate on it (specifically serialization and the actual database implementation) and to insulate code from the actual database schema for easier rolling upgrades. Create and query operations should be class methods that replace the functional interface(s) in db/api.py. Changes to attributes are tracked so that a save() implementation can perform updates.

Justification

Right now in Nova, everything is converted to a "primitive" before being passed over RPC. We have a large (and nasty) serialization function that aims to be able to serialize any of our objects, and is a constant source of frustration. Passing around primitives means that we use non-OO interfaces to make changes to those items.

The primitives we use are direct representations of the SQLAlchemy objects that we pull out of the database API. This is far from the ideal format, and currently the format of the primitives changes whenever the underlying database schema changes (which is a problem for live upgrade).

Implementation

Object Class Registry

In order to avoid any client being able to randomly instantiate any object on a remote endpoint, a registry will be kept of the objects we're willing to instantiate. Each object will contain a major/minor version number and any attempt to instantiate an object will be handled in roughly the same way as RPC calls right now. The object registry will be automatically maintained through some nifty metaclass stuff.

Note that this is not an instance registry. It serves only to host the classes that can be instantiated. It does not attempt to track instances that are in-flight or in use by other services.
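
As a rough illustration of the registry idea, the sketch below uses a metaclass to record each versioned class as it is defined. The names ObjectRegistry and class_for are hypothetical, and the compatibility rule (same major version, local minor version at least the requested one) is an assumption about what "roughly the same way as RPC calls" would mean, not something this proposal fixes.

```python
class ObjectRegistry(type):
    """Metaclass that records every versioned object class by name."""

    # Maps class name -> list of registered classes (hypothetical layout).
    classes = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Only classes that declare their own version are registered,
        # which keeps the abstract base out of the registry.
        if 'version' in namespace:
            ObjectRegistry.classes.setdefault(name, []).append(cls)


class BaseObject(metaclass=ObjectRegistry):
    pass


def class_for(name, version):
    """Return a registered class compatible with the requested version."""
    want_major, want_minor = (int(part) for part in version.split('.'))
    for cls in ObjectRegistry.classes.get(name, []):
        major, minor = (int(part) for part in cls.version.split('.'))
        # Assumed rule: majors must match, our minor must be new enough.
        if major == want_major and minor >= want_minor:
            return cls
    raise LookupError('unsupported object %s version %s' % (name, version))


class Migration(BaseObject):
    version = '1.2'
```

Only classes are stored here; nothing in the sketch keeps a reference to any instance.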

Strictish Object Properties

By semi-strictly defining object attributes in terms of what sort of data they should contain, we can help to quickly identify problems at the calling/constructing site. In Nova, we have had many bugs where something is passed as a string instead of an integer from the API layer all the way to the database, where MySQL interprets it with its usual laziness without issue. Switching to PostgreSQL as a backend means the same code path fails spectacularly. The goal here is to define what objects should look like rather strictly, which will help the goal of live upgrade and reduce the occurrence of issues like this.
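
One way this could look is sketched below, assuming a fields mapping like the one in the example later on this page. TypedObject is a hypothetical name, and rejecting a wrong type outright (rather than coercing it) is a choice made for the sketch only.

```python
class TypedObject:
    """Base class that checks assignments against a declared field type."""

    # Field name -> expected Python type; subclasses override this.
    fields = {}

    def __setattr__(self, name, value):
        if name not in self.fields:
            raise AttributeError(
                '%s has no field %r' % (type(self).__name__, name))
        expected = self.fields[name]
        if not isinstance(value, expected):
            # Fail at the assignment site instead of at the database.
            raise TypeError('%s must be %s, got %s' % (
                name, expected.__name__, type(value).__name__))
        object.__setattr__(self, name, value)


class MyObject(TypedObject):
    fields = {
        'id': int,
        'foo': str,
    }
```

With this in place, assigning the string '5' to id raises TypeError where the assignment happens, long before the value reaches any database driver.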

Inbuilt Serialization

Right now, we expect jsonutils.to_primitive() to be able to serialize anything, which is nasty. Each object will have to_primitive() and from_primitive() methods, which will do context-sensitive (de-)serialization of the object in a more efficient manner. In order for this to work, the Oslo RpcProxy and RpcDispatcher classes will need to support calling these routines on parameters and result values of RPC methods to make this transparent to the services. We can, of course, avoid disruption of any of the existing behavior by just checking for isinstance(foo, ThisMagicObject).
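
A minimal sketch of per-object serialization follows. The wire keys ('object.name', 'object.version', 'object.data') and the serialize_arg hook are made up for illustration; they are not the format used by the prototype, and the real RpcProxy/RpcDispatcher integration is not shown.

```python
import json


class Primitivable:
    """Base class giving each object its own (de-)serialization."""

    fields = {}
    version = '1.0'

    def to_primitive(self):
        # Only declared fields that have been set are put on the wire.
        data = {name: getattr(self, name)
                for name in self.fields if hasattr(self, name)}
        return {
            'object.name': type(self).__name__,
            'object.version': self.version,
            'object.data': data,
        }

    @classmethod
    def from_primitive(cls, primitive):
        if primitive['object.name'] != cls.__name__:
            raise ValueError('primitive is not a %s' % cls.__name__)
        obj = cls()
        for name, value in primitive['object.data'].items():
            # Run each value through the declared field type.
            setattr(obj, name, cls.fields[name](value))
        return obj


def serialize_arg(arg):
    """Hypothetical hook an RPC layer could apply to each parameter."""
    if isinstance(arg, Primitivable):
        return arg.to_primitive()
    return arg  # anything else passes through unchanged


class MyObject(Primitivable):
    fields = {'id': int, 'foo': str}


original = MyObject()
original.id = 7
original.foo = 'bar'
wire = json.dumps(serialize_arg(original))
restored = MyObject.from_primitive(json.loads(wire))
```

Because the hook only acts on instances of the base class, existing arguments keep going through whatever serialization they use today.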

Object Implementation

Each object will implement the query and modification methods it needs assuming direct database access. This means that instead of the giant nova/db/api.py that we have now, each object will contain the necessary implementation to query the database and return the generalized version. If the project desires database driver insulation (like Nova does), there should be a path for subclassing the default implementation (i.e. SQLAlchemy) of an object and replacing it with another one.

Each object class can provide classmethod implementations of the existing/desired query and creation operations, providing a pythonic experience for such things. Changes to attributes are tracked automatically, making it easy for a "save" operation to do selected attribute updates (similar to how instance_update() is used today).
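
Change tracking could be as simple as the sketch below. what_changed() matches the name used in the example later on this page; reset_changes() and pending_updates() are hypothetical helpers added for illustration. For simplicity the sketch records every assignment to a declared field, even if the new value equals the old one.

```python
class TrackedObject:
    """Base class that remembers which declared fields were assigned."""

    fields = ()

    def __init__(self):
        # Bypass our own __setattr__ so the bookkeeping set is not tracked.
        object.__setattr__(self, '_changed', set())

    def __setattr__(self, name, value):
        if name in self.fields:
            self._changed.add(name)
        object.__setattr__(self, name, value)

    def what_changed(self):
        """Return the names of fields assigned since the last reset."""
        return set(self._changed)

    def pending_updates(self):
        """Return {field: value} for just the changed fields."""
        return {name: getattr(self, name) for name in self._changed}

    def reset_changes(self):
        """Forget recorded changes, e.g. after a successful save."""
        self._changed.clear()


class Thing(TrackedObject):
    fields = ('id', 'status')
```

A save() implementation would pass pending_updates() to the database layer and then call reset_changes(), so that a second save with no new assignments has nothing to write.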

Magic Remoting Behavior

Each method of the object (classmethod and regular methods) can be decorated with a helper that will provide automatic remote calling over RPC. This will be based on a per-service flag designating whether the service should be able to go direct to the database, or if it must go over RPC to something like conductor. In the Nova world, nova-api would be able to go direct to the database (as it is today) and thus the actual object implementation would be called. In the case of nova-compute, which is unable to talk directly to the database today, the decorator would intercept attempts to call the implementation methods and direct the call over RPC to conductor, which would call the actual object implementation.
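
The decorator behavior described above can be sketched as follows. Service.direct_db stands in for the per-service flag and FakeConductor stands in for the RPC endpoint; both are hypothetical. The "remote" call here runs in-process, whereas a real implementation would serialize the object and arguments and send them over RPC.

```python
import functools


class Service:
    """Stand-in for per-service configuration (hypothetical)."""
    direct_db = True


class FakeConductor:
    """Stand-in for the remote endpoint that may touch the database."""

    calls = []

    @classmethod
    def object_action(cls, obj, method_name, args, kwargs):
        cls.calls.append(method_name)
        # Look up the undecorated implementation and run it here.
        impl = getattr(type(obj), method_name).__wrapped__
        return impl(obj, *args, **kwargs)


def magic_method(fn):
    """Run fn directly, or forward it to the conductor, per the flag."""
    @functools.wraps(fn)  # also sets wrapper.__wrapped__ = fn
    def wrapper(self, *args, **kwargs):
        if Service.direct_db:
            return fn(self, *args, **kwargs)
        return FakeConductor.object_action(self, fn.__name__, args, kwargs)
    return wrapper


class Counter:
    def __init__(self):
        self.value = 0

    @magic_method
    def bump(self, amount):
        self.value += amount
        return self.value
```

The calling code is identical in both modes; only the flag decides whether the implementation runs locally or on the other side.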

Example

Consider a fictitious object definition and implementation:

class MyObject(base.BaseObject):
    fields = {
        'id': int,
        'foo': str,
    }
    version = '1.0'

    # NOTE(danms) magic_static_method returns a classmethod
    @base.magic_static_method
    def get_by_id(cls, context, obj_id):
        db_obj = db.fake_method_that_returns_a_thing(context, obj_id)
        myobj = cls()
        myobj.id = db_obj.id
        myobj.foo = db_obj.foo
        return myobj

    @base.magic_method
    def save(self, context):
        if 'foo' in self.what_changed():
            db.fake_method_that_updates(context, self.id, foo=self.foo)

Using this object from any service (nova-api or nova-compute) should be as simple as the following:

obj = myobject.MyObject.get_by_id(context, obj_id)
obj.foo = 'new value for foo'
obj.save(context)

Next Steps

We have the beginnings of an implementation, hacked into nova at the moment:

https://github.com/kk7ds/nova/commits/nova-obj

Things to look at include:

nova/object/base.py -- The base object implementation, registry, and a hacky RPC helper for the purposes of testing
nova/object/migration.py -- A simple object implementation
nova/tests/test_object.py -- Look at the two Migration tests, which demonstrate the behavior of the object in direct-to-db and remote-from-db scenarios.

It needs a lot of TLC in the area of integration of the RPC-related bits, and of course, some actual integration into the nova code itself. Right now, the wire format uses "nova" in its identifiers.

Several of us agree that this is something that can and should go into Oslo, for use by other projects if and as desired. Thus, the following steps need to be completed:

Agreement that this approach is generally useful
Agreement that it should (or should not) go into Oslo
Clean integration of the serialization hooks in the RPC classes
Integration of the base object class and utilities into Oslo
Synchronization to Nova
Iterative conversion of Nova internals to using these objects