Difference between revisions of "Nova/Object Cache"

(Created page with "'''Nova/Object Cache''' One of the reasons why Objects were introduced was to reduce the load on the database. But a further optimization is possible and especially useful in...")
 
'''Nova/Object Cache'''

One of the reasons why Objects were introduced was to reduce the load on the database. But a further optimization is possible, and it is especially useful in a cloud of many hosts and consequently many more objects representing VMs, networks, security groups and much more.

This is particularly useful for management dashboards such as Horizon for very large clouds, for instance when you want to display all VMs in the system (active, sleeping ..), or all VMs belonging to a tenant.

The database field "updated_at", in conjunction with the object-id, can be used to determine whether the copy of an object one has on hand is the latest. If it is, one may use or display it without further refresh.

To support caching,
# the base object class should carry the field "updated_at".
# an API is needed that retrieves only headers: get_all_instances( .., headers_only=true)

  this would return {(VM1, t1), (VM2, t2), (VM3, t3) ...(VMz, tz)}

# refresh_cache(my_cache, object_headers)

This method establishes whether the object at hand was updated since it was last retrieved. For instance, assume the cache contains only {(VM1, t0), (VM2, t2)} and that t0 < t1. The refresh operation then retrieves a fresh copy of VM1 and copies of VM3 through VMz, bringing the cache in line with the returned headers.

While this solution requires two calls, a comparison, and then retrieval of full objects, in cases where not all objects change, or do not change too frequently, the caching improves performance.

Yet another performance speed-up is to request an object with only limited fields. This is typically useful in a display hierarchy where the more detailed object is only required should the user select the item.

Revision as of 09:08, 10 January 2014

With the move to objects, there is reduced pressure on the database. Additionally, we can reduce network pressure by caching objects. This is especially useful for large clouds composed of many nodes. The scheme leverages the updated_at field in the database, which tracks when an object last changed. If a local copy of an object carries an earlier updated_at than the current one, it must be refreshed; otherwise it can be used with full confidence. Management interfaces such as Horizon retrieve large sets of objects and, when they are not event based, refresh them frequently.

An object header is a tuple of object-id and updated_at; an object header list is a list of such tuples.

Consider, for instance, the Horizon web server caching a list of virtual machine instance objects that it displays. Assume the cached headers are {(VM1, t1), (VM2, t2), (VM3, t3)}. Assume also that only VM2 was updated since then and that a new instance VMk was created. It is sufficient to retrieve only these two objects and use them in conjunction with the existing local copies of VM1 and VM3.
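The comparison in this example can be sketched in a few lines of Python. This is only an illustration: the function name ids_to_retrieve is hypothetical, and timestamps are plain integers standing in for updated_at values.

```python
def ids_to_retrieve(cached_headers, current_headers):
    """Return ids whose cached copy is missing or older than the current one."""
    cached = dict(cached_headers)
    return [
        obj_id
        for obj_id, updated_at in current_headers
        if obj_id not in cached or cached[obj_id] < updated_at
    ]


# Integer timestamps are used purely for illustration.
cached_headers = [("VM1", 1), ("VM2", 2), ("VM3", 3)]
current_headers = [("VM1", 1), ("VM2", 5), ("VM3", 3), ("VMk", 6)]

stale = ids_to_retrieve(cached_headers, current_headers)
print(stale)  # ['VM2', 'VMk']
```

Only VM2 and VMk need to be fetched; the local copies of VM1 and VM3 remain valid.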

Some possible API changes:

  1. get_instances(cached_headers, ... <currently existing args such as tenant-id>): in this scheme the object server determines what to send, and a final merge must happen at the client.
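A rough sketch of this server-side variant follows, assuming an in-memory dictionary in place of the real object server. SERVER_INSTANCES, the merge helper and the dictionary layout of an instance are all hypothetical, and the existing arguments such as tenant-id are left out.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for the object server's store.
SERVER_INSTANCES = {
    "VM1": {"id": "VM1", "updated_at": datetime(2014, 1, 1), "state": "active"},
    "VM2": {"id": "VM2", "updated_at": datetime(2014, 1, 5), "state": "sleeping"},
    "VMk": {"id": "VMk", "updated_at": datetime(2014, 1, 6), "state": "active"},
}


def get_instances(cached_headers):
    """Server side: return only instances the caller lacks or holds stale."""
    known = dict(cached_headers)
    return [
        inst
        for inst in SERVER_INSTANCES.values()
        if inst["id"] not in known or known[inst["id"]] < inst["updated_at"]
    ]


def merge(cache, changed):
    """Client side: overlay the returned instances on the local cache."""
    merged = dict(cache)
    merged.update({inst["id"]: inst for inst in changed})
    return merged


# The client holds a current VM1 and an outdated VM2; it has never seen VMk.
cache = {
    "VM1": {"id": "VM1", "updated_at": datetime(2014, 1, 1), "state": "active"},
    "VM2": {"id": "VM2", "updated_at": datetime(2014, 1, 2), "state": "active"},
}
headers = [(obj_id, inst["updated_at"]) for obj_id, inst in cache.items()]
changed = get_instances(headers)
cache = merge(cache, changed)
```

The client sends only headers and receives only VM2 and VMk in full. Instances deleted on the server are not handled in this sketch.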

Alternatively, the client can request only the headers from the object server, determine which complete objects to retrieve, and do a final merge on receipt: get_all_instances( .., headers_only=true)

The refresh method compares the cached copy of each object, located by its object-id, against the latest copy in the repository using their updated_at values.
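A minimal client-side sketch of such a refresh method is given below. The fetch parameter (a callable that returns full objects for a list of ids) and the dictionary representation of objects are assumptions made for illustration, and integer timestamps stand in for updated_at.

```python
def refresh_cache(my_cache, object_headers, fetch):
    """Update my_cache in place and return the ids that were re-fetched.

    my_cache maps object-id to an object dict carrying an "updated_at" key.
    fetch is a caller-supplied callable (hypothetical) that takes a list of
    ids and returns the corresponding full objects.
    """
    stale = [
        obj_id
        for obj_id, updated_at in object_headers
        if obj_id not in my_cache or my_cache[obj_id]["updated_at"] < updated_at
    ]
    if stale:
        for obj in fetch(stale):
            my_cache[obj["id"]] = obj
    return stale


# Stand-in repository; integer timestamps are for illustration only.
repository = {
    "VM1": {"id": "VM1", "updated_at": 1},
    "VM2": {"id": "VM2", "updated_at": 2},
    "VM3": {"id": "VM3", "updated_at": 3},
}
my_cache = {
    "VM1": {"id": "VM1", "updated_at": 0},  # older than the repository copy
    "VM2": {"id": "VM2", "updated_at": 2},  # still current
}
object_headers = [(o["id"], o["updated_at"]) for o in repository.values()]

fetched = refresh_cache(
    my_cache, object_headers, lambda ids: [repository[i] for i in ids]
)
print(fetched)  # ['VM1', 'VM3']
```

The current copy of VM2 is never transferred again; only the stale VM1 and the previously unseen VM3 are fetched.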

This caching is useful when there are many objects, possibly with aggregates of subjects, and only a few of them change.

Displays sometimes offer an abstract table view and a more detailed view. It may be useful to add an API that requests only some fields of an object to support the abstract view: get_all_instances(field1, field2 ..)
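One way such a field-limited call might behave is sketched below. The INSTANCES list and the field names are made up for illustration, and the signature is only one possible reading of get_all_instances(field1, field2 ..).

```python
# Hypothetical instance records; field names are illustrative only.
INSTANCES = [
    {"id": "VM1", "state": "active", "tenant": "t-a", "flavor": "m1.small"},
    {"id": "VM2", "state": "sleeping", "tenant": "t-b", "flavor": "m1.large"},
]


def get_all_instances(*fields):
    """Return every instance, restricted to the requested fields if any."""
    if not fields:
        return [dict(inst) for inst in INSTANCES]
    return [{name: inst[name] for name in fields} for inst in INSTANCES]


# The table view needs only the id and state of each instance.
summary = get_all_instances("id", "state")
print(summary)
```

The detailed view would then request the full object only for the row the user selects.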