Introduce a new, optional Nova service that caches high-priority images on compute hosts before those images are requested through the usual instance creation process.
Current Caching Strategy
In the current Icehouse release of Nova, images are cached on demand. As instances are created, compute hosts need to retrieve the requested image in order to build each instance. The host may then cache that image locally for some period of time. This speeds up subsequent requests to build an instance from the same image on the same host.
However, there are some limitations with the current approach:
- Inconsistent build times - the first build on a host for each image is slower than subsequent ones
- Network throttling - if a large number of instances are requested at once, many simultaneous image transfers to hosts are initiated, which can severely degrade network performance.
Proposed Caching Service
To address these limitations, we can add a simple Nova service that distributes high-priority images to compute hosts efficiently, outside the control path of the instance creation process. The service would consist of a periodic task performing the following steps:
- List - Obtain a list of images to pre-cache.
- Fetch - Obtain a copy of each image's data.
- Serve - Set up each image's data for distribution to hosts.
- Distribute - Trigger downloads of each image by the hosts.
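The four steps above can be sketched as a single pass of the periodic task. This is only an illustration under stated assumptions: `StaticLister`, `FakeFetcher`, `FakeServer` and `run_precache_cycle` are hypothetical names, and the `distribute` callable stands in for whatever mechanism notifies the compute hosts.

```python
# Hypothetical sketch: these names are illustrative stand-ins for the
# pluggable strategies described in this proposal, not existing Nova code.


class StaticLister(object):
    """List step: returns a fixed set of image IDs to pre-cache."""

    def __init__(self, image_ids):
        self.image_ids = list(image_ids)

    def list(self):
        return list(self.image_ids)


class FakeFetcher(object):
    """Fetch step: pretends to download and returns a local path."""

    def fetch(self, image_id):
        return "/tmp/precache/%s" % image_id


class FakeServer(object):
    """Serve step: returns the metadata a host would need to download."""

    def serve(self, image_id, image_filename):
        return {"source": image_filename}


def run_precache_cycle(lister, fetcher, server, distribute):
    """Run one List -> Fetch -> Serve -> Distribute pass.

    `distribute` is any callable taking (image_id, meta).
    """
    processed = []
    for image_id in lister.list():
        image_filename = fetcher.fetch(image_id)
        meta = server.serve(image_id, image_filename)
        distribute(image_id, meta)
        processed.append(image_id)
    return processed


sent = []
done = run_precache_cycle(
    StaticLister(["img-1", "img-2"]),
    FakeFetcher(),
    FakeServer(),
    lambda image_id, meta: sent.append((image_id, meta)),
)
```

Keeping each step behind its own small interface means a deployment could swap one strategy (for example, how images are served) without touching the others.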
Use a pluggable strategy for obtaining a list of images to pre-cache. A default implementation might query Glance to obtain a list of images for certain tenants.
    class ImageLister(object):
        def list(self):
            ...
            return image_list  # return a list of images to pre-cache
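As one possible shape for a tenant-based lister, the sketch below filters an image catalog by owner. The class name `TenantImageLister`, the injected `list_images` callable and the `"id"`/`"owner"` dictionary keys are assumptions made for illustration; a real plugin would obtain the catalog from Glance rather than from the stand-in data used here.

```python
class TenantImageLister(object):
    """Hypothetical lister: selects images owned by configured tenants.

    `list_images` is an injected callable returning dicts with "id" and
    "owner" keys; a real plugin would wrap an image service query here.
    """

    def __init__(self, list_images, tenant_ids):
        self._list_images = list_images
        self._tenant_ids = set(tenant_ids)

    def list(self):
        return [image["id"] for image in self._list_images()
                if image["owner"] in self._tenant_ids]


def fake_catalog():
    # Stand-in data for illustration only.
    return [
        {"id": "img-a", "owner": "tenant-1"},
        {"id": "img-b", "owner": "tenant-2"},
        {"id": "img-c", "owner": "tenant-1"},
    ]


lister = TenantImageLister(fake_catalog, ["tenant-1"])
image_list = lister.list()
```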
Use a pluggable strategy for fetching the images from their backing storage. One option might be a class capable of fetching image data from swift.
    class ImageFetcher(object):
        def fetch(self, image_id):
            ...
            return image_filename  # return local filesystem path to image
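A fetcher will probably want to avoid downloading an image it already holds. The following is a minimal sketch of that idea, assuming a hypothetical `CachingImageFetcher` whose `download` callable stands in for the backend-specific transfer (for example, from Swift); neither name comes from the proposal.

```python
import os
import tempfile


class CachingImageFetcher(object):
    """Hypothetical fetcher: downloads once, then reuses the local copy.

    `download` is an injected callable (image_id, dest_path) standing in
    for a backend-specific transfer.
    """

    def __init__(self, download, cache_dir):
        self._download = download
        self._cache_dir = cache_dir

    def fetch(self, image_id):
        image_filename = os.path.join(self._cache_dir, image_id)
        if not os.path.exists(image_filename):
            # Write to a temporary name first so a partial download is
            # never mistaken for a complete image.
            partial = image_filename + ".part"
            self._download(image_id, partial)
            os.rename(partial, image_filename)
        return image_filename


calls = []


def fake_download(image_id, dest_path):
    # Stand-in transfer: records the call and writes placeholder bytes.
    calls.append(image_id)
    with open(dest_path, "wb") as f:
        f.write(b"image-bytes")


cache_dir = tempfile.mkdtemp()
fetcher = CachingImageFetcher(fake_download, cache_dir)
first = fetcher.fetch("img-a")
second = fetcher.fetch("img-a")  # served from the local copy
```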
Perform the work necessary to make the image available to compute hosts. In the BitTorrent case, this would mean providing an initial seed of the image.
    class ImageServer(object):
        def serve(self, image_id, image_filename):
            ...
            return meta  # return metadata necessary for a compute host to download the image
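To make the shape of `meta` more concrete, here is a sketch of a simpler alternative to the BitTorrent case: a server that only describes where a host can download the file over HTTP and how to verify it. `HttpImageServer`, the `url`/`sha256`/`size` keys and the example base URL are all assumptions, and the sketch presumes some web server already exposes the image directory.

```python
import hashlib
import os
import tempfile


class HttpImageServer(object):
    """Hypothetical server: describes images exposed under a base URL.

    Assumes a web server already serves the directory containing the
    fetched images at `base_url`; nothing is started here.
    """

    def __init__(self, base_url):
        self._base_url = base_url.rstrip("/")

    def serve(self, image_id, image_filename):
        # Hash in chunks so large images are not read into memory at once.
        sha256 = hashlib.sha256()
        with open(image_filename, "rb") as f:
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                sha256.update(chunk)
        name = os.path.basename(image_filename)
        return {
            "url": "%s/%s" % (self._base_url, name),
            "sha256": sha256.hexdigest(),
            "size": os.path.getsize(image_filename),
        }


path = os.path.join(tempfile.mkdtemp(), "img-a")
with open(path, "wb") as f:
    f.write(b"abc")
meta = HttpImageServer("http://precache.example/images/").serve("img-a", path)
```

Including a checksum in `meta` lets each host verify its copy independently of the transport used.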
The pre-cache service would send an RPC fanout message on the "compute" topic for each image. This would cause each host to fetch the image and pre-cache it using virt-driver-specific mechanisms.
RPC API method:
    def cache(image_id, meta):
        ...  # meta contains all details necessary to fetch image from pre-cache service
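The fanout behaviour can be simulated in-process to show what each host is expected to do with the message. This is a sketch only: `FakeComputeHost` and `fanout_cache` are hypothetical, no real RPC layer is involved, and the boolean return values exist purely for illustration (a fanout message would not normally carry a reply).

```python
class FakeComputeHost(object):
    """Hypothetical stand-in for a compute host's RPC endpoint."""

    def __init__(self, name):
        self.name = name
        self.cached = {}

    def cache(self, image_id, meta):
        # Idempotent: a repeated message for the same image is a no-op.
        if image_id in self.cached:
            return False
        # A real host would hand off to its virt driver here.
        self.cached[image_id] = meta
        return True


def fanout_cache(hosts, image_id, meta):
    """Simulate a fanout: every host receives the same message."""
    return [host.cache(image_id, meta) for host in hosts]


hosts = [FakeComputeHost("compute-1"), FakeComputeHost("compute-2")]
meta = {"url": "http://precache.example/images/img-a"}
first = fanout_cache(hosts, "img-a", meta)
second = fanout_cache(hosts, "img-a", meta)  # repeated cycle, nothing to do
```

Making the host-side handler idempotent matters because the periodic task would resend the same image on every cycle.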