[ DRAFT ]
- Created: 25th November 2013
- Author: Stephen Gordon
Currently the Block Storage (Cinder) service only allows a volume to be attached to one instance at a time. The Compute Service (Nova) also makes assumptions to this effect in a number of places, as do the APIs, CLIs, and UIs exposed to users. This specification aims to outline the changes required to allow users to share volumes between multiple guests using either read-write or read-only attachments.
There have been several discussions about adding this type of functionality over the Grizzly and Havana cycles. This page is intended to link together those discussions and provide a place for recording future consensus on any and all outstanding issues with the design backing these blueprint(s).
In addition to the noted blueprints these resources were consulted in framing this page:
- summit-havana-cinder-multi-attach-and-ro-volumes minutes
- 2013-09-25-16.13 Cinder Meeting
- 2013-04-03-16.00 Cinder Meeting
- openstack-dev: About Read-Only volume support
Traditional cluster solutions rely on the use of clustered filesystems and quorum disks, writeable by one or more systems and often read by a larger number of systems, to maintain high availability. Users would like to be able to run such clustered applications on their OpenStack clouds. This requires the ability to have a volume attached to multiple compute instances, with some instances having read-only access and some having read-write access.
- Administrators and users are ultimately responsible for ensuring data integrity is maintained once a shared volume is attached to multiple instances in read-write mode. However, such attachments must only occur as the result of an explicit request.
- Horizon support is not crucial to "Phase I" implementation of this feature, but must be considered and properly tracked as a potential future addition.
- Read-only volume support in Cinder and Nova (read-only-volumes).
- Users must be able to explicitly define a volume as "shareable" at creation time.
- Users must be able to attach a "shareable" volume to multiple compute instances, specifying a separate mode (read-write or read-only) for each attachment. That is, some attachments to a volume may be read-only, while other attachments to the same volume may be read-write.
- While Cinder will track the mode of each attachment, restriction of write access must be handled by the hypervisor drivers in Nova.
- Normal reservations should be required (and enforced) for volumes that are not marked as shareable.
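The attachment rules in the list above can be sketched as follows. This is a minimal illustration rather than Cinder code: the `Volume` class, its `shareable` flag, the `attach` and `detach` methods, and `AttachError` are all hypothetical names chosen for this example. The sketch only tracks the mode of each attachment. It does not model the reservation system, and actual enforcement of read-only access would remain with the hypervisor drivers.

```python
from dataclasses import dataclass, field

VALID_MODES = ("rw", "ro")


class AttachError(Exception):
    """Hypothetical error raised when an attachment request is refused."""


@dataclass
class Volume:
    """Hypothetical in-memory stand-in for a volume record."""
    volume_id: str
    shareable: bool = False  # volumes are non-shareable unless marked at creation
    attachments: dict = field(default_factory=dict)  # instance_uuid -> mode

    def attach(self, instance_uuid: str, mode: str) -> None:
        # The mode has no default: every attachment must state it explicitly.
        if mode not in VALID_MODES:
            raise AttachError(f"unknown mode: {mode!r}")
        if instance_uuid in self.attachments:
            raise AttachError("volume is already attached to this instance")
        # Non-shareable volumes keep the existing behaviour: one attachment only.
        if self.attachments and not self.shareable:
            raise AttachError("volume is not shareable")
        self.attachments[instance_uuid] = mode

    def detach(self, instance_uuid: str) -> None:
        self.attachments.pop(instance_uuid, None)
```

A shareable volume accepts `attach("instance-a", "rw")` followed by `attach("instance-b", "ro")`, while the second call on a non-shareable volume raises `AttachError`.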
The initial patchset submitted by Charlie Zhou likely needs further discussion and iteration to ensure it meets the requirements outlined above. In particular, additional changes would be required to support explicitly marking volumes as shareable: the current patch assumes all volumes are shareable and, as a result, effectively removes the reservation system previously introduced to correct 1096983.
- New volume_attachment table:
    Column('id', String(length=36), primary_key=True, nullable=False),
    Column('volume_id', String(length=36), ForeignKey('volumes.id'), nullable=False),
    Column('instance_uuid', String(length=36)),
    Column('attached_host', String(length=255)),
    Column('mountpoint', String(length=255)),
    Column('attach_time', DateTime),
    Column('detach_time', DateTime),
    Column('attach_status', String(length=255)),
    Column('attach_mode', String(length=255)),
    Column('created_at', DateTime),
    Column('updated_at', DateTime),
    Column('deleted_at', DateTime),
    Column('deleted', Boolean)
- Currently 'attached_mode' is saved in the volume's admin_metadata table (introduced by the read-only attach change). Under multi-attach, the attach mode belongs to an individual attachment rather than to the volume, so it will move to the volume_attachment table as the attach_mode column.
- Columns 'volume_id', 'instance_uuid' and 'attached_host' will together form a unique constraint, implemented as a composite index on the volume_attachment table.
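The effect of the composite unique constraint can be illustrated with a small, self-contained sketch. It uses Python's standard `sqlite3` module and plain SQL rather than the SQLAlchemy definitions above, includes only a subset of the proposed columns, and the `add_attachment` helper is a hypothetical name introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE volume_attachment (
        id            VARCHAR(36) PRIMARY KEY NOT NULL,
        volume_id     VARCHAR(36) NOT NULL,
        instance_uuid VARCHAR(36),
        attached_host VARCHAR(255),
        attach_mode   VARCHAR(255),
        UNIQUE (volume_id, instance_uuid, attached_host)
    )
    """
)


def add_attachment(conn, att_id, volume_id, instance_uuid, host, mode):
    """Insert one attachment row; return False if the database rejects it
    with an integrity error (for example, a duplicate attachment)."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO volume_attachment VALUES (?, ?, ?, ?, ?)",
                (att_id, volume_id, instance_uuid, host, mode),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this table, the same volume can be attached to two different instances, but a second row for the same volume, instance, and host is rejected. One point that may be worth considering in the design: SQLite treats NULL values as distinct within a UNIQUE constraint, so rows in which `instance_uuid` or `attached_host` is NULL are not deduplicated by it, and both columns are nullable in the proposed schema.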
- New exceptions:
- TBD based on the Cinder implementation. Read-only volume support in the Libvirt/KVM driver has already merged.
- Adding mode argument to OpenStack API volumes extension (volume-attach, v2, v3) and novaclient:
- Volumes screen:
- Reflect multiple attachments in the Attached To column/field.
- Reflect the mode of each attachment (ro or rw).
- Reflect whether a volume is "shareable" or not.
- Volume Detail screen:
- As per requirements for Volumes screen.
- Create Volume dialog:
- Allow the marking of the volume as "shareable".
- Edit Attachments dialog:
- Allow the addition of further attachments to shareable volumes that have already been attached to an instance.
- Allow setting of the attachment mode.
All existing volumes must automatically be marked, or assumed to be marked, as non-shareable. User impact is therefore expected to be minimal except for users explicitly using this new feature.
This need not be added or completed until the specification is nearing beta.
These issues or questions are outstanding and without resolution will block implementation of this proposal:
- A determination needs to be made on how to resolve the conflict between the overall volume status and the status of individual attachments ('attach_status').
- Current volume status set:
- Attachment: attaching, in-use, detaching
- Basic: creating, available, deleting, deleted
- Misc: uploading, extending, awaiting-transfer
- Error: error, error_deleting, error_attaching, error_detaching, error_extending, error_restoring
- Current volume attachment status set:
- attached, detached
- Proposal (from 2013-09-25):
- Volume status = attached when one or more attachments exist.
- Volume status = detached when no attachments exist.
- Volume status = attaching on first attach.
- Volume status = detaching on last detach.
- Proposal (from zhiyan and thingee):
- Adding attaching and detaching to volume attachment status set.
- The priority of volume status determination is in-use, attaching, detaching.
- Volume status = in-use if any of the attachments are in attached status, even if one of the attachments is in an attaching or detaching status.
- Volume status = attaching if none of the attachments are in an attached status, but one of them is in an attaching status.
- Volume status = detaching if none of the attachments are in an attaching status, but one of them is in a detaching status.
- What about the determination for error_attaching and error_detaching?
- If multi-attach is determined to be extension functionality, how should it be implemented as an extension of the core attachment functionality?
- In the discussion on the shared-volume blueprint itself it was suggested that volumes should have to be explicitly marked as shareable to allow multi-attachment, in addition to later discussion about failing the attach if no mode is specified. Is there consensus that a "shareable" marker is required? Currently this proposal assumes the answer is yes.
- Are there additional issues to watch out for when snapshotting a shared volume?
- This won't work for QCOW2 disks on Libvirt/KVM, only RAW. Do other hypervisors have any similar restrictions on sharing volumes in read-write mode?
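The status-priority proposal from zhiyan and thingee listed above can be sketched as a small function. This is an illustration under stated assumptions, not Cinder code: `derive_volume_status` is a hypothetical name, and returning "available" when no attachment is attached, attaching, or detaching is an assumption, since the proposal does not state that case. The error statuses are left out because their handling is still an open question.

```python
# Attachment statuses in priority order, as given in the proposal:
# in-use (any attachment attached), then attaching, then detaching.
PRIORITY = ("attached", "attaching", "detaching")

# Volume status reported for the highest-priority attachment status found.
VOLUME_STATUS = {
    "attached": "in-use",
    "attaching": "attaching",
    "detaching": "detaching",
}


def derive_volume_status(attachment_statuses):
    """Derive the overall volume status from per-attachment statuses."""
    present = set(attachment_statuses)
    for attachment_status in PRIORITY:
        if attachment_status in present:
            return VOLUME_STATUS[attachment_status]
    # Assumption: a volume with no active attachments is "available".
    return "available"
```

For example, a volume with one attached and one detaching attachment is reported as in-use, while a volume whose only active attachment is detaching is reported as detaching.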