= Enhancement of LVM driver to support a volume-group on shared storage volume =
  
 
== Related blueprints ==
 
* https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage
* https://blueprints.launchpad.net/nova/+spec/lvm-driver-for-shared-storage

== Documents ==
* Cinder-Support_LVM_on_a_sharedLU: [[File:Cinder-Support_LVM_on_a_sharedLU.pdf]]
  
 
== Goals ==
 
* The goal of this blueprint is to support a volume-group on a shared storage volume which is visible from multiple nodes.
 
  
 
== Overview ==
 
Currently, many vendor storages have vendor-specific Cinder drivers, and these drivers support rich features to create, delete, and snapshot volumes. On the other hand, there are also many storages which do not have a specific Cinder driver. LVM is one approach to provide lightweight features such as volume creation, deletion, extension, and snapshots. By using the Cinder LVM driver with a volume-group on a shared storage volume, Cinder will be able to handle these storages.

The purpose of this blueprint is to support storages which do not have a specific Cinder driver, by using LVM with a volume-group (VG) on a shared storage volume.
  
== Approach ==
The current LVM driver uses a software iSCSI target (tgtd or LIO) on the Cinder node to make logical volumes (LVs) accessible from multiple nodes.

When a volume-group on a shared storage volume (over Fibre Channel, iSCSI, or any other transport) is visible from all nodes, i.e. the Cinder node and the compute nodes, these nodes can access the volume-group simultaneously(*1) without a software iSCSI target.

If the LVM driver supports this type of volume-group, qemu-kvm on each compute node can attach/detach a created LV to an instance via the device path "/dev/<VG name>/<LV name>" on each node, without any software iSCSI operations.(*2)

So the enhancement will support "a volume-group visible from multiple nodes" instead of a software iSCSI target.


(*1) Some operations require exclusive access control. See the section "Exclusive access control for metadata".

(*2) After running the "lvchange -ay /dev/<VG name>/<LV name>" command on a compute node, the device file to access the LV is created.
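
To make (*1) and (*2) concrete, here is a minimal sketch of the compute-node side of the flow, assuming a small Python helper that shells out to the LVM tools. The helper names are illustrative assumptions and not part of the blueprint; the blueprint itself only requires that "lvchange -ay"/"lvchange -an" be run on the compute node.

<pre>
# Illustrative only: a minimal sketch of the compute-node side of (*1)/(*2).
import subprocess


def activate_shared_lv(vg_name, lv_name):
    """Make an LV on the shared volume-group usable on this compute node."""
    # Reload LVM metadata so this node sees LVs created on the Cinder node.
    subprocess.check_call(['lvscan'])
    # Create the device file /dev/<VG name>/<LV name> on this node.
    subprocess.check_call(['lvchange', '-ay', '%s/%s' % (vg_name, lv_name)])
    # qemu-kvm can now attach this path to the instance.
    return '/dev/%s/%s' % (vg_name, lv_name)


def deactivate_shared_lv(vg_name, lv_name):
    """Remove the device file after the volume is detached from the guest."""
    subprocess.check_call(['lvchange', '-an', '%s/%s' % (vg_name, lv_name)])
</pre>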
  
 
== Benefits of this driver ==
 
* General benefits
# For a given storage, reduce the workload on the hardware storage controller by offloading it to software-based volume operations.
# Provide quicker volume creation and snapshot creation without generating workload on the storage array.
# Enable Cinder to use any kind of shared storage volume without a vendor-specific Cinder storage driver.
# Better I/O performance through direct volume access via Fibre Channel.

=== Support target environment ===

A volume-group visible from multiple nodes

[[File:SharedLVMsupport.png|border|600px|A volume-group on shared storage volume]]

== Detail of Design ==

* The enhancement introduces a feature that supports a volume-group(*3) visible from multiple nodes into the LVM driver.
* Most parts of the volume handling features can be inherited from the current LVM driver.
* Software iSCSI operations are bypassed.
* Additional work (a rough sketch follows this list):
** [Cinder]
**# Add a new driver class as a subclass of LVMVolumeDriver and driver.VolumeDriver to store the path of the device file "/dev/<VG name>/<LV name>".
**# Add a new connector as a subclass of InitiatorConnector for volume migration.
** [Nova]
**# Add a new connector as a subclass of LibvirtBaseVolumeDriver to run "lvs" and "lvchange" in order to create/delete the device file of an LV.
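
The following is a rough skeleton of what the Cinder-side driver class could look like; it is not the blueprint's actual code. The method names follow the standard Cinder driver interface, but the connection type string 'lvm_shared' and the fields in the 'data' dictionary are assumptions made for illustration only.

<pre>
# Illustrative skeleton only; not the blueprint's implementation.
from cinder.volume.drivers import lvm


class LVMSharedDriver(lvm.LVMVolumeDriver):
    """LVM driver for a volume-group that is visible from multiple nodes."""

    def initialize_connection(self, volume, connector):
        # No software iSCSI target is set up; the compute node only needs the
        # VG/LV names so it can activate the LV and open /dev/<VG>/<LV> locally.
        return {
            'driver_volume_type': 'lvm_shared',   # assumed identifier
            'data': {
                'volume_group': self.configuration.volume_group,
                'name': volume['name'],
            },
        }

    def create_export(self, context, volume):
        # Nothing to export: the backing LU is already attached to every node.
        pass

    def ensure_export(self, context, volume):
        pass

    def remove_export(self, context, volume):
        pass
</pre>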

(*3) The volume_group must be prepared in advance by the storage administrator.

Example: in the case of FC storage (a sketch of steps (c) and (d) follows these steps).

(a) Create LU1 using the storage management tool.

(b) Register the WWNs of the control node and the compute nodes into a host group to permit access to LU1. LU1 is recognized on each node as a SCSI disk (sdX) after a SCSI scan or a reboot.

(c) Create VG1 on LU1. VG1 is also recognized on each node after executing the "vgscan" command.

(d) Configure VG1 as the cinder "volume_group" parameter.
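
A minimal sketch of steps (c) and (d), assuming the shared LU is already visible on the control node as /dev/sdx (steps (a) and (b) are vendor-specific and done with the storage management tool). The device path, VG name, and helper names are assumptions for illustration; the commands must be run as root.

<pre>
# Illustrative only: prepare the shared volume-group (steps (c)/(d) above).
import subprocess

SHARED_LU = '/dev/sdx'                 # assumed device node of LU1 on this node
VG_NAME = 'cinder-volumes-shared'      # must match volume_group in cinder.conf


def prepare_shared_vg():
    """Create the shared volume-group; run once on the Cinder/control node."""
    subprocess.check_call(['pvcreate', SHARED_LU])
    subprocess.check_call(['vgcreate', VG_NAME, SHARED_LU])


def refresh_vg_on_compute_node():
    """Run on each compute node so the new VG is recognized (step (c))."""
    subprocess.check_call(['vgscan'])
</pre>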

== Prerequisites ==

* Use QEMU/KVM as the hypervisor (via the libvirt compute driver)
* A volume-group on a storage volume attached to multiple nodes
* Exclude the volume-group from the scope of lvmetad on compute nodes
** When a compute node attaches a created volume to a virtual machine, the latest LVM metadata is necessary. However, lvmetad caches LVM metadata, and this prevents the node from obtaining the latest metadata (see the sketch after this list).
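
As one possible illustration of this prerequisite (an assumption, not part of the blueprint), the LVM tools accept an inline --config override, so a compute node can force an on-disk metadata scan that bypasses the lvmetad cache. The helper name is hypothetical.

<pre>
# Illustrative only: read fresh LVM metadata while bypassing the lvmetad cache.
import subprocess


def list_shared_lvs(vg_name):
    """Return the LV names in the shared VG, forcing an on-disk metadata scan."""
    out = subprocess.check_output([
        'lvs', '--noheadings', '-o', 'lv_name',
        '--config', 'global {use_lvmetad=0}',
        vg_name,
    ])
    return [line.strip() for line in out.decode().splitlines() if line.strip()]
</pre>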

== Exclusive access control for metadata ==

* LVM holds a management region including the metadata of the volume-group configuration. If multiple nodes update the metadata simultaneously, the metadata will be corrupted. Therefore, exclusive access control is necessary.
* Specifically, operations that update the metadata are permitted only on the Cinder node.
  
* The operations that update the metadata are the following. These operations are permitted only on the Cinder node.
** Volume creation
*** When the Cinder node creates a new LV on the volume-group, the LVM metadata is renewed, but the update is not notified to the compute nodes. At this point only the Cinder node knows about the update.
** Volume deletion
*** Delete an LV on the volume-group from the Cinder node.
** Volume extension
*** Extend an LV on the volume-group from the Cinder node.
** Snapshot creation
*** Create a snapshot of an LV on the volume-group from the Cinder node.
** Snapshot deletion
*** Delete a snapshot of an LV on the volume-group from the Cinder node.

* The operations that do not update the metadata are the following. These operations are permitted on every compute node.
** Volume attachment
*** When attaching an LV to a guest instance on a compute node, the compute node has to reload the LVM metadata using "lvscan" or "lvs", because it does not know the latest LVM metadata.
*** After reloading the metadata, the compute node recognizes the latest status of the volume-group and its LVs.
*** Then, in order to attach the new LV, the compute node needs to create a device file such as /dev/<volume-group name>/<LV name> using the "lvchange -ay" command.
*** After activation of the LV, nova-compute can attach the LV to the guest VM.
** Volume detachment
*** After detaching a volume from the guest VM, the compute node deactivates the LV using "lvchange -an". As a result, the unnecessary device file is removed from the compute node.
 
  
 
{| class="wikitable"
|+ Permitted operation matrix
! Operations !! Volume create !! Volume delete !! Volume extend !! Snapshot create !! Snapshot delete !! Volume attach !! Volume detach
|-
| Cinder node || x || x || x || x || x || - || -
|-
| Compute node || - || - || - || - || - || x || x
|-
| Cinder node with compute || x || x || x || x || x || x || x
|}
  
 
== Configuration ==
 
In order to enable the shared LVM driver, the following values need to be defined in /etc/cinder/cinder.conf.

Example:

<pre>
[LVM_shared]
volume_group=cinder-volumes-shared
volume_driver=cinder.volume.drivers.lvm.LVMSharedDriver
volume_backend_name=LVM_shared
</pre>