= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wednesdays at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meeting in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know whose topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting, and arrive at the meeting promptly if you have placed items on the agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''Aug 20th, 2014 16:00 UTC'''<br />
<br />
* FPF (Feature Proposal Freeze) tomorrow, Aug 21st (jgriffith)<br />
* Time to start thinking about the Summit and how to be more effective with our time there (jgriffith)<br />
* The idea of a maintenance-ish release (jgriffith)<br />
* Volume replication (ronenkat)<br />
** The patch - https://review.openstack.org/#/c/113054/11<br />
** Depends on https://review.openstack.org/#/c/115078/ to fix pylint error<br />
** Reference replication implementation for IBM Storwize as an example - https://review.openstack.org/#/c/112224/4/<br />
* MAINTAINERS file (DuncanT)<br />
** https://etherpad.openstack.org/p/cinder-driver-maintainers<br />
* Simple CI<br />
** https://github.com/Funcan/kiss-ci (Code will be up any moment now)<br />
<br />
== Previous meetings ==<br />
'''Aug 13th, 2014 16:00 UTC'''<br />
NO MEETING TODAY, MID CYCLE MEETUP<br />
<br />
'''Aug 6th, 2014 16:00 UTC'''<br />
* Cinder mid cycle meetup next week August 12-14 (scottda)<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014 <br />
** HP site should be set. Ping scottda with any issues/problems/concerns<br />
** Virtual meetup will need to be taken care of<br />
* Volume replication (ronenkat)<br />
** Alternative approach based on jgriffith's driver-based replication: https://etherpad.openstack.org/p/juno-cinder-volume-replication-apparochs<br />
<br />
'''July 30th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31st; let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/105923/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
** 17:00 UTC - Volume replication driver owner overview and Q & A<br />
** Call-in information: passcode 6406941, call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
* NFS secure option -- default to 666 vs 660 vs force admin choice (bswartz)<br />
* It is code cleanup tag merge week (DuncanT)<br />
** https://review.openstack.org/#/q/project:openstack/cinder+comment:code_cleanup_batching+-status:merged,n,z<br />
<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers that currently have code / BP up, but bouncing all new ones<br />
** Review priorities<br />
*** Driver specs<br />
*** CG groups - a big change that requires driver changes, so needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, white space and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of the doc strings, full stops at the end of comments etc. don't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
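For illustration only, a minimal self-contained sketch of the difference, using the standard library logging module rather than Cinder's actual i18n/oslo helpers (the flubigar message and flu_id value are just the placeholders from the examples above): deferred interpolation leaves the raw format string visible to a lazy-translation wrapper, while eager interpolation hands the logger an already-expanded plain string.<br />
<nowiki><br />
import logging<br />
<br />
logging.basicConfig(level=logging.WARNING)<br />
LOG = logging.getLogger(__name__)<br />
flu_id = 42<br />
<br />
# Deferred interpolation (the "Ok" pattern): the format string and its argument<br />
# are passed separately, so expansion happens inside the logging call and a<br />
# lazy-translation wrapper could still swap in a localized format string first.<br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
<br />
# Eager interpolation (the "Not ok" pattern): the message is already a plain,<br />
# fully expanded string by the time the logger sees it, so there is nothing<br />
# left for lazy translation to act on.<br />
msg = "The flubigar id %d exploded messily" % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />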
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, and makes us inconsistent with OpenStack in general, but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, and performance (P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14, 15, 17, 18, 21-25, 27-Aug 1 ... other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** Are there comparisons between the two approaches? Is there a wiki/etherpad for documenting opinions, or should one be prepared?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in, so if we can cover this before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need a plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Back manager design improvement/rewriting for better rpc message handling.<br />
** Backup service for multi-pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns on the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a Gerrit-based process shortly, a la Nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs came in<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
** Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
''Meeting cancelled and summary discussion held on #openstack-cinder''<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
** https://review.openstack.org/#/c/73446/ (JGriffith)<br />
** https://review.openstack.org/#/c/80550/ (JBryant)<br />
** https://review.openstack.org/#/c/82100/ (Avishay)<br />
** https://review.openstack.org/#/c/74158/ (Avishay)<br />
** Plus a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meetings in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know who's topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''Aug 20th, 2014 16:00 UTC'''<br />
<br />
* FPF Tomorrow Aug 21'st (jgriffith)<br />
* Time to start thinking about the Summit and how to be more effective with our time there (jgriffith)<br />
* The idea of a maintenance/ish release (jgriffith)<br />
* Volume replication (ronenkat)<br />
** The patch - https://review.openstack.org/#/c/113054/11<br />
** Depends on https://review.openstack.org/#/c/115078/ to fix pylint error<br />
** Reference replication implementation for IBM Storwize as an example<br />
<br />
== Previous meetings ==<br />
'''Aug 13th, 2014 16:00 UTC'''<br />
NO MEETING TODAY, MID CYCLE MEETUP<br />
<br />
'''Aug 6th, 2014 16:00 UTC'''<br />
* Cinder mid cycle meetup next week August 12-14 (scottda)<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014 <br />
** HP site should be set. Ping scottda with any issues/problems/concerns<br />
** Virtual meetup will need to be taken care of<br />
* Volume replication (ronenkat)<br />
** Alternative approach based on jgriffith driver based replication: https://etherpad.openstack.org/p/juno-cinder-volume-replication-apparochs<br />
<br />
'''July 30'th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31'st, let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/105923/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
** 17:00 UTC - Volume replication driver owner overview and Q & A<br />
** Callin information: passcode: 6406941 call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
* NFS secure option -- default to 666 vs 660 vs force admin choice (bswartz)<br />
* It is code cleanup tag merge week (DuncanT)<br />
** https://review.openstack.org/#/q/project:openstack/cinder+comment:code_cleanup_batching+-status:merged,n,z<br />
<br />
'''July 23th, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers taht currently have code / BP up, but bouncing all new ones <br />
** Review priorities<br />
*** Driver specs<br />
*** CG groups - a big change that requires driver changes, so needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungelboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, white space and the like, and how it is a motivation killer. I'm inclinded to agree - the formatting of the doc strings, fullstops at the end of comments etc doesn't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can make this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, makes us inconsistent with Openstack in general but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the one week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one bit hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, enviably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]l<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Define Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]l<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we are trying check-in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cyce Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** comparisons between the two approaches? Any wiki/etherpad present or to be prepared for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Back manager design improvement/rewriting for better rpc message handling.<br />
** Back up service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns on the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss about it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
**Looks like theres some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
_Meeting cancelled and summary discussion held on #openstack-cinder_<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Junos opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th, if you want to run send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=59824CinderMeetings2014-08-06T13:17:52Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meetings in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know who's topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''Aug 6th, 2014 16:00 UTC'''<br />
* Cinder mid cycle meetup next week August 12-14 (scottda)<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014 <br />
** HP site should be set. Ping scottda with any issues/problems/concerns<br />
** Virtual meetup will need to be taken care of<br />
* Volume replication (ronenkat)<br />
** Alternative approach based on jgriffith driver based replication: https://etherpad.openstack.org/p/juno-cinder-volume-replication-apparochs<br />
<br />
== Previous meetings ==<br />
'''July 30'th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31'st, let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/105923/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
** 17:00 UTC - Volume replication driver owner overview and Q & A<br />
** Callin information: passcode: 6406941 call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
* NFS secure option -- default to 666 vs 660 vs force admin choice (bswartz)<br />
* It is code cleanup tag merge week (DuncanT)<br />
** https://review.openstack.org/#/q/project:openstack/cinder+comment:code_cleanup_batching+-status:merged,n,z<br />
<br />
'''July 23th, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers taht currently have code / BP up, but bouncing all new ones <br />
** Review priorities<br />
*** Driver specs<br />
*** CG groups - a big change that requires driver changes, so needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungelboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, white space and the like, and how it is a motivation killer. I'm inclinded to agree - the formatting of the doc strings, fullstops at the end of comments etc doesn't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can make this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, makes us inconsistent with Openstack in general but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the one week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one bit hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, enviably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]l<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Define Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]l<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we are trying check-in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cyce Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** comparisons between the two approaches? Any wiki/etherpad present or to be prepared for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Back manager design improvement/rewriting for better rpc message handling.<br />
** Back up service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns on the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss about it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs showed up<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
** Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
''Meeting cancelled and summary discussion held on #openstack-cinder''<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* A summary of the gate issues pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=59800CinderMeetings2014-08-06T06:34:52Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meetings in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know who's topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''Aug 6th, 2014 16:00 UTC'''<br />
* Cinder mid cycle meetup next week August 12-14 (scottda)<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014 <br />
** HP site should be set. Ping scottda with any issues/problems/concerns<br />
** Virtual meetup will need to be taken care of<br />
* Volume replication (ronenkat)<br />
** Alternative approach based on jgriffith's driver-based replication: https://etherpad.openstack.org/p/juno-cinder-volume-replication-apparochs<br />
<br />
== Previous meetings ==<br />
'''July 30'th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31'st, let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/105923/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
** 17:00 UTC - Volume replication driver owner overview and Q & A<br />
** Call-in information: passcode: 6406941; call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
* NFS secure option -- default to 666 vs 660 vs force admin choice (bswartz)<br />
* It is code cleanup tag merge week (DuncanT)<br />
** https://review.openstack.org/#/q/project:openstack/cinder+comment:code_cleanup_batching+-status:merged,n,z<br />
<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers that currently have code / a BP up, but bouncing all new ones<br />
** Review priorities<br />
*** Driver specs<br />
*** Consistency Groups (CGs) - a big change that requires driver changes, so it needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, white space and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of the doc strings, full stops at the end of comments etc. don't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call (a runnable sketch follows this agenda), i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, makes us inconsistent with Openstack in general but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
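<br />
A minimal, runnable sketch of the "Ok" pattern above (hedged: the _LW below is a stand-in stub for the warning-level translation marker mentioned in the oslo logging items, not the actual Cinder/oslo helper):<br />
<nowiki><br />
import logging<br />
<br />
LOG = logging.getLogger(__name__)<br />
<br />
def _LW(msg):<br />
    # Stand-in for the oslo warning-level lazy-translation marker.<br />
    return msg<br />
<br />
def check_flubigar(flu_id):<br />
    # Ok: the format string is handed over unexpanded, so interpolation<br />
    # (and any deferred translation) happens inside the logging call.<br />
    LOG.warning(_LW("The flubigar id %d exploded messily"), flu_id)<br />
</nowiki><br />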
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** Which way to go? There are two WIPs.<br />
** Comparison between the two approaches? Is there a wiki/etherpad, or should one be prepared, for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in, so if we can cover this before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi-pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs showed up<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
** Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
''Meeting cancelled and summary discussion held on #openstack-cinder''<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* A summary of the gate issues pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=59799CinderMeetings2014-08-06T06:34:14Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meetings in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know who's topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''Aug 6th, 2014 16:00 UTC'''<br />
* Cinder mid cycle meetup next week August 12-14 (scottda)<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014 <br />
** HP site should be set. Ping scottda with any issues/problems/concerns<br />
** Virtual meetup will need to be taken care of<br />
* Volume replication (ronenkat)<br />
** Alternative approach based on jgriffith's driver-based replication: https://etherpad.openstack.org/p/juno-cinder-volume-replication-apparochs<br />
<br />
== Previous meetings ==<br />
'''July 30'th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31'st, let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/105923/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
** 17:00 UTC - Volume replication driver owner overview and Q & A<br />
** Call-in information: passcode: 6406941; call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
* NFS secure option -- default to 666 vs 660 vs force admin choice (bswartz)<br />
* It is code cleanup tag merge week (DuncanT)<br />
** https://review.openstack.org/#/q/project:openstack/cinder+comment:code_cleanup_batching+-status:merged,n,z<br />
<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers that currently have code / a BP up, but bouncing all new ones<br />
** Review priorities<br />
*** Driver specs<br />
*** Consistency Groups (CGs) - a big change that requires driver changes, so it needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, white space and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of the doc strings, full stops at the end of comments etc. don't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, makes us inconsistent with Openstack in general but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** Which way to go? There are two WIPs.<br />
** Comparison between the two approaches? Is there a wiki/etherpad, or should one be prepared, for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in, so if we can cover this before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi-pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs showed up<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
** Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
''Meeting cancelled and summary discussion held on #openstack-cinder''<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* A summary of the gate issues pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=59795CinderMeetings2014-08-06T05:24:07Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meetings in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know who's topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''Aug 6th, 2014 16:00 UTC'''<br />
* Cinder mid cycle meetup next week August 12-14 (scottda)<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014 <br />
** HP site should be set. Ping scottda with any issues/problems/concerns<br />
** Virtual meetup will need to be taken care of<br />
* Volume replication (ronenkat)<br />
<br />
== Previous meetings ==<br />
'''July 30'th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31'st, let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/105923/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
** 17:00 UTC - Volume replication driver owner overview and Q & A<br />
** Call-in information: passcode: 6406941; call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
* NFS secure option -- default to 666 vs 660 vs force admin choice (bswartz)<br />
* It is code cleanup tag merge week (DuncanT)<br />
** https://review.openstack.org/#/q/project:openstack/cinder+comment:code_cleanup_batching+-status:merged,n,z<br />
<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers that currently have code / a BP up, but bouncing all new ones<br />
** Review priorities<br />
*** Driver specs<br />
*** Consistency Groups (CGs) - a big change that requires driver changes, so it needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, white space and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of the doc strings, full stops at the end of comments etc. don't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, makes us inconsistent with Openstack in general but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** comparisons between the two approaches? Any wiki/etherpad present or to be prepared for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
** Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
_Meeting cancelled and summary discussion held on #openstack-cinder_<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests (a minimal sketch follows after this list)... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
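A minimal, self-contained sketch (not part of the original agenda) of the Mock-based test style argued for in the item above; the helper, command, and class names are hypothetical, and the standalone mock package of that era is assumed (in current Python the same API lives in unittest.mock):<br />
<nowiki><br />
import unittest<br />
<br />
import mock<br />
<br />
<br />
def purge_snapshot(run_cmd):<br />
    """Toy stand-in for driver code that shells out to an external command."""<br />
    run_cmd('lvremove', '-f', 'snap-1')<br />
    return True<br />
<br />
<br />
class PurgeSnapshotTestCase(unittest.TestCase):<br />
    def test_purge_calls_lvremove(self):<br />
        # Inject a Mock in place of the real command runner, then assert on the call.<br />
        run_cmd = mock.Mock()<br />
        self.assertTrue(purge_snapshot(run_cmd))<br />
        run_cmd.assert_called_once_with('lvremove', '-f', 'snap-1')<br />
</nowiki><br />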
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=59411CinderMeetings2014-07-30T12:13:50Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meeting in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics, please include your IRC name so we know whose topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''July 30'th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31'st, let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/107205/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
** 17:00 UTC - Volume replication driver owner overview and Q & A<br />
** Call-in information: passcode: 6406941; call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
<br />
== Previous meetings ==<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers that currently have code / BP up, but bouncing all new ones<br />
** Review priorities<br />
*** Driver specs<br />
*** Consistency Groups (CGs) - a big change that requires driver changes, so needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, whitespace and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of the doc strings, full stops at the end of comments etc. doesn't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team and makes us inconsistent with OpenStack in general, but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the one week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** comparisons between the two approaches? Any wiki/etherpad present or to be prepared for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
** Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
_Meeting cancelled and summary discussion held on #openstack-cinder_<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=59409CinderMeetings2014-07-30T12:13:15Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meeting in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics, please include your IRC name so we know whose topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''July 30'th, 2014 16:00 UTC'''<br />
* Planning cinderclient tag for Thursday morning July 31'st, let's catch up on client changes and testing prior to that (jgriffith)<br />
* Breaking the inheritance between data and control path in Volume drivers https://review.openstack.org/#/c/107205/ (jgriffith)<br />
* Consistency groups https://review.openstack.org/#/c/104732/ (xyang)<br />
* Hitachi Block Storage cinder driver https://review.openstack.org/#/c/90379/ (saguchi)<br />
* Volume replication https://review.openstack.org/#/c/106718/ (ronenkat)<br />
* 17:00 UTC - Volume replication driver owner overview and Q & A<br />
* Call-in information: passcode: 6406941; call-in numbers: https://www.teleconference.att.com/servlet/glbAccess?process=1&accessCode=6406941&accessNumber=1809417783#C2<br />
<br />
== Previous meetings ==<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* J2 Milestone (DuncanT)<br />
** JGriffith favours a freeze exception for all drivers that currently have code / BP up, but bouncing all new ones<br />
** Review priorities<br />
*** Driver specs<br />
*** Consistency Groups (CGs) - a big change that requires driver changes, so needs lots of eyes and time for driver maintainers to do their thing too: https://review.openstack.org/#/c/104743/<br />
*** Pool scheduling - https://review.openstack.org/#/c/98715/<br />
*** Others?<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
* NFS Security, if there's time.<br />
** https://blueprints.launchpad.net/cinder/+spec/secure-nfs<br />
<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, whitespace and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of the doc strings, full stops at the end of comments etc. doesn't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team and makes us inconsistent with OpenStack in general, but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
== Previous meetings ==<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the one week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** comparisons between the two approaches? Any wiki/etherpad present or to be prepared for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
** Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
_Meeting cancelled and summary discussion held on #openstack-cinder_<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
== Previous meetings ==<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=58787CinderMeetings2014-07-23T15:29:37Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meeting in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics, please include your IRC name so we know whose topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting and arrive at the meeting promptly if placing items in agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress (ronenkat)<br />
** https://review.openstack.org/#/c/106718/2<br />
<br />
== Previous meetings ==<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, whitespace and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of the doc strings, full stops at the end of comments etc. doesn't actually improve the code much at all, and getting a -1 for it is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
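A translation-friendly variant, for illustration only (this is a sketch, and it assumes the log-marker helpers such as _LW mentioned elsewhere on this page can be imported from Cinder's i18n module):<br />
<nowiki><br />
# Sketch, not the agreed approach: the format string is left unexpanded so it can be translated lazily.<br />
from cinder.i18n import _LW  # assumed import path for the warning marker<br />
LOG.warning(_LW("The flubigar id %d exploded messily"), flu_id)<br />
</nowiki><br />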
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, makes us inconsistent with OpenStack in general but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
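For context, a rough sketch of what driver logging might look like once the markers land (hypothetical example; the import location is an assumption, and LOG and the arguments are placeholders):<br />
<nowiki><br />
from cinder.i18n import _LE, _LI, _LW  # assumed home of the marker functions<br />
LOG.info(_LI("Initialised backend %s"), backend_name)<br />
LOG.warning(_LW("Pool %s is running low on space"), pool_name)<br />
LOG.error(_LE("Failed to attach volume %s"), volume_id)<br />
LOG.debug("Raw driver reply: %s", reply)  # debug messages would no longer be marked for translation<br />
</nowiki><br />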
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** comparisons between the two approaches? Any wiki/etherpad present or to be prepared for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs showed up<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
**Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
_Meeting cancelled and summary discussion held on #openstack-cinder_<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th, if you want to run send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=58786CinderMeetings2014-07-23T15:29:04Z<p>Ronenkat: /* Next meeting */</p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we hold a weekly meeting in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know whose topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting, and arrive at the meeting promptly if you have placed items on the agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''July 23rd, 2014 16:00 UTC'''<br />
<br />
* Plug for weekly 3rd Party CI meeting (Mondays at 18:00 UTC [1 pm Central]) (jungleboyj)<br />
** I attended this week's meeting and gave a high level status. They are looking for more participation.<br />
* ProphetStor Cinder drivers (stevetan)<br />
** Get feedback on progress of our DPL driver and documentation required https://review.openstack.org/#/c/95829/<br />
** Get direction from community for our Federator SDS driver https://review.openstack.org/#/c/99616/<br />
* Volume replication - work in progress<br />
** https://review.openstack.org/#/c/106718/2<br />
<br />
== Previous meetings ==<br />
'''July 16th, 2014 16:00 UTC'''<br />
* Putting the fun back into cinder development.<br />
** There's been a mailing list thread recently about how nit-picky reviews are getting about typos, white space and the like, and how it is a motivation killer. I'm inclined to agree - the formatting of doc strings, full stops at the end of comments etc. don't actually improve the code much at all, and getting a -1 for them is a buzz kill of the highest order. Should we leave that sort of thing to the gate, and say that if there is no hacking check for it then it isn't important in general? (DuncanT)<br />
<br />
* How to proceed with cinder/openstack requirements? python-dbus for https://review.openstack.org/99013, see mailing list conclusion http://lists.openstack.org/pipermail/openstack-dev/2014-July/040182.html (flip214)<br />
* code churn, not sure where/when to start, fear of merge conflicts (flip214)<br />
<br />
* Hitachi Block Storage cinder driver (tsekiyama)<br />
** We want to get some feedback about how we can move this forward<br />
** Review: https://review.openstack.org/#/c/90379/<br />
<br />
* Log translations https://review.openstack.org/#/c/105665/ is still stuck - any thoughts? Options I can see: (DuncanT)<br />
** A better technical solution - should be possible where the message format is not expanded outside the logging call i.e.:<br />
Ok:<br />
<nowiki><br />
LOG.warning("The flubigar id %d exploded messily", flu_id)<br />
</nowiki><br />
Not ok:<br />
<nowiki><br />
msg = _("The flubigar id %d exploded messily") % flu_id<br />
LOG.warning(msg)<br />
</nowiki><br />
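A translation-friendly variant, for illustration only (this is a sketch, and it assumes the log-marker helpers such as _LW mentioned elsewhere on this page can be imported from Cinder's i18n module):<br />
<nowiki><br />
# Sketch, not the agreed approach: the format string is left unexpanded so it can be translated lazily.<br />
from cinder.i18n import _LW  # assumed import path for the warning marker<br />
LOG.warning(_LW("The flubigar id %d exploded messily"), flu_id)<br />
</nowiki><br />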
<br />
** We don't break up our message categories<br />
*** This makes life harder for the translation team, makes us inconsistent with OpenStack in general but keeps the code from descending into ugliness<br />
** Related discussion on enabling translation (jungleboyj):<br />
*** Have two patches awaiting approval: Explicit import of _() https://review.openstack.org/105315 and enable lazy translation: https://review.openstack.org/105561<br />
*** Need to get these merged so we are running with the changes.<br />
* 3rd Party CI (jungleboyj):<br />
** Clarification on when drivers are going to be removed.<br />
<br />
'''July 9, 2014 16:00 UTC'''<br />
<br />
* flip214 to jgriffith: Status of Separation of Connectors from Driver/Device Interface?<br />
* Quick check: Is everybody happy in principle with the text of https://wiki.openstack.org/wiki/CinderCodeCleanupPatches ? (DuncanT)<br />
<br />
<br />
'''July 2nd, 2014 16:00 UTC'''<br />
* Batching up mechanical code cleanup until the week after each milestone (DuncanT)<br />
** See https://review.openstack.org/#/c/102872/ for example and https://review.openstack.org/#/c/101847<br />
** Log translations and hacking fixes fall into this class<br />
** Means you only take one big hit per milestone for rebases<br />
** Does require some tracking so they don't get missed (and I will suck at said tracking, inevitably)<br />
* LVM: Support a volume-group on shared storage (mtanino)<br />
** Want to quickly discuss the driver benefit, driver comparison, performance(P8-P14): https://wiki.openstack.org/w/images/0/08/Cinder-Support_LVM_on_a_sharedLU.pdf<br />
** Review comments? https://review.openstack.org/#/c/92479/<br />
* Cinder Third Party CI Names (asselin)<br />
** Online discussion of this thread: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039103.html<br />
<br />
'''June 25th, 2014 16:00 UTC'''<br />
* Consistency groups [xyang]<br />
** Cinder spec review: https://review.openstack.org/#/c/96665/<br />
* CI status [xyang]<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue (asselin): direct access to review.openstack.org port 29418 required]<br />
* Pools implementation [navneet]<br />
** Comparison etherpad https://etherpad.openstack.org/p/cinder-pool-impl-comparison<br />
** Decision to select implementation<br />
* keystoneclient integration with cinderclient [hrybacki / ayoung]<br />
** Discuss integration and collaboration possibilities<br />
<br />
<br />
<br />
'''June 18th, 2014 16:00 UTC'''<br />
* It's review day !?! [jdg]<br />
* Mid cycle meetup plans/updates [jdg]<br />
** https://etherpad.openstack.org/p/CinderMidCycleMeetupAug2014<br />
* Separation of Connectors from Driver/Device Interface (status update) [jdg]<br />
* Updates on 3'rd party CI [jdg]<br />
* Things we need to decide upon (not today, but do your homework for next week)<br />
** Software Defined Storage layers/drivers<br />
** Pools implementation<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
* oslo logging discussion (jungleboyj)<br />
** Removing translation of debug messages<br />
** Adding _LE, _LI, _LW<br />
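For context, a rough sketch of what driver logging might look like once the markers land (hypothetical example; the import location is an assumption, and LOG and the arguments are placeholders):<br />
<nowiki><br />
from cinder.i18n import _LE, _LI, _LW  # assumed home of the marker functions<br />
LOG.info(_LI("Initialised backend %s"), backend_name)<br />
LOG.warning(_LW("Pool %s is running low on space"), pool_name)<br />
LOG.error(_LE("Failed to attach volume %s"), volume_id)<br />
LOG.debug("Raw driver reply: %s", reply)  # debug messages would no longer be marked for translation<br />
</nowiki><br />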
* 3rd party cinder ci (asselin)<br />
** Looking for volunteers to test out my fork of jaypipe's 3rd party ci setup which has support for nodepool & http proxies.<br />
** https://github.com/rasselin/os-ext-testing<br />
** https://github.com/rasselin/os-ext-testing-data<br />
** [http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg26258.html Third-Party CI Issue: direct access to review.openstack.org port 29418 required]<br />
* HDS HNAS Cinder drivers (sombrafam)<br />
** As we have been trying to get this checked in for quite a while, we want to get some feedback on the missing steps<br />
** First thread: https://review.openstack.org/#/c/74371/<br />
** Continuation: https://review.openstack.org/#/c/82505/<br />
** Current thread in discussion: https://review.openstack.org/#/c/84244/<br />
* Mid-cycle Sprint (scottda)<br />
** HP in Fort Collins, CO can host on site<br />
** The thought was 10-20 developers<br />
** Large room is available July 14,15,17,18, 21-25, 27-Aug 1 ... Other options exist<br />
* Backend Pools (navneet)<br />
** which way to go? There are two WIPs.<br />
** comparisons between the two approaches? Any wiki/etherpad present or to be prepared for documenting opinions?<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs showed up<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
**Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
_Meeting cancelled and summary discussion held on #openstack-cinder_<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th, if you want to run send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=55077CinderMeetings2014-06-06T07:15:26Z<p>Ronenkat: </p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we hold a weekly meeting in <code><nowiki>#openstack-meeting</nowiki></code>, on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know whose topic it is and how to get more info.<br />
<br />
== Next meeting ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting, and arrive at the meeting promptly if you have placed items on the agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''June 11th, 2014 16:00 UTC'''<br />
* Volume replication (ronenkat)<br />
** Blueprint and spec review/comments? https://review.openstack.org/#/c/98308<br />
<br />
== Previous meetings ==<br />
<br />
'''June 4th, 2014 16:00 UTC'''<br />
* Volume backup modification (navneet)<br />
** Blueprint and spec review/comments? https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend<br />
* Dynamic multi pool (navneet)<br />
** Review comments? https://review.openstack.org/#/c/85760/<br />
** Implementation approach comparison.<br />
* 3rd party ci (asselin)<br />
** I have a conflict with another meeting, but my WIP to add nodepool into jaypipe's 3rd party ci solution is available here: https://github.com/rasselin/os-ext-testing/tree/nodepool<br />
* oslo.db (jungleboyj)<br />
** Want to quickly discuss the review out there for this: https://review.openstack.org/#/c/77125/<br />
** Move to current oslo.db? Wait for library work?<br />
** Need to drop off the meeting about 40 minutes in so if we can cover before then it would be appreciated. :-)<br />
<br />
'''May 28th, 2014 16:00 UTC'''<br />
* 3rd Party CI (jungleboyj)<br />
** What tempest test cases to run?<br />
** iSCSI only? What about for FC only drivers then?<br />
** Progress on where to record results?<br />
* SSH host keys (jungleboyj)<br />
** https://launchpad.net/bugs/1320050 and https://bugs.launchpad.net/cinder/+bug/1320056<br />
** Need plan to get this addressed by all drivers using SSH. (New config options?)<br />
** Way to get this backported to Havana?<br />
* Dynamic multi-pools (navneet)<br />
** Status and WIP review (https://review.openstack.org/#/c/85760/)<br />
** Backend manager design improvement/rewriting for better RPC message handling.<br />
** Backup service for multi pools.<br />
* cinder-specs (jgriffith)<br />
** Specs repo is live<br />
** Process<br />
** Reviews<br />
<br />
'''May 21st, 2014 16:00 UTC'''<br />
* Consistency Groups (xyang)<br />
** A few people have concerns about the restriction of one volume type per CG. Should we allow one CG to have multiple volume types on the same backend? Let's discuss it.<br />
* Third-Party CI (jgriffith)<br />
** Who's started, who's planning to and how can we help support each other to get this going smoothly<br />
* Moving GlusterFS snapshot code into the NFS RemoteFs driver (mberlin)<br />
** The GlusterFS snapshot code using qcow2 snapshots is useful for all file based storage systems. I would volunteer to move the GlusterFS snapshot code into the general RemoteFs driver - making it easier to get [https://review.openstack.org/#/c/94186/ our driver] accepted ;-)<br />
** Eric Harney is fine with this and planned to do this for Juno anyway ([https://blueprints.launchpad.net/cinder/+spec/remotefs-snaps see his blueprint]). I've put it on the agenda to make sure others also agree with this approach.<br />
<br />
'''May 7th, 2014 16:00 UTC'''<br />
* Limit == 0 in API [https://review.openstack.org/#/c/86207/ patch review] - thingee<br />
<br />
'''April 16th, 2014 16:00 UTC'''<br />
* Release Status<br />
* Summit Session Updates<br />
* Next Stop ATL!!!<br />
* Cinder resource status - thingee<br />
<br />
'''April 9th, 2014 16:00 UTC'''<br />
(Agenda entered retrospectively)<br />
* Cinder Spec (jgriffith) <br />
** Just a heads up that cinder blueprints will move to a gerrit based process shortly, a la nova. Details and wiki entry to follow.<br />
* RC2 status (jgriffith) <br />
** Just after cutting RC2, a bunch of bugs showed up<br />
* Testing RC code (jgriffith)<br />
** Get on it, folks!<br />
**Looks like there are some serious, intermittent performance issues in the API somewhere...<br />
<br />
'''April 2, 2014 16:00 UTC'''<br />
_Meeting cancelled and summary discussion held on #openstack-cinder_<br />
<br />
* Release status and bugs<br />
* -2s left on reviews from before Juno opened - please check if they are still valid<br />
- https://review.openstack.org/#/c/73446/ (JGriffith)<br />
- https://review.openstack.org/#/c/80550/ (JBryant)<br />
- https://review.openstack.org/#/c/82100/ (Avishay)<br />
- https://review.openstack.org/#/c/74158/ (Avishay)<br />
- + a whole bunch of stable branch stuff<br />
<br />
'''Mar 26, 2014 16:00 UTC'''<br />
* RC1 updates (jgriffith)<br />
* Design Summit Sessions (jgriffith)<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/cinder.2014-03-26-16.00.log.html<br />
<br />
'''Mar 19, 2014 16:00 UTC'''<br />
* ProphetStor Driver Exception request for Icehouse (jgriffith)<br />
* Bug status/updates (jgriffith)<br />
* What we should be punting to Juno (aka immediate -2 in Gerrit) (jgriffith)<br />
* Continuous Integration for Cinder Certification (jungleboyj)<br />
<br />
'''Mar 12, 2014 16:00 UTC'''<br />
* Cancelled due to nothing on the agenda. Ad-hoc discussion on #openstack-cinder instead<br />
<br />
'''Mar 5, 2014 16:00 UTC'''<br />
* Volume replication - avishay<br />
* [https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage New LVM-based driver for shared storage] - mtanino<br />
* DRBD/drbdmanage driver for cinder - philr<br />
<br />
'''Feb 19, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* [https://review.openstack.org/#/c/73745 Milestone Consideration for Drivers] -thingee<br />
* [https://etherpad.openstack.org/p/cinder-hack-201402 Hack-a-thon details] -thingee<br />
* [https://review.openstack.org/#/c/66737/ scheduling for local storage] -DuncanT<br />
<br />
'''Feb 5, 2014 16:00 UTC'''<br />
* I3 Status check/updates<br />
* Cert test<br />
* Multiple pools per backend (bswartz)<br />
'''Jan 8, 2014 16:00 UTC'''<br />
* I2 is just around the corner, blueprint updates<br />
* Alternating meeting time proposal, results on feedback<br />
* Driver cert test, it's there... use it<br />
* Prioritizing patches and reviews<br />
'''December 18, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/copy-volume-to-image-task-flow] - Griffith<br />
* [https://blueprints.launchpad.net/cinder/+spec/admin-defined-capabilities Admin-defined capabilities] - Ollie<br />
* Why is type manage an extension? -Thingee<br />
<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26'th, if you want to run send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2014/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=46639DisasterRecovery2014-03-26T07:14:31Z<p>Ronenkat: /* Looking toward the Juno summit */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in face of failures, High Availability usually deals with individual components failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR, in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployment in various locations. In this context DR is a continued workload operations in an alternative deployment, the recovery target clouds.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant, or on a per-entity selection. In the most basic case this could be a single VM, but it can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution both for born-in-the-cloud applications and for legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be easily supported while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be a non-OpenStack cloud or even a non-cloud bare-metal environment.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
<br />
=== Looking toward the Juno summit ===<br />
* [http://www.youtube.com/watch?v=FlWtmTAsJTE Demonstrating Disaster Recovery for OpenStack using a backup and restore approach] [https://wiki.openstack.org/w/images/4/49/Openstack_disaster_recovery_-_openstack_meetup.pdf Slides]<br />
* [https://etherpad.openstack.org/p/juno-disaster-recovery-call-for-stakeholders Call for stakeholders and collaboration for the Juno development cycle]<br />
<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Icehouse Features ===<br />
<br />
* Cinder Volume replication <br />
** [https://review.openstack.org/#/c/64026/ Volume replication] - to be continued during the Juno development cycle<br />
** [https://review.openstack.org/#/c/64027/ Admin commands for managing volume replication] - to be continued during the Juno development cycle<br />
** [https://review.openstack.org/#/c/70792/ Replication enablement for IBM Storwize/SVC] - to be continued during the Juno development cycle<br />
<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-volume-replication-feedback Comments, compatibility and gap for volume replication]<br />
<br />
* Enable exporting/importing Cinder backups between OpenStack deployments (a usage sketch follows this list)<br />
<br />
** [https://review.openstack.org/#/c/69351 Export and import backup service metadata]<br />
** [https://review.openstack.org/#/c/72743 Client support for export and import backup service metadata]<br />
<br />
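The sketch below illustrates how the export/import enablement above might be driven from a script once merged. It is illustrative only: the client construction, credential placeholders and variable names (<code>cinder_primary</code>, <code>cinder_target</code>, <code>VOLUME_ID</code>) are assumptions, not part of the reviewed changes.<br />
<pre>
# Illustrative sketch (assumes python-cinderclient with the backup export/import
# support listed above; credentials and IDs are placeholders).
from cinderclient import client

cinder_primary = client.Client('1', USER, PASSWORD, TENANT, PRIMARY_AUTH_URL)
cinder_target = client.Client('1', USER, PASSWORD, TENANT, TARGET_AUTH_URL)

# 1. Back up the protected volume on the primary cloud (the backup data itself
#    is stored in the configured backup repository, e.g. Swift).
backup = cinder_primary.backups.create(VOLUME_ID, name='dr-backup')

# 2. Export the backup service metadata - a small record describing the backup.
record = cinder_primary.backups.export_record(backup.id)

# 3. Import the record on the target cloud, after which the backup can be
#    restored there from the shared/replicated backup repository.
cinder_target.backups.import_record(record['backup_service'],
                                    record['backup_url'])
</pre>
<br />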
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service]]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=File:Openstack_disaster_recovery_-_openstack_meetup.pdf&diff=46637File:Openstack disaster recovery - openstack meetup.pdf2014-03-26T07:10:42Z<p>Ronenkat: </p>
<hr />
<div></div>Ronenkathttps://wiki.openstack.org/w/index.php?title=Cinder/driver-maintainers&diff=45071Cinder/driver-maintainers2014-03-09T14:52:23Z<p>Ronenkat: </p>
<hr />
<div>The following is a list of driver items in the Cinder tree, along with maintainer and contact info for those that are listed as managing or owning the driver.<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Vendor/Driver !! Maintainers Name !! Maintainers Email !! Maintainers IRC Nick !! Additional contacts<br />
|-<br />
| SolidFireDriver || John Griffith || john.griffith@solidfire.com || jgriffith || <br />
|-<br />
| LVM || John Griffith || john.griffith@solidfire.com || Eric Harney, Jon Bernard || <br />
|-<br />
| IBM Drivers (XIV/DS8K, Storwize, NAS, GPFS) || || openstack_storage_drivers@il.ibm.com || || <br />
|-<br />
| NetApp || Ben Swartzlander || || bswartz || <br />
|-<br />
| Huawei || || || || <br />
|-<br />
| Hitachi Data Systems || || || || <br />
|-<br />
| Gluster || Eric Harney || || eharney || <br />
|-<br />
| CEPH || Josh Durgin || || jdurgin || Mike Perez, Edward Hope-Morley<br />
|-<br />
| HP-3PAR || Walter Boring, Kurt Martin, Jim Branen, Ramy Asselin || walter.boring@hp.com || hemna, kmartin, branen, asselin ||<br />
|-<br />
| HP-LeftHand || Kurt Martin, Walt Boring, Jim Branen, Ramy Asselin || kurt.f.martin@hp.com || kmartin, hemna, branen, asselin||<br />
|-<br />
| EMC || Xing Yang || xing.yang@emc.com || xyang ||<br />
|-<br />
| VMware VC/ESX || Subramanian Neelakantan || sneelakantan@vmware.com || sneelakantan || Kartik Bommepally<br />
|-<br />
|}</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=Cinder/driver-maintainers&diff=45070Cinder/driver-maintainers2014-03-09T14:49:16Z<p>Ronenkat: </p>
<hr />
<div>The following is a list of driver items in the Cinder tree, along with maintainer and contact info for those that are listed as managing or owning the driver.<br />
<br />
{| class="wikitable sortable"<br />
|-<br />
! Vendor/Driver !! Maintainers Name !! Maintainers Email !! Maintainers IRC Nick !! Additional contacts<br />
|-<br />
| SolidFireDriver || John Griffith || john.griffith@solidfire.com || jgriffith || <br />
|-<br />
| LVM || John Griffith || john.griffith@solidfire.com || Eric Harney, Jon Bernard || <br />
|-<br />
| IBM Drivers (XIV, Storwize, NAS, GPFS) || || openstack_storage_drivers@il.ibm.com || || <br />
|-<br />
| NetApp || Ben Swartzlander || || bswartz || <br />
|-<br />
| Huawei || || || || <br />
|-<br />
| Hitachi Data Systems || || || || <br />
|-<br />
| Gluster || Eric Harney || || eharney || <br />
|-<br />
| CEPH || Josh Durgin || || jdurgin || Mike Perez, Edward Hope-Morley<br />
|-<br />
| HP-3PAR || Walter Boring, Kurt Martin, Jim Branen, Ramy Asselin || walter.boring@hp.com || hemna, kmartin, branen, asselin ||<br />
|-<br />
| HP-LeftHand || Kurt Martin, Walt Boring, Jim Branen, Ramy Asselin || kurt.f.martin@hp.com || kmartin, hemna, branen, asselin||<br />
|-<br />
| EMC || Xing Yang || xing.yang@emc.com || xyang ||<br />
|-<br />
| VMware VC/ESX || Subramanian Neelakantan || sneelakantan@vmware.com || sneelakantan || Kartik Bommepally<br />
|-<br />
|}</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=44071DisasterRecovery2014-03-04T11:29:35Z<p>Ronenkat: /* Looking toward the Juno summit */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large-scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR; in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment, the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant, or on a per-entity selection. In its most basic case, it could be a single VM, but it can also be all the entities associated with a user.<br />
<br />
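As an illustration only (not part of the design), a per-tenant selection could start by enumerating the tenant's instances and volumes through the existing client libraries; the snippet below assumes pre-configured python-novaclient and python-cinderclient credentials, which are placeholders.<br />
<pre>
# Illustrative sketch: build a candidate "protection set" for one tenant by
# listing its servers, their images and its volumes (credentials are placeholders).
from novaclient import client as nova_client
from cinderclient import client as cinder_client

nova = nova_client.Client('2', USER, PASSWORD, TENANT, AUTH_URL)
cinder = cinder_client.Client('1', USER, PASSWORD, TENANT, AUTH_URL)

servers = nova.servers.list()
protection_set = {
    'servers': [s.id for s in servers],
    'images': [s.image['id'] for s in servers if s.image],
    'volumes': [v.id for v in cinder.volumes.list()],
}
# The set (plus network definitions and other metadata) would then be handed to
# the DR mediator for replication to the target cloud.
</pre>
<br />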
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
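As a hedged illustration of that last point, a workload and its resources could be captured in a small Heat (HOT) template and instantiated on either cloud via python-heatclient; the endpoint, token, image and flavor values below are placeholders, not a prescribed format.<br />
<pre>
# Illustration only: representing a protected workload as a Heat stack so the
# same definition can be re-created on the recovery cloud.
from heatclient import client as heat_client

WORKLOAD_TEMPLATE = '''
heat_template_version: 2013-05-23
resources:
  app_server:
    type: OS::Nova::Server
    properties:
      image: protected-app-image
      flavor: m1.small
  app_volume:
    type: OS::Cinder::Volume
    properties:
      size: 10
'''

heat = heat_client.Client('1', endpoint=HEAT_ENDPOINT, token=AUTH_TOKEN)
heat.stacks.create(stack_name='protected-workload', template=WORKLOAD_TEMPLATE)
</pre>
<br />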
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with Horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
<br />
=== Looking toward the Juno summit ===<br />
* [http://www.youtube.com/watch?v=FlWtmTAsJTE Demonstrating Disaster Recovery for OpenStack using the backup and restore approach]<br />
* [https://etherpad.openstack.org/p/juno-disaster-recovery-call-for-stakeholders Call for stakeholders and collaboration for the Juno development cycle]<br />
<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Icehouse Features ===<br />
<br />
* Cinder Volume replication <br />
** [https://review.openstack.org/#/c/64026/ Volume replication] - to be continued during Juno development cycle<br />
** [https://review.openstack.org/#/c/64027/ admin commands for managing volume replication] - to be continued during Juno development cycle<br />
** [https://review.openstack.org/#/c/70792/ Replication enablement for IBM Storwize/SVC] - to be continued during Juno development cycle<br />
<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-volume-replication-feedback Comments, compatibility and gap for volume replication]<br />
<br />
* Enable exporting/importing Cinder backups between OpenStack deployments<br />
<br />
** [https://review.openstack.org/#/c/69351 Export and import backup service metadata]<br />
** [https://review.openstack.org/#/c/72743 Client support for export and import backup service metadata]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service]]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=44070DisasterRecovery2014-03-04T11:28:38Z<p>Ronenkat: /* Icehouse Features */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large-scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR; in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment, the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant, or on a per-entity selection. In its most basic case, it could be a single VM, but it can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with Horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
<br />
=== Looking toward the Juno summit ===<br />
* [https://etherpad.openstack.org/p/juno-disaster-recovery-call-for-stakeholders Call for stakeholders and collaboration for the Juno development cycle]<br />
<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Icehouse Features ===<br />
<br />
* Cinder Volume replication <br />
** [https://review.openstack.org/#/c/64026/ Volume replication] - to be continued during Juno development cycle<br />
** [https://review.openstack.org/#/c/64027/ admin commands for managing volume replication] - to be continued during Juno development cycle<br />
** [https://review.openstack.org/#/c/70792/ Replication enablement for IBM Storwize/SVC] - to be continued during Juno development cycle<br />
<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-volume-replication-feedback Comments, compatibility and gap for volume replication]<br />
<br />
* Enable exporting/importing Cinder backups between OpenStack deployments<br />
<br />
** [https://review.openstack.org/#/c/69351 Export and import backup service metadata]<br />
** [https://review.openstack.org/#/c/72743 Client support for export and import backup service metadata]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service]]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=44056DisasterRecovery2014-03-04T10:05:01Z<p>Ronenkat: /* Activities */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large-scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR; in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment, the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant, or on a per-entity selection. In its most basic case, it could be a single VM, but it can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with Horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
<br />
=== Looking toward the Juno summit ===<br />
* [https://etherpad.openstack.org/p/juno-disaster-recovery-call-for-stakeholders Call for stakeholders and collaboration for the Juno development cycle]<br />
<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Icehouse Features ===<br />
<br />
* Cinder Volume replication <br />
** [https://review.openstack.org/#/c/64026/ Volume replication]<br />
** [https://review.openstack.org/#/c/64027/ admin commands for managing volume replication]<br />
** [https://review.openstack.org/#/c/70792/ Replication enablement for IBM Storwize/SVC]<br />
<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-volume-replication-feedback Comments, compatibility and gap for volume replication]<br />
<br />
* Enable exporting/importing Cinder backups between OpenStack deployments<br />
<br />
** [https://review.openstack.org/#/c/69351 Export and import backup service metadata]<br />
** [https://review.openstack.org/#/c/72743 Client support for export and import backup service metadata]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service]]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=42026DisasterRecovery2014-02-12T16:05:05Z<p>Ronenkat: /* Icehouse Features */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large-scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR; in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment, the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant, or on a per-entity selection. In its most basic case, it could be a single VM, but it can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with Horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Icehouse Features ===<br />
<br />
* Cinder Volume replication <br />
** [https://review.openstack.org/#/c/64026/ Volume replication]<br />
** [https://review.openstack.org/#/c/64027/ admin commands for managing volume replication]<br />
** [https://review.openstack.org/#/c/70792/ Replication enablement for IBM Storwize/SVC]<br />
<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-volume-replication-feedback Comments, compatibility and gap for volume replication]<br />
<br />
* Enable exporting/importing Cinder backups between OpenStack deployments<br />
<br />
** [https://review.openstack.org/#/c/69351 Export and import backup service metadata]<br />
** [https://review.openstack.org/#/c/72743 Client support for export and import backup service metadata]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service]]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=42025DisasterRecovery2014-02-12T16:03:07Z<p>Ronenkat: /* Icehouse Features */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large-scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR; in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment, the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant, or on a per-entity selection. In its most basic case, it could be a single VM, but it can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication parameters (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** Backup to Swift (see the CLI sketch after this list)<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one-to-many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage-system-based replication<br />
** Hypervisor-assisted replication (possibly between heterogeneous storage systems), for example using DRBD or QEMU-based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with Horizon for basic DR orchestration<br />
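<br />
Of the policy options above, backup to Swift is the one available in Cinder today; a minimal sketch of using it as a (relatively high-RPO) DR mechanism, assuming the Cinder backup service is configured with the Swift backup driver and using placeholder IDs:<br />
<pre>
# Primary cloud: back the data volume up to (geographically replicated) Swift
cinder backup-create --display-name web-data-dr DATA_VOLUME_ID

# Record the backup ID for the DR runbook and wait for it to become "available"
cinder backup-list

# Recovery site, after fail-over: restore the backup into a new volume
cinder backup-restore BACKUP_ID
</pre>
Restoring on a different cloud also requires the backup records to be available there, which is what the backup import/export work is intended to address.<br />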
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Icehouse Features ===<br />
<br />
* Cinder Volume replication (see the volume-type sketch below)<br />
** [https://review.openstack.org/#/c/64026/ Volume replication]<br />
** [https://review.openstack.org/#/c/64027/ admin commands for managing volume replication]<br />
** [https://review.openstack.org/#/c/70792/ Replication enablement for IBM Storwize/SVC]<br />
<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-volume-replication-feedback Comments, compatibility and gap for volume replication]<br />
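<br />
With the replication patches above, a tenant would ask for replicated volumes through a volume type; a rough sketch of what that could look like from the CLI follows (the exact extra-spec key is defined by the patches under review, so the one used here is purely illustrative):<br />
<pre>
# Admin: define a volume type that requests a replication-capable backend
# (the extra-spec key 'replication_enabled' is illustrative, not final)
cinder type-create replicated
cinder type-key replicated set replication_enabled='<is> True'

# Tenant: volumes created with this type are replicated by the driver
cinder create --volume-type replicated --display-name protected-data 10
</pre>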
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [https://wiki.openstack.org/wiki/Resource-reservation-service Resource-reservation-service]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=42024DisasterRecovery2014-02-12T16:01:31Z<p>Ronenkat: /* Related sessions in Icehouse summit */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what applications and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting them'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following, or in advance of, a large-scale disaster that disrupts the current environment or infrastructure. By large-scale disaster we mean disasters that can lead to the complete loss of a data center, such as floods, tornadoes, hurricanes, or fires. To provide DR, we need a geographically distant site that will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large-scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR; in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment, the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant, or on a per-entity basis. In its most basic case, it could be a single VM, but it could also be all the entities associated with a user.<br />
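<br />
For example, scoping the standard CLI clients to the tenant that owns the workload gives a rough inventory of the candidate entities (a sketch only; a real implementation would record IDs and relationships):<br />
<pre>
# Scope the clients to the tenant whose workload is to be protected
export OS_TENANT_NAME=workload-tenant   # placeholder tenant

nova list              # VMs
nova image-list        # images visible to the tenant
cinder list            # volumes
cinder snapshot-list   # volume snapshots
neutron net-list       # tenant networks
</pre>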
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution both for born-in-the-cloud applications and for legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the APIs and features OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (storage replication). Some functionality, like DR orchestration, may leverage Heat, be a new project, or even fall outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task in which different applications and use-cases have different requirements; some use-cases can be supported easily while others are more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* DR is between a primary cloud and a target cloud that are independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may also be relevant to enabling high availability between regions, availability zones or cells, which do share some OpenStack services.<br />
* Ideally (though not as an immediate step) one of the clouds (primary or target) could be a non-OpenStack or even a non-cloud bare-metal environment.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication parameters (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one-to-many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage-system-based replication<br />
** Hypervisor-assisted replication (possibly between heterogeneous storage systems), for example using DRBD or QEMU-based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with Horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Icehouse Features ===<br />
<br />
* Cinder Volume replication <br />
** [https://review.openstack.org/#/c/64026/ Volume replication]<br />
** [https://review.openstack.org/#/c/64027/ admin commands for managing volume replication]<br />
** [https://review.openstack.org/#/c/70792/ Replication enablement for IBM Storwize/SVC]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [https://wiki.openstack.org/wiki/Resource-reservation-service Resource-reservation-service]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=CinderMeetings&diff=38543CinderMeetings2013-12-17T13:57:38Z<p>Ronenkat: </p>
<hr />
<div><br />
= Weekly Cinder team meeting =<br />
'''NOTE MEETING TIME: Wed's at 16:00 UTC'''<br />
<br />
If you're interested in Cinder or Block Storage in general for OpenStack, we have a weekly meeting in <code><nowiki>#openstack-meeting</nowiki></code> on Wednesdays at 16:00 UTC. Please feel free to add items to the agenda below. NOTE: When adding topics please include your IRC name so we know whose topic it is and how to get more info.<br />
<br />
== Next meetings ==<br />
'''NOTE:''' ''Include your IRC nickname next to agenda items so that you can be called upon in the meeting, and arrive at the meeting promptly if you place items on the agenda. You might want to put this on your calendar if you are adding items.''<br />
<br />
'''December 17, 2013 16:00 UTC'''<br />
* Blueprint discussion [https://blueprints.launchpad.net/cinder/+spec/cinder-backup-recover-api cinder backup recovery api - import/export backups] - avishay<br />
<br />
== Previous meetings ==<br />
'''December 11, 2013 16:00 UTC'''<br />
* Proposal of [https://etherpad.openstack.org/p/cinder-extensions extension packages] -Thingee<br />
<br />
'''December 4, 2013 16:00 UTC'''<br />
* Progressing with [https://wiki.openstack.org/wiki/Cinder/blueprints/multi-attach-volume multi-attach / shared-volume] - sgordon<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-acls-for-volumes Access Control List design discussion] - alatynskaya<br />
<br />
'''November 27, 2013 16:00 UTC'''<br />
* [https://etherpad.openstack.org/p/icehouse-cinder-continuous-volume-replication-v2 Updated volume mirroring design] - avishay<br />
* Start using only Mock for new tests... [http://lists.openstack.org/pipermail/openstack-dev/2013-November/018501.html Related Nova Discussion] - Thingee<br />
* Rate limiting came up in the summit, and [http://lists.openstack.org/pipermail/openstack-dev/2013-November/020291.html on openstack-dev] - avishay<br />
* Metadata backup (https://review.openstack.org/#/c/51900/) progress RFC - dosaboy<br />
<br />
<br />
'''November 20, 2013 16:00 UTC'''<br />
* I-1 scheduling - JGriffith<br />
<br />
<br />
'''November 13, 2013 16:00 UTC'''<br />
* patches should update doc files where necessary to ease writing of release notes (Avishay?)<br />
* fencing host from storage (Ehud Trainin)<br />
* Summarize priority of tasks from summit discussions (https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Cinder and https://etherpad.openstack.org/p/cinder-icehouse-summary) Griff<br />
<br />
<br />
'''October 30, 2013 16:00 UTC'''<br />
* cinder backup metadata support - http://goo.gl/Jkg2FV (dosaboy)<br />
* fencing and unfencing host from storage - https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing (Ehud Trainin)<br />
<br />
'''October 23, 2013 16:00 UTC'''<br />
* Nexenta backup driver https://review.openstack.org/#/c/47005/ - DuncanT<br />
<br />
<br />
'''October 2, 2013 16:00 UTC'''<br />
* What's still broken in Havana<br />
:* Backups and multibackend (https://code.launchpad.net/bugs/1228223): Fix committed<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066): Fix committed<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here): '''???'''<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896): '''Still open'''<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469): Probable fix committed<br />
:* Summary of gate issue pertaining to Cinder can be viewed here: http://paste.openstack.org/show/47798/<br />
:* Moving to taskflow - avishay<br />
<br />
<br />
'''Sept 25, 2013, 16:00 UTC'''<br />
* PTL nomination process is open until the 26th; if you want to run, send your nomination proposal out to the dev ML<br />
* What's broken in Havana<br />
:* Backups (specifically when configured with multi-backend volumes)<br />
:* Configuration - Global CONF settings in brick don't belong, and a number of them break multi-backend (Bug #1230066)<br />
:* TaskFlow retry mechanism - The majority felt this should be left as a white list, but no work has been done to fix it so we still have ugly failures/roll-backs (3 bugs logged here)<br />
:* Quotas - Don't know that anybody has gotten to the bottom of the quota syncing issue (Bug #1202896)<br />
:* iSCSI Target creation failures - This was thought to have been fixed but showed up last night (Bug #1223469)<br />
:* ????<br />
* Cinderclient release plans/status? (Eharney)<br />
* OSLO imports (DuncanT)<br />
* bp/cinder-backup-improvements (dosaboy)<br />
* bp/multi-attach (zhiyan)<br />
<br />
<br />
<br />
'''Aug 21, 2013, 16:00 UTC'''<br />
# No agenda, no meeting.<br />
<br />
'''Aug 14, 2013, 16:00 UTC'''<br />
# Volume migration status - avishay<br />
# API extensions using metadata. This comes from the [https://review.openstack.org/#/c/38322/ readonly volume attach support]. - thingee<br />
[http://eavesdrop.openstack.org/meetings/cinder/2013/cinder.2013-08-14-16.00.log.html IRC Log]<br />
<br />
'''Aug 7, 2013, 16:00 UTC'''<br />
# [https://bugs.launchpad.net/cinder/+bug/1209199 RFC - make all rbd clones copy-on-write] -- Dosaboy<br />
# V1 API removal issues, plans and timescales - DuncanT<br />
<br />
== Meeting Minutes ==<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2013/<br />
<br />
http://eavesdrop.openstack.org/meetings/cinder/2012/</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=34107DisasterRecovery2013-10-27T18:46:45Z<p>Ronenkat: /* Related projects and topics */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in face of failures, High Availability usually deals with individual components failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR, in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployment in various locations. In this context DR is a continued workload operations in an alternative deployment, the recovery target clouds.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc, to be recover can be based on a tenant, or a per entity mechanism. In its most basic case, it could be a single VM, but can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be easily supported while others may be more complex, this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Plugable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload / Stack abandon and adopt (Icehouse session proposal - merged into one session) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources] [http://summit.openstack.org/cfp/details/200 Stack abandon and adopt]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=34106DisasterRecovery2013-10-27T18:44:23Z<p>Ronenkat: /* Related sessions in Icehouse summit */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in face of failures, High Availability usually deals with individual components failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR, in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployment in various locations. In this context DR is a continued workload operations in an alternative deployment, the recovery target clouds.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc, to be recover can be based on a tenant, or a per entity mechanism. In its most basic case, it could be a single VM, but can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be easily supported while others may be more complex, this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Plugable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33390DisasterRecovery2013-10-21T17:59:54Z<p>Ronenkat: /* Disaster Recovery for OpenStack */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
The goal of this work is to create a framework that will enable protecting '''applications and services''' (VMs, images, volumes, etc) from disaster.<br />
Determining and selecting ''what application and services to protect'' is the responsibility of the user, while ''handling the logistics of protecting'' is up to the cloud (and its operator).<br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in face of failures, High Availability usually deals with individual components failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR, in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployment in various locations. In this context DR is a continued workload operations in an alternative deployment, the recovery target clouds.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc, to be recover can be based on a tenant, or a per entity mechanism. In its most basic case, it could be a single VM, but can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be easily supported while others may be more complex, this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Plugable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33387DisasterRecovery2013-10-21T17:52:07Z<p>Ronenkat: /* Scope and Scenarios */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in face of failures, High Availability usually deals with individual components failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR, in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployment in various locations. In this context DR is a continued workload operations in an alternative deployment, the recovery target clouds.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc, to be recover can be based on a tenant, or a per entity mechanism. In its most basic case, it could be a single VM, but can also be all the entities associated with a user.<br />
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (Storage replication). Some functionality, like DR orchestration may leverage Heat, or be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task where different applications and use-cases have different requirements; some use-cases can be easily supported while others may be more complex, this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc)<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives (a simple policy-selection sketch follows this list)<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
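<br />
The policy-selection sketch below is purely illustrative: it shows one way DR tooling might map a requested RPO onto the policy options above; the thresholds and policy names are made up.<br />
<pre>
# Illustrative only: map a Recovery Point Objective onto one of the DR policy options.
def select_dr_policy(rpo_seconds):
    """Pick a replication mode for a requested RPO (thresholds are made up)."""
    if rpo_seconds == 0:
        return 'active-active'        # requires application awareness and support
    if rpo_seconds < 60:
        return 'active-hot-standby'   # synchronous replication
    if rpo_seconds < 3600:
        return 'active-cold-standby'  # asynchronous replication
    return 'backup-to-swift'          # periodic backup/restore

assert select_dr_policy(300) == 'active-cold-standby'
</pre>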
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33386DisasterRecovery2013-10-21T17:51:51Z<p>Ronenkat: /* Scope and Scenarios */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR - but in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment: the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster '''applications and services''' (a set of OpenStack entities) also referred to as a hosted workload. <br />
In this context the cloud is the equivalent of the physical hardware, and the recovery process focuses on the application and services, including their data, which are running in the cloud.<br />
The mechanism to determine the exact set of VMs, VM images, volumes, etc., to be recovered can be based on a tenant or on a per-entity basis. In its most basic case, this could be a single VM, but it can also be all the entities associated with a user.<br />
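<br />
One hedged way to realize the "mark and protect" idea is to tag a tenant's Nova servers and Cinder volumes with a DR metadata key that the recovery tooling can later query. In the sketch below, the "dr_protect" key, credentials and endpoint are illustrative assumptions rather than existing OpenStack conventions.<br />
<pre>
# Minimal sketch (illustrative): tag every entity owned by the tenant as DR-protected.
from novaclient import client as nova_client
from cinderclient import client as cinder_client

AUTH = ('demo', 'secret', 'demo-project', 'http://controller:5000/v2.0')
nova = nova_client.Client('2', *AUTH)
cinder = cinder_client.Client('2', *AUTH)

def protect_tenant_workload():
    """Mark all of the tenant's servers and volumes so DR tooling can find them."""
    for server in nova.servers.list():
        nova.servers.set_meta(server, {'dr_protect': 'true'})
    for volume in cinder.volumes.list():
        cinder.volumes.set_metadata(volume, {'dr_protect': 'true'})
</pre>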
<br />
A separate recovery mechanism, outside the scope of this work, should address making the primary cloud available to run workloads following a disaster.<br />
The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (storage replication). Some functionality, like DR orchestration, may leverage Heat, be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task in which different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata (a snapshot-style sketch follows this list).<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
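<br />
As a hedged illustration of the point-in-time option, the sketch below dumps the management-stack view of the tenant's servers and volumes to a JSON document that a mediator could ship to the recovery site. The selected fields, credentials and file name are illustrative, not a defined interchange format.<br />
<pre>
# Minimal sketch (illustrative): snapshot workload metadata for later replay elsewhere.
import json

from novaclient import client as nova_client
from cinderclient import client as cinder_client

AUTH = ('demo', 'secret', 'demo-project', 'http://controller:5000/v2.0')
nova = nova_client.Client('2', *AUTH)
cinder = cinder_client.Client('2', *AUTH)

def snapshot_metadata(path='dr-metadata.json'):
    """Capture enough metadata to re-provision the workload skeleton at the target."""
    snapshot = {
        'servers': [{'name': s.name, 'flavor': s.flavor, 'image': s.image,
                     'networks': s.networks} for s in nova.servers.list()],
        'volumes': [{'name': v.name, 'size': v.size,
                     'attachments': v.attachments} for v in cinder.volumes.list()],
    }
    with open(path, 'w') as f:
        json.dump(snapshot, f, indent=2)
    return snapshot
</pre>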
<br />
We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads and to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) (IBM)<br />
* Ayal Baron (abaron) (Red Hat)<br />
* Sean Cohen (scohen) (Red Hat)<br />
* Alex Glikson (glikson) (IBM)<br />
* Avishay Traeger (avishay-il) (IBM)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33255DisasterRecovery2013-10-20T09:47:52Z<p>Ronenkat: /* Contacts and (current) team */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR - but in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment: the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster applications and services (a set of OpenStack entities), also referred to as a hosted workload. In this context the cloud is the equivalent of the physical hardware. The target of the disaster recovery is not to recover the hardware, but the applications, services and their data.<br />
A separate recovery mechanism should address making the primary cloud available to run workloads following a disaster. The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (storage replication). Some functionality, like DR orchestration, may leverage Heat, be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task in which different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads and to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat)<br />
* Ayal Baron (abaron)<br />
* Sean Cohen (scohen)<br />
* Alex Glikson (glikson)<br />
* Avishay Traeger (avishay-il)</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33254DisasterRecovery2013-10-20T08:48:58Z<p>Ronenkat: /* Examples */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR - but in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment: the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster applications and services (a set of OpenStack entities), also referred to as a hosted workload. In this context the cloud is the equivalent of the physical hardware. The target of the disaster recovery is not to recover the hardware, but the applications, services and their data.<br />
A separate recovery mechanism should address making the primary cloud available to run workloads following a disaster. The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[file:DR.png]]<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (storage replication). Some functionality, like DR orchestration, may leverage Heat, be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task in which different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads and to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) - ronenkat@il.ibm.com<br />
* Ayal Baron (abaron) - abaron@redhat.com<br />
* Sean Cohen (scohen) - scohen@redhat.com <br />
* Alex Glikson (glikson) - glikson@il.ibm.com<br />
* Avishay Traeger (avishay-il) - avishay@il.ibm.com</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33253DisasterRecovery2013-10-20T08:48:01Z<p>Ronenkat: /* Examples */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR - but in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment: the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster applications and services (a set of OpenStack entities), also referred to as a hosted workload. In this context the cloud is the equivalent of the physical hardware. The target of the disaster recovery is not to recover the hardware, but the applications, services and their data.<br />
A separate recovery mechanism should address making the primary cloud available to run workloads following a disaster. The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[DR.png|framed]]<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (storage replication). Some functionality, like DR orchestration, may leverage Heat, be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task in which different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads and to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) - ronenkat@il.ibm.com<br />
* Ayal Baron (abaron) - abaron@redhat.com<br />
* Sean Cohen (scohen) - scohen@redhat.com <br />
* Alex Glikson (glikson) - glikson@il.ibm.com<br />
* Avishay Traeger (avishay-il) - avishay@il.ibm.com</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=File:DR.png&diff=33252File:DR.png2013-10-20T08:47:10Z<p>Ronenkat: </p>
<hr />
<div></div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33251DisasterRecovery2013-10-20T08:46:36Z<p>Ronenkat: /* Examples */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR - but in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment: the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster applications and services (a set of OpenStack entities), also referred to as a hosted workload. In this context the cloud is the equivalent of the physical hardware. The target of the disaster recovery is not to recover the hardware, but the applications, services and their data.<br />
A separate recovery mechanism should address making the primary cloud available to run workloads following a disaster. The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
[[File:C:\Users\ronenkat\Desktop\DR.png|thumbnail]]<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (storage replication). Some functionality, like DR orchestration, may leverage Heat, be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task in which different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads and to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication params (sync/async, bandwidth, etc.)<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to elect DR region or availability zone per application<br />
* Ability to create one to many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage systems based replication<br />
** Hypervisor assisted (possibly between heterogeneous storage systems). For example, using DRBD or Qemu based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with horizon for basic DR orchestration<br />
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) - ronenkat@il.ibm.com<br />
* Ayal Baron (abaron) - abaron@redhat.com<br />
* Sean Cohen (scohen) - scohen@redhat.com <br />
* Alex Glikson (glikson) - glikson@il.ibm.com<br />
* Avishay Traeger (avishay-il) - avishay@il.ibm.com</div>Ronenkathttps://wiki.openstack.org/w/index.php?title=DisasterRecovery&diff=33250DisasterRecovery2013-10-20T08:42:51Z<p>Ronenkat: /* Contacts and (current) team */</p>
<hr />
<div>= Disaster Recovery for OpenStack =<br />
<br />
'''Disaster Recovery (DR)''' for OpenStack is an umbrella topic that describes what needs to be done for '''applications and services''' (generally referred to as workload) running in an OpenStack cloud to survive a '''large scale disaster'''.<br />
Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment.<br />
Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs. <br />
<br />
== What is Disaster Recovery? ==<br />
Disaster Recovery is the process of ensuring continuity of a set of workloads following or in advance of a large scale disaster that disrupts the current environment or infrastructure. By large scale disaster, we are considering disasters which can lead to a complete loss of a data center such as floods, tornadoes, hurricanes, fires, etc. To provide DR, we need a geographically distant site which will be the target of recovery. Any resources, data, etc., needed by the application to recover need to be at the target site prior to the disaster.<br />
<br />
== High Availability versus Disaster Recovery ==<br />
While both High Availability (HA) and Disaster Recovery strive to achieve continued operations in the face of failures, High Availability usually deals with individual component failures, while Disaster Recovery deals with large scale failures. <br />
<br />
Some distinguish HA from DR by networking scope - LAN for HA and WAN for DR - but in the cloud context a better distinction is probably the autonomy of management. High Availability will be the mechanism for continued operations within a single cloud environment - one deployment of OpenStack in a single location or multiple locations. Disaster Recovery will be the mechanism for continued operations when you have multiple cloud environments - multiple OpenStack deployments in various locations. In this context DR means continued workload operation in an alternative deployment: the recovery target cloud.<br />
<br />
== Scope and Scenarios ==<br />
The goal is to provide a mechanism to mark and protect from disaster applications and services (a set of OpenStack entities), also referred to as a hosted workload. In this context the cloud is the equivalent of the physical hardware. The target of the disaster recovery is not to recover the hardware, but the applications, services and their data.<br />
A separate recovery mechanism should address making the primary cloud available to run workloads following a disaster. The disaster recovery mechanism for applications and services will handle the fail-back to the primary cloud.<br />
<br />
=== Examples ===<br />
* Application service running on customer cloud and protected by recovery on hosted cloud.<br />
* Application service running on customer cloud in data center #1 and protected by recovery on customer data center #2.<br />
<br />
The plan is to provide a solution for both the born-in-the-cloud applications, as well as legacy applications that require storage and state.<br />
<br />
== Is this a new OpenStack project? ==<br />
Not necessarily. A better description would be an umbrella topic that describes the required APIs and features that OpenStack needs in order to support DR for hosted workloads. Some APIs and features will be integrated into existing projects such as Nova (DR features for compute) and Cinder (storage replication). Some functionality, like DR orchestration, may leverage Heat, be a new project, or even be outside the scope of OpenStack.<br />
<br />
Disaster Recovery is a complex task in which different applications and use-cases have different requirements; some use-cases can be supported easily while others may be more complex, so this is targeted as a long-term effort with incremental steps.<br />
<br />
== Vision and Roadmap ==<br />
Disaster Recovery should include support for:<br />
* Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.<br />
* Making available the VM images needed to run the hosted workload on the target cloud.<br />
* Replication of the workload data using storage replication, application level replication, or backup/restore.<br />
<br />
We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).<br />
<br />
The approach is built around:<br />
# Identify required enablement and missing features in OpenStack projects <br />
# Create enablement in specific OpenStack projects <br />
# Create orchestration scripts to demonstrate DR <br />
<br />
When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data & metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads and to automate the above processes (when applicable).<br />
<br />
== Design Tenets ==<br />
* The DR is between a primary cloud and a target cloud - independently managed.<br />
* The approach should enable a hybrid deployment between private and public cloud.<br />
* Note that some of the work related to DR may be relevant to enabling high-availability between regions, availability zones or cells which do share some of the OpenStack services.<br />
* Ideally (but not as an immediate step) one of the clouds (primary or target) could be non-OpenStack or even non-cloud bare-metal environments.<br />
* The primary and target cloud interact through a “mediator” - a DR middleware or gateway to make sure the clouds are decoupled.<br />
* The DR scheme will protect a set of VMs and related resources (VM images, persistent storage, network definitions, metadata, etc). The resources would be typically associated with a workload or a set of workloads owned by a tenant.<br />
* Allow flexibility in choice of Recovery Point Objective (RPO) and Recovery Time Objective (RTO).<br />
<br />
=== Disaster Recovery functionality to be supported ===<br />
* Fail-over - switch to recovery site upon failure<br />
* Fail-back - switch back to primary site<br />
* Test - test application in a sandbox at the recovery site <br />
<br />
=== End goal for Disaster Recovery ===<br />
* Define RPO/RTO objectives<br />
** Defines the replication parameters (sync/async, bandwidth, etc.); see the sketch after this list<br />
** Defines DR policy type<br />
* Enablement of multiple DR Policy options<br />
** Backup to Swift<br />
** Active - Cold standby<br />
** Active - Hot standby<br />
** Active - Active (requires application awareness and support)<br />
** Pluggable DR policies - e.g. DR to the cloud<br />
* Ability to mark a complete composite application as protected<br />
* Ability to select the DR region or availability zone per application<br />
* Ability to create one-to-many DR relationships per application<br />
* Ability to scale down the application at the recovery site if needed<br />
* Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.<br />
* Ability to ensure consistency of the replicated data & metadata<br />
* Supporting a wide range of data replication methods<br />
** Storage-system-based replication<br />
** Hypervisor-assisted replication (possibly between heterogeneous storage systems), for example using DRBD or QEMU-based replication<br />
** Backup and Restore methods<br />
** Pluggable application level replication methods<br />
* Integration with Horizon for basic DR orchestration<br />
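As a minimal illustration of how RPO objectives could drive the replication parameters above, consider the sketch below; the thresholds and the <code>ReplicationPolicy</code> structure are illustrative assumptions only.<br />
<pre>
# Minimal sketch: map a Recovery Point Objective to replication parameters.
# The thresholds and policy structure are illustrative assumptions.
from collections import namedtuple

ReplicationPolicy = namedtuple("ReplicationPolicy", ["mode", "interval_seconds"])


def policy_for_rpo(rpo_seconds):
    """Choose a replication mode based on the acceptable data loss window."""
    if rpo_seconds == 0:
        # Zero data loss implies synchronous replication.
        return ReplicationPolicy(mode="sync", interval_seconds=0)
    if rpo_seconds <= 300:
        # Small RPO: continuous asynchronous replication.
        return ReplicationPolicy(mode="async", interval_seconds=rpo_seconds)
    # Large RPO: periodic backup (e.g. to Swift) may be sufficient.
    return ReplicationPolicy(mode="backup", interval_seconds=rpo_seconds)


print(policy_for_rpo(0))     # ReplicationPolicy(mode='sync', interval_seconds=0)
print(policy_for_rpo(3600))  # ReplicationPolicy(mode='backup', interval_seconds=3600)
</pre>
<br />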
<br />
== Activities ==<br />
=== Related sessions in Icehouse summit ===<br />
* [http://openstacksummitnovember2013.sched.org/event/36ef8daa098c248d7fbb4ac7409f802a#%20 Surviving the worst: A vision for OpenStack disaster recovery - November 7, 9:50am]<br />
* Storage replication (Cinder) - [http://summit.openstack.org/cfp/details/69%20 Volume continuous replication]<br />
<br />
=== Related projects and topics ===<br />
* Resource reservation on target cloud - [[Resource-reservation-service|https://wiki.openstack.org/wiki/Resource-reservation-service]]<br />
* Heat description of workload (Icehouse session proposal) - [http://summit.openstack.org/cfp/details/98 Create Heat stack from existing resources]<br />
<br />
=== Contacts and (current) team ===<br />
* Ronen Kat (ronenkat) - ronenkat@il.ibm.com<br />
* Ayal Baron (abaron) - abaron@redhat.com<br />
* Sean Cohen (scohen) - scohen@redhat.com <br />
* Alex Glikson (glikson) - glikson@il.ibm.com<br />
* Avishay Traeger (avishay-il) - avishay@il.ibm.com</div>Ronenkat