CinderRockyPTGSummary

Introduction
This page summarizes the subjects covered during the Rocky PTG in Dublin, Ireland, from Wednesday, February 28th through Friday, March 2nd, 2018.

The full etherpad and all associated notes may be found here.

Rocky Development Priorities
 * HA Documentation
 * HA Fixes
 * Scheduler Fixes to Support HA
 * Replication Failover Fixes
 * Quota Fixes
 * Fixes for Attachment Handling During Replication Failover
 * Image Signing
 * Volume Type AZ
 * Generic Backup Implementation
 * Privsep
 * Support Sold Out Mechanism in Scheduler

Wednesday 2/28/2018
Etherpad with Detailed Notes

Video Recording Part 1

Support signature verification when create from image
 * Agreement: We will add a config option for this. It should be enabled by default. If the signature is in the image we verify it; if not, there is nothing to do.
 * Agreement: We will need to look closely at the places in the code path where an image is being moved and make sure that we do appropriate checking.
 * Action (tommylikehu): Ensure that we have pointers to the documentation explaining how to set up the keys and signing.
 * Action (jungleboyj): Add the spec to the spec review list: https://review.openstack.org/#/c/384143/
 * Decision: We do wish to go forward with this feature.
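The agreement above (verification on by default, verify when a signature is present, otherwise do nothing) can be sketched roughly as follows. This is only an illustration, not Cinder's implementation: the function name, the `verify_enabled` flag, the exception type, and the use of `img_signature` as the marker property are assumptions made for the example, and the actual cryptographic check is passed in as a callable rather than implemented.

```python
from typing import Callable, Dict

# Assumed marker property for a signed image; the exact name is an
# assumption of this sketch, not something stated in the notes.
SIGNATURE_PROPERTY = "img_signature"


class SignatureVerificationError(Exception):
    """Raised when a signed image fails verification (hypothetical type)."""


def check_image_signature(
    image_properties: Dict[str, str],
    verify: Callable[[Dict[str, str]], bool],
    verify_enabled: bool = True,  # enabled by default, per the agreement
) -> str:
    """Return what was done: 'disabled', 'unsigned', or 'verified'."""
    if not verify_enabled:
        # The operator turned the (hypothetical) option off.
        return "disabled"
    if SIGNATURE_PROPERTY not in image_properties:
        # No signature in the image: nothing to do.
        return "unsigned"
    if not verify(image_properties):
        raise SignatureVerificationError("image signature did not verify")
    return "verified"
```

Passing the verifier in as a callable keeps the sketch independent of any particular signing library; every code path that moves image data would need to call a check like this one.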

Generic Backup
 * Action (e0ne): Create documentation that indicates where this works and where it does not.
 * Action (e0ne): Update the spec based on how the code is developed.
 * Action (eharney): To review the following patch: https://review.openstack.org/#/c/543967/
 * Decision: We do want to continue forward implementing this.

Video Recording Part 2

Support restore backup to different volumes simultaneously
 * Action (abishop): To create a bug on narrowing the lock timing on creating a volume from the image cache. (method: _create_from_image_cache_or_download in cinder/volume/flows/manager/create_volume.py)
 * Decision: This is not something that we want to do as it really seems to be an inappropriate use of backup.

Automatic config generation
 * Agreement: This is a relatively critical thing to work on given the user experience impact.
 * Action (patrickeast): See if we can find the documentation for shared backend config options.
 * Action (jungleboyj): Figure out how to bring in the extension for oslo.sphinx.
 * Action (jungleboyj): Try migrating over the LVM driver to be automated as a first patch.
 * Decision: This needs to be investigated and understood as far as how this can be automated/updated.

Revisit grouping release notes
 * Action (jungleboyj): Start documenting some standards around release note file names. I.E. --
 * Action (jungleboyj): Figure out how the release note build process is working and document anything that might help us to better leverage the process.
 * Action (jungleboyj): Clean up issues with existing notes.
 * Decision: The release notes are not as user friendly as we would like.  Work in this area would be good.

Condense/Standardize on Base Set of Config Options

 * Decision: We do not want to pursue this further right now as we have enough other work to do to get the config options cleaned up.

Video Recording Part 3

Support AZ in volume type
 * Action (team): Need to review the spec: https://review.openstack.org/#/c/542691/
 * Action (tommylikehu): To propose code based on the final spec.
 * Decision: The team supports making this change.

Mark volume backend or pool sold out
 * Action (tommylikehu): To create a spec that addresses the items discussed and documented in the etherpad. Discussion and review will continue on that spec.
 * Decision: The team needs to better understand what is really being proposed before making a decision.

Video Recording Part 4

Path forward with our documentation
 * Action (datasundae): First step is to run through the process, verify that it works for Ubuntu, and fix it. Determine what may need to be added or removed.
 * Action (datasundae): Update the installation guide landing page to better describe what it is there for (installation instructions). Add links to devstack for development environment installation. Add links to distributor install guides if appropriate.
 * Action (eharney and hemna): Verify the SuSE and Red Hat documentation. It is broken ... should be fixed.
 * Action (smcginnis): Review upgrade guide to verify validity for current release.
 * Action (smcginnis): Add the option changes information for Queens.
 * Action (jungleboyj): Propose forum topic on documentation for Cinder.
 * Decision: Our documentation does need work and is something to put attention into during the Rocky cycle.

Mutable config options
 * Action (NEEDS OWNER): Ensure that the existing log level support for a mutable option works.
 * Action (NEEDS OWNER): Make sure we don't restart our services with a SIGHUP.
 * Agreement: If additional mutable config options are needed, it will most likely be by driver developers. We can work with them to get the features added when that time comes.
 * Decision: We need to make sure that what we think works for Cinder is working and can address doing more in the future.

Zoning enhancements
 * Action (gman-tx): Will put up a patch to add the ability for the FCZM to do blacklists and also look into options to zone in parallel.
 * Action (hemna): Will rebase the following patch to move the FCZM out to its own library: https://review.openstack.org/#/c/472855/
 * Action (hemna): Going to check if they have 3Pars that have FC and look into what would need to be done to zone only on the first volume attach for an array.
 * Decision: Team is in support of continuing to improve the zoning support.

Auto max_over_subscription_ratio agreement
 * Action (erlon): To talk to patrickeast and Nikesh to find out if the options that are in their drivers can be changed to use the common option that was added.
 * Decision: Do not want to have multiple config options supporting this.

Thursday 3/1/2018
Etherpad with Detailed Notes

Cinder new attach flow fixes, Multi-attach
 * Agreement: The second attachment is going to be RW by default. If the user wants RO then they need to specify it. Admins can turn off multi-attach by policy if they don't want this to be possible.
 * Agreement: The compute API should be changed to allow the user to pass through the attach mode so Nova can then tell Cinder what mode to use for attachment.
 * Agreement: Server multi-create with attaching to the same volume will be supported. There is a bug in the Cinder state machine that needs to be addressed. It will require a compute API microversion.
 * Decision: There are a number of issues that need to be investigated and fixed.
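The attach mode agreement can be illustrated with a small sketch. Nothing here is Cinder or Nova code: `resolve_attach_mode`, `may_attach`, and the boolean policy flag are hypothetical names, assuming only what the notes state (RW unless the user asks for RO, and multi-attach can be switched off by policy).

```python
from typing import Optional

VALID_MODES = ("rw", "ro")


def resolve_attach_mode(requested: Optional[str] = None) -> str:
    """Default to read-write; read-only must be requested explicitly."""
    if requested is None:
        return "rw"
    if requested not in VALID_MODES:
        raise ValueError(f"unsupported attach mode: {requested!r}")
    return requested


def may_attach(existing_attachments: int, multiattach_allowed: bool) -> bool:
    """A first attachment is always fine; later ones need policy to allow it."""
    return existing_attachments == 0 or multiattach_allowed
```

In this sketch the mode would be chosen by the user through the compute API and passed along unchanged, so the storage side never has to guess the intent of a second attachment.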

Update attachments on replication failover
 * Agreement: Need to understand if Nova needs to detach/attach a volume to make this work.
 * Action (NEED OWNER): Write a spec and prototype the code for this.
 * Agreement: Cinder drivers need to indicate the type of replication and what the recovery on the Nova side needs to be.
 * Agreement: Nova API microversion for the os-server-external-events change (like extended volume).
 * Decision: Cinder and Nova would like to work out a solution that supports this.

Volume details shows attached compute host for non-admins
 * Action (NEED OWNER): Add a policy to not display the info for non-admins. Should be just a Cinder change.
 * Decision: This is something that should be fixed and should be able to be fixed all on the Cinder side.  https://bugs.launchpad.net/cinder/+bug/1740950
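As a rough sketch of the idea (hide the attached compute host unless the caller is an admin), the view of an attachment could be filtered before it is returned. The field name `host_name`, the `is_admin` flag, and the function itself are assumptions for illustration; a real change would go through Cinder's policy framework rather than a boolean.

```python
from typing import Any, Dict


def attachment_view(attachment: Dict[str, Any], is_admin: bool) -> Dict[str, Any]:
    """Return a copy of the attachment, dropping the host for non-admins."""
    view = dict(attachment)  # copy so the stored record is not modified
    if not is_admin:
        # 'host_name' is an assumed field name for the attached compute host.
        view.pop("host_name", None)
    return view
```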

Bulk Volume Create/Attach

 * Decision: Nova agreed that this is something that just makes things much more complicated and recommended against it.  Cinder team agrees with not going forward with this work.

Core Activity Concerns
 * Agreement: jungleboyj to keep the specs review list curated.
 * Action (core team): Add a comment like 'target-rocky-1' to reviews to organize which reviews are targeted for which milestone.
 * Decision: Going to try to make it a bit easier for the team to know review priorities.  Team is going to work on being more active.

python3 concerns
 * Action (eharney): Try turning on Python3 in the CEPH or LIO job.
 * Action (smcginnis): Ask the TC if it is ok for us to deprecate this. Can we do that if other projects are not doing it?
 * Decision: We need to do further investigation into how OpenStack is planning to go forward with this and then decide how Cinder will proceed.

Cinder-tempest-plugin discussions
 * Action (jungleboyj): Add a note to the driver CI instruction page indicating that the 'all-plugin' option is specified when running tempest.
 * Action (jungleboyj): Send a note to the mailing list about this and try to get attention to this from the CI maintainers.
 * Action (jungleboyj): Send angrygrams to people whose CIs are just barfing right now.
 * Action (jungleboyj): Prioritize pinging the owners of CI with CG support as they are the ones that really need to be running all tests.
 * Action (jungleboyj): Start checking CI output for the tests. Start pinging people who are not running it, with a goal of all CIs running the tempest-plugin tests by the end of Rocky.
 * Decision: We need to make sure that our 3rd Party CI systems are running using the cinder-tempest-plugin tests.

Work for an Intern
 * Action (geguileo): Write a spec for adding per user and per project volume type defaults. This is work that an intern could possibly do with guidance.
 * Action (jungleboyj): To create a page where we can keep a list of possible work items for interns.
 * Decision: Don't have a lot of work that we can think of at the moment but it would be good to keep ideas for such work.

Revisiting the capabilities matrix
 * Agreement: We need to mention the release in which each capability became available.
 * Agreement: Matrix should include the 'unsupported' flag for drivers.
 * Agreement: New matrix will start with just the drivers that are currently in Cinder. Drivers will come and go as they do for each release.
 * Agreement: Nova is using an approach like this: https://review.openstack.org/#/c/472488 We should go in the same direction.
 * Decision: We need a capabilities matrix with useful information but we need a better way of providing it.

---

Friday 3/2/2018
Etherpad with Detailed Notes

Quota improvement work
 * Action (tommylikehu and jbernard): Need to investigate the DB and ensure we have all the appropriate indexes in place.
 * Action (tommylikehu and jbernard): Work with Nova to get all the patches in place to improve Quota functionality and performance.
 * Decision: Nova has been working to improve/fix quotas.  We are based on the same code so we should do the same.

Migration to privsep
 * Action (eharney): Start incrementally changing commands over to privsep and get a little improvement at a time.
 * Decision: We are not currently leveraging privsep as we should.  It would be good to start looking into this and implementing changes.

Numerous problems with backup discovered by Gorka
 * Action (lpetrut): To investigate solutions for logging in native threads. Will work with Oslo to try to find a long term solution.
 * Action (geguileo): To push up patches to fix issues when force deleting a backup that is in progress. It should stop the backup and clean it up.
 * Action (geguileo): Propose patches that make it possible to run more than one backup worker at a time. This is intended to make the backup process more efficient.
 * Decision: Team supports patches and work going into place to improve/fix backup support.

Scheduler concerns
 * Action (geguileo): Submit patches to address some of the issues with scheduling in HA environments. (I.E. the problem that not all the schedulers are updated simultaneously as multiple requests are submitted.)
 * Action (geguileo): To submit a spec for switching to using the database to keep multiple scheduler instances synchronized.
 * Decision: We continue to have issues with the scheduler that should be addressed.

Using Cinder Drivers outside of Cinder
https://github.com/akrog/cinderlib
 * Action (geguileo): To complete documentation/implementation of this stand alone library.
 * Action (geguileo): Should present/demonstrate this to the rest of the Cinder community in a weekly meeting.
 * Action (team): In the future we will need to decide where this library lives. In OpenStack? Keep it independent?
 * Decision: The team was supportive of the new library that Gorka has created to act as a wrapper to make independent use of Cinder drivers with no message queue or database.

HA Development Support
 * Action (jungleboyj): Add a section to the spec template asking if there is an Active/Active HA impact from the change.
 * Action (jungleboyj): Add a recurring time slot in the Cinder weekly meeting to discuss HA development progress.
 * Action (geguileo): Put together documentation in Cinder to start collecting the information that driver developers need to ensure they are HA compliant. Use as a way to track our development progress forward. Include information on how this can impact new features.
 * Action (geguileo): Ensure the new attachment code is fixed so that it supports HA. May require working with John and Ildiko.
 * Decision: This is the next top development priority for Cinder.  Need to focus on this during Rocky.