Difference between revisions of "CinderRockyPTGSummary"

Revision as of 19:55, 6 March 2018

Introduction

This page summarizes the subjects covered during the Rocky PTG in Dublin, Ireland, from Wednesday, February 28th, through Friday, March 2nd, 2018.

The full etherpad and all associated notes may be found here.


Rocky Development Priorities

*HA Documentation
*HA Fixes
*Scheduler Fixes to Support HA
*Replication Failover Fixes
*Quota Fixes
*Fixes for Attachment Handling During Replication Failover
*Image Signing
*Volume Type AZ
*Generic Backup Implementation
*Privsep
*Support Sold Out Mechanism in Scheduler

Wednesday 2/28/2018

Etherpad with Detailed Notes

Video Recording Part 1

Support signature verification when create from image

  • Decision: We do wish to go forward with this feature.


*Agreement: We will add a config option for this.  It should be enabled by default.  If the image carries a signature we verify it; if not, there is nothing to do.
*Agreement:  We will need to look closely at the places in the code path where an image is moved and make sure that we do appropriate checking.
*Action (tommylikehu):  Ensure that we have pointers to the documentation explaining how to set up the keys and signing.
*Action (jungleboyj):  Add the spec to the spec review list:  https://review.openstack.org/#/c/384143/
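
The agreed behavior (option on by default, verify only when the image is signed) can be sketched as below. This is a minimal illustration, not the design from the spec: the `ImageMeta` class, the `should_verify_signature` helper, the default constant and the `img_signature` property name are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical default: the notes only say the option should be enabled by default.
VERIFY_IMAGE_SIGNATURE_DEFAULT = True


@dataclass
class ImageMeta:
    """Minimal stand-in for image metadata; not a Cinder or Glance class."""
    image_id: str
    properties: dict = field(default_factory=dict)


def should_verify_signature(image, verify_enabled=VERIFY_IMAGE_SIGNATURE_DEFAULT):
    """Return True only when verification is enabled and the image is signed."""
    if not verify_enabled:
        return False
    # Assumed property name that marks an image as signed.
    return "img_signature" in image.properties
```

An unsigned image returns `False` here, which matches the "nothing to do" case in the agreement.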


Generic Backup

  • Decision: We do want to continue forward implementing this.


*Action (e0ne):  Create documentation that indicates where this works and does not.
*Action (e0ne):  Update the spec based on how the code is developed.
*Action (eharney):   To review the following patch:  https://review.openstack.org/#/c/543967/


Video Recording Part 2


Support restore backup to different volumes simultaneously

  • Decision: This is not something that we want to do as it really seems to be an inappropriate use of backup.


*Action (abishop):  To create a bug on narrowing the lock timing on creating a volume from the image cache. (method: _create_from_image_cache_or_download in cinder/volume/flows/manager/create_volume.py)


Automatic config generation

  • Decision: We need to investigate and understand how this can be automated/updated.


*Agreement:  This is a relatively critical thing to work on given the user experience impact. 
*Action (patrickeast):  See if we can find the documentation for shared backend config options.
*Action (jungleboyj):  Figure out how to bring in the extension for oslo.sphinx.
*Action (jungleboyj):  Try migrating over the LVM driver to be automated as a first patch.


Revisit grouping release notes

  • Decision: The release notes are not as user friendly as we would like. Work in this area would be good.


*Action (jungleboyj):  Start documenting some standards around release note file names.  I.E. <driver>-<what is being done (add|remove|fix|destroy)>-<description>
*Action (jungleboyj):  Figure out how the release note build process is working and document anything that might help us to better leverage the process.
*Action (jungleboyj):  Clean up issues with existing notes.
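
A naming standard of the form `<driver>-<action>-<description>` could be checked mechanically. The sketch below is one possible check, assuming lowercase names, a driver token without hyphens, and a hyphen-separated description; the `parse_note_name` helper is hypothetical, and none of these rules were decided at the PTG.

```python
import re

# Actions listed in the proposed naming standard.
ACTIONS = ("add", "remove", "fix", "destroy")

# Assumption: the driver is a single token without hyphens, so the first
# hyphen always separates the driver from the action.
NOTE_NAME = re.compile(
    r"^(?P<driver>[a-z0-9]+)"
    r"-(?P<action>" + "|".join(ACTIONS) + r")"
    r"-(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def parse_note_name(name):
    """Return the parts of a conforming name, or None if it does not conform."""
    match = NOTE_NAME.match(name)
    return match.groupdict() if match else None
```

For example, `parse_note_name("lvm-fix-thin-provisioning-stats")` splits into driver `lvm`, action `fix` and description `thin-provisioning-stats`, while a name with an unlisted action is rejected.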


Condense/Standardize on Base Set of Config Options

  • Decision: We do not want to pursue this further right now as we have enough other work to do to get the config options cleaned up.



Video Recording Part 3


Support AZ in volume type

  • Decision: The team supports making this change.


*Action (team):  Need to review the spec:  https://review.openstack.org/#/c/542691/
*Action (tommylikehu):  To propose code based on the final spec.


Mark volume backend or pool sold out

  • Decision: The team needs to better understand what is really being proposed before making a decision.


*Action (tommylikehu):  To create a spec that addresses the items discussed and documented in the etherpad and we will continue discussion/review.


Video Recording Part 4


Path forward with our documentation

  • Decision: Our documentation does need work and is something to put attention into during the Rocky cycle.


*Action (datasundae):  First step is to run through the process and verify it works for Ubuntu and fix it.  Determine what may need to be added or removed.
*Action (datasundae):  Update the installation guide landing page to better describe which installation instructions it contains.  Add links to devstack for development environment installation.  Add links to distributor install guides if appropriate.
*Action (eharney and hemna):   Verify the SuSE and Red Hat documentation.  It is broken ... should be fixed.
*Action (smcginnis):  Review upgrade guide to verify validity for current release. 
*Action (smcginnis):  Add the option changes information for Queens.
*Action (jungleboyj):  Propose forum topic on documentation for Cinder.


Mutable config options

  • Decision: We need to make sure that what we think works for Cinder is working and can address doing more in the future.


*Action (NEEDS OWNER):  Ensure that the existing log level support for a mutable option works.
*Action (NEEDS OWNER):  Make sure we don't restart our services with a SIGHUP.
*Agreement:  If additional mutable config options are needed it will most likely be by driver developers.  We can work with them to get the features added when that time comes.
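
The idea behind a mutable log level is that a SIGHUP makes the running process re-read the option instead of restarting. The sketch below illustrates that pattern with the Python standard library only; it is not how Cinder or oslo.config implement it, and the `debug` key in `[DEFAULT]`, the logger name and both helper names are assumptions for the example.

```python
import configparser
import logging
import signal

LOG = logging.getLogger("mutable_sketch")


def level_from_config(text):
    """Map a 'debug' boolean in the [DEFAULT] section to a logging level."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    debug = parser.getboolean("DEFAULT", "debug", fallback=False)
    return logging.DEBUG if debug else logging.INFO


def install_reload_handler(config_path, logger=LOG):
    """Re-read the log level on SIGHUP instead of restarting the process."""
    def handler(signum, frame):
        with open(config_path) as config_file:
            logger.setLevel(level_from_config(config_file.read()))

    # SIGHUP exists on POSIX systems only.
    signal.signal(signal.SIGHUP, handler)
    return handler
```

Keeping the parsing in `level_from_config` separate from the signal handler makes the level mapping easy to check without sending a real signal.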


Zoning enhancements

  • Decision: Team is in support of continuing to improve the zoning support.


*Action (gman-tx):  Will put up a patch to add the ability for the FCZM to do blacklists and also look into options to zone in parallel.
*Action (hemna):  Will rebase the following patch to move the FCZM out to its own library:  https://review.openstack.org/#/c/472855/
*Action (hemna):  Will check whether they have 3Pars with FC and look into what would need to be done to zone only on the first volume attach for an array.


Auto max_over_subscription_ratio agreement

  • Decision: Do not want to have multiple config options supporting this.


*Action (erlon):  To talk to patrickeast and Nikesh to find out if the options that are in their drivers can be changed to use the common option that was added.



Thursday 3/1/2018

Etherpad with Detailed Notes

Nova/Cinder Cross Project Time

Cinder new attach flow fixes, Multi-attach

  • Decision: There are a number of issues that need to be investigated and fixed.


*Agreement:  Second attachment is going to be RW by default.  If the user wants RO then they need to specify it.  You can turn off multi-attach by policy if admins don't want to have this possible.
*Agreement:  The compute API should be changed to allow the user to pass through the attach mode so Nova can then tell Cinder what mode to use for attachment.
*Agreement:  Server multi-create with attaching to the same volume will be supported.  There is a bug in the Cinder state machine that needs to be addressed.  It will require a compute API microversion.
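
The attach-mode agreements above can be sketched as a small decision function. This is only an illustration of the agreed defaults: the `resolve_attach_mode` name, its parameters and the exception types are hypothetical and do not correspond to the Nova or Cinder APIs.

```python
VALID_MODES = ("rw", "ro")


def resolve_attach_mode(requested_mode=None, existing_attachments=0,
                        multiattach_allowed=True):
    """Pick the mode for a new attachment under the agreed defaults."""
    # Admins can turn multi-attach off by policy (modeled here as a flag).
    if existing_attachments > 0 and not multiattach_allowed:
        raise PermissionError("multi-attach is disabled by policy")
    # Every attachment, including the second, defaults to read-write.
    if requested_mode is None:
        return "rw"
    # Read-only must be requested explicitly by the user.
    if requested_mode not in VALID_MODES:
        raise ValueError("unknown attach mode: %r" % (requested_mode,))
    return requested_mode
```

In this sketch the user-supplied mode is what the compute API would pass through to Cinder; when nothing is supplied, the attachment is read-write.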


Volume replication for in-use volumes

  • Decision: Cinder and Nova would like to work out a solution that supports this.


*Agreement:  Need to understand if Nova needs to detach/attach a volume to make this work.
*Action (NEED OWNER):  Write a spec and prototype the code for this.
*Agreement:  Cinder drivers need to indicate the type of replication and what the recovery on the nova side needs to be.
*Agreement:  Nova API microversion for the os-server-external-events change (like extended volume).


Volume details shows attached compute host for non-admins


*Action (NEED OWNER):   Add a policy to not display the info for non-admins.  Should be just a Cinder change.


Bulk Volume Create/Attach

  • Decision: Nova agreed that this is something that just makes things much more complicated and recommended against it. Cinder team agrees with not going forward with this work.


Continued Cinder Only Time

Core Activity Concerns

  • Decision: Going to try to make it a bit easier for the team to know review priorities. Team is going to work on being more active.


*Agreement:   Jungleboyj to keep the specs review list curated.
*Action (core team):  Add a comment like 'target-rocky-1' to reviews to organize what reviews are targeted for what milestone.


python3 concerns

  • Decision: We need to do further investigation into how OpenStack is planning to go forward with this and then decide how Cinder will proceed.


*Action (eharney):  Try turning on Python3 in the CEPH or LIO job.
*Action (smcginnis):  Ask the TC if it is ok for us to deprecate this?  Can we do that if other projects are not doing it?


Cinder-tempest-plugin discussions

  • Decision: We need to make sure that our 3rd Party CI systems are running using the cinder-tempest-plugin tests.


*Action (jungleboyj):  Add a note to the driver CI instruction page indicating that the 'all-plugin' option is specified when running tempest.
*Action (jungleboyj):  Send a note to the mailing list about this and try to get attention to this from the CI maintainers.
*Action (jungleboyj):  Send angrygrams to people whose CIs are just barfing right now.
*Action (jungleboyj):  Prioritize pinging the owners of CI with CG support as they are the ones that really need to be running all tests.
*Action (jungleboyj):  Start checking CI output for the tests.  Start pinging people who are not running it with a goal of all CIs running the tempest-plugin tests by the end of Rocky.


Work for an Intern

  • Decision: Don't have a lot of work that we can think of at the moment but it would be good to keep ideas for such work.


*Action (geguileo):  Write a spec for adding per user and per project volume type defaults.  This is work that an intern could possibly do with guidance.
*Action (jungleboyj):  To create a page where we can keep a list of possible work items for interns.


Revisiting the capabilities matrix

  • Decision: We need a capabilities matrix with useful information but we need a better way of providing it.


*Agreement:  We need to mention since when a capability was available.
*Agreement:  Matrix should include the 'unsupported' flag for drivers.
*Agreement:  New matrix will start with just the drivers that are currently in Cinder.  Drivers will come and go as they do for each release.
*Agreement:  Nova is using an approach like this:  https://review.openstack.org/#/c/472488  We should go the same direction.



Friday 3/2/2018

Etherpad with Detailed Notes


Quota improvement work

  • Decision: Nova has been working to improve/fix quotas. We are based on the same code so we should do the same.


*Action (tommylikehu and jbernard):  Need to investigate DB and ensure we have all the appropriate indexes in place.
*Action (tommylikehu and jbernard):  Work with Nova to get all the patches in place to improve Quota functionality and performance.


Migration to privsep

  • Decision: We are not currently leveraging privsep as we should. It would be good to start looking into this and implementing changes.


*Action (eharney): Start incrementally changing commands over to privsep and get a little improvement at a time.


Numerous problems with backup discovered by Gorka

  • Decision: Team supports patches and work going into place to improve/fix backup support.


*Action (lpetrut):  To investigate solutions for logging in native threads.  Will work with Oslo to try and find a long term solution.
*Action (geguileo):  To push up patches to fix issues when force deleting a backup that is in progress.  It should stop the backup and clean it up.
*Action (geguileo):  Propose patches that make it possible to run more than one backup worker at a time.  This is intended to make the backup process more efficient.


Scheduler concerns

  • Decision: We continue to have issues with the scheduler that should be addressed.


*Action (geguileo):  Submit patches to address some of the issues with scheduling in HA environments.  (I.E. the problem that not all the schedulers are updated simultaneously as multiple requests are submitted.)
*Action (geguileo):  To submit a spec for switching to using the database to keep multiple scheduler instances synchronized.


Using Cinder Drivers outside of Cinder

  • Decision: The team was supportive of the new library that Gorka has created to act as a wrapper to make independent use of Cinder drivers with no message queue or database.

https://github.com/akrog/cinderlib

*Action (geguileo):  To complete documentation/implementation of this stand alone library.
*Action (geguileo):  Should present/demonstrate this to the rest of the Cinder community in a weekly meeting.
*Action (team):  In the future we will need to decide where this library lives.  In OpenStack?  Keep it independent?


HA Development Support

  • Decision: This is the next top development priority for Cinder. Need to focus on this during Rocky.


*Action (jungleboyj):  Add a section to the spec template asking if there is an Active/Active HA impact from the change.
*Action (jungleboyj):  Add a recurring time slot in the Cinder weekly meeting to discuss HA development progress.
*Action (geguileo):   Put together documentation in Cinder to start collecting the information that driver developers need to ensure they are HA compliant.  Use as a way to track our development progress forward.  Include information on how this can impact new features.
*Action (geguileo):   Ensure the new attachment code is fixed so that it supports HA.  May require working with John and Ildiko.