Difference between revisions of "CinderSteinPTGSummary"

(Created page with "=== Introduction === This page contains a summary of the subjects covered during the Stein PTG in Denver, Colorado Wednesday September 12th, through Friday September 14th, 201...")
 
Line 77: Line 77:
 
  *'''Action (team):''' Reviewers can now approve Privsep patches.
 
  *'''Action (team):''' Reviewers can now approve Privsep patches.
  
[https://www.youtube.com/watch?v=_b9tgzKE63c&feature=youtu.be Wednesday Video Recording Part 3]
+
[https://www.youtube.com/watch?v=uHCuF3Qpql4&t=4s Wednesday Video Recording Part 3]
  
 
===Driver Capabilities in Cinder===
 
===Driver Capabilities in Cinder===
Line 97: Line 97:
 
  *'''Action (jungleboyj):''' Working on getting changes into OSC and Horizon could be a work opportunity for an outreachy intern.
 
  *'''Action (jungleboyj):''' Working on getting changes into OSC and Horizon could be a work opportunity for an outreachy intern.
 
  *'''Action (jungleboyj):''' Look at the current gap analysis for functionality and create tasks to implement missing functions.
 
  *'''Action (jungleboyj):''' Look at the current gap analysis for functionality and create tasks to implement missing functions.
 +
 +
===Remove python-cinderclient API V1 support===
 +
*'''Decision:''' We would like to remove V1 API support if possible but it will take more investigation to determine if we can.
 +
<br>
 +
*'''Action (smcginnis):''' Talk to the OSC team and see what their thoughts are on removing the V1 API.
 +
*'''Action (e0ne):''' To check if the OpenStack SDK is using the V1 API.  That will also inform our decision.
 +
 +
===Stein Release Schedule and Deadlines===
 +
*'''Decision:''' The team is ok with sticking to our regular release cadence and deadlines for Stein.
 +
<br>
 +
*'''Action (jungleboyj):''' Add target driver proposal freeze date to the schedule.
 +
*'''Action (jungleboyj):''' To propose the release deadlines to the release website.
  
 
[https://www.youtube.com/watch?v=ycpxB2-Tq9I&feature=youtu.be Wednesday Video Recording Part 4]
 
[https://www.youtube.com/watch?v=ycpxB2-Tq9I&feature=youtu.be Wednesday Video Recording Part 4]
Line 106: Line 118:
 
  *'''Action (eharney):''' To check with the Ceph team at RedHat about any possible issues with going forward with this.
 
  *'''Action (eharney):''' To check with the Ceph team at RedHat about any possible issues with going forward with this.
 
  *'''Action (jungleboyj):'''  To work with TheJulia to figure out how both Ironic and Cinder can respond to questions/requests in the User Survey feedback.
 
  *'''Action (jungleboyj):'''  To work with TheJulia to figure out how both Ironic and Cinder can respond to questions/requests in the User Survey feedback.
 +
 +
 +
==Thursday 9/13/2018==
 +
[https://etherpad.openstack.org/p/cinder-ptg-stein-thursday-rebuild Etherpad with Detailed Notes]
 +
 +
===Nova Cross Project Time===
 +
*'''Decision:'''  Nova was agreeable to the changes to shared_targets and an API to re-image an attached volume.
 +
<br>
 +
*'''Action (geguileo):''' Will start implementing the change to get locking for shared_targets into os-brick instead of Nova.
 +
*'''Action (tommylikehu):''' Will start implementing a new API to enable re-imaging an attached volume.
 +
 +
[https://www.youtube.com/watch?v=_tzmgQX_vxQ&feature=youtu.be  Thursday Video Recording Part 1]
 +
 +
===Edge Computing Discussion===
 +
*'''Decision:'''  Cinder is already on track to support what is needed for edge computing purposes.  Further discussion may be needed in the future once they have better defined what level of high availability is needed at the far edge.
 +
<br>
 +
*'''Action (geguileo):''' Implement new scheduler locking mechanism to make Active/Active HA volume services possible.
 +
*'''Action (jungleboyj):''' To follow up with Ildiko about a forum session in Berlin to better understand use cases.
 +
 +
===Install Documentation Discussion===
 +
*'''Decision:'''  This continues to be a place where we need to make improvements.  One place to start is by getting documentation in place that explains how to do Standalone Cinder and Containerized Cinder installation.
 +
<br>
 +
*'''Action (datasundae):''' Start implementing documentation updates for containerized deployment, Standalone installation with NOAUTH set and installation of Cinder with Kubernetes.
 +
*'''Action (jungleboyj):''' To get datasundae in touch with someone who can help with the Kubernetes deployment discussion.
 +
 +
[https://www.youtube.com/watch?v=3ufafyIHXe8&feature=youtu.be Thursday Video Recording Part 2]
 +
 +
===Using Cinder drivers outside of OpenStack===
 +
*'''Decision:'''  Given that this is functionality that is clearly wanted by RedHat and others, the best user experience will come from keeping this integrated in Cinder's code tree.
 +
<br>
 +
*'''Agreement:''' The cinderlib should live in Cinder.  (possibly in cinder/contrib).
 +
*'''Agreement:''' Cinderlib should be gate tested. Gorka will do it against LVM and Ceph.  Initially the job will be non-voting and can be made voting in the future.
 +
*'''Agreement:''' The test won't be a completely separate job as time can be saved by not spinning up a separate devstack instance.
 +
*'''Action (geguileo):''' The limitations of cinderlib will be clearly documented so that people think about using Standalone Cinder instead.
 +
*'''Action (geguileo):''' In the future we will need to make sure that we verify which drivers work with cinderlib and include that in our documentation (support matrix?).
 +
*'''Action (geguileo):''' Clearly document the fact that this is a tech preview with no commitment of future support.
 +
*'''Action (geguileo):''' Investigate adding this to our 3rd Party CI requirements.
 +
 +
[https://www.youtube.com/watch?v=nBzS1JJr_Eg&feature=youtu.be Thursday Video Recording Part 3]
 +
 +
===How to Handle Merging Policy Changes===
 +
*'''Decision:'''  We can take steps towards resolving this by improving our test coverage.
 +
<br>
 +
*'''Action (eharney):''' Need to work with lbragstad and see what can be done to test policy changes.
 +
*'''Action (eharney):''' Check whether the test coverage person who may be joining his team could help with this.
 +
*'''Action (team):''' Need to be careful about merging policy changes late in the release until we have test coverage improved.
 +
 +
===Python 3 testing for 3rd Party CIs===
 +
*'''Decision:'''  To go along with the move to Python3 for OpenStack, the 3rd Party CI testing should change as well.
 +
<br>
 +
*'''Action (3rd Party CI Maintainers):''' Everyone should start working on moving their testing to use Python 3.
 +
*'''Action (jungleboyj):''' Send an e-mail to the mailing list making people aware that they need to start making this change.
 +
 +
===Core Participation===
 +
*'''Decision:'''  Team was in agreement that we had a problem but didn't have recommendations for a solution.  Decided that we can just be more efficient with our reviews.
 +
<br>
 +
*'''Action (jungleboyj):''' Propose adding Gorka to stable core.
 +
*'''Action (jungleboyj):''' Send an e-mail to the mailing list proposing removal of cores that have been inactive for a while.
 +
*'''Action (team):''' Work on getting more new people involved in reviews.
 +
 +
===How to Handle No Longer Having Mid-Cycle PTGs===
 +
*'''Decision:'''  A majority agreed that we should still meet at some point between Berlin and the Summit in Denver.
 +
<br>
 +
*'''Action (jungleboyj):''' Start an etherpad to get a list of who would be interested/able to do a mid-cycle PTG.
 +
*'''Action (jungleboyj):''' Try to get the team to converge on a date that works for the most people.
 +
*'''Action (team):''' Check with employers for possible hosts.
 +
 +
===Untyped Volumes/Default Volume Type===
 +
*'''Decision:'''  The current approach doesn't provide a good user experience and should be improved.
 +
<br>
 +
*'''Agreement:''' We want to create a default type at Cinder database migration time.
 +
*'''Agreement:''' We want an online data migration that changes all volumes with volume_type=none to use the default type created.
 +
*'''Agreement:''' Admins can go into cinder.conf and change the default volume type to something other than the default we create.
 +
*'''Agreement:''' The default_volume_type will be set by default now to use the default that is created if it hasn't already been set.
 +
*'''Agreement:''' We will not enable admins to configure Cinder in a way that results in untyped volumes.
 +
*'''Action (eharney):''' Spec to be written to propose this functionality.
 +
 +
==Friday 9/14/2018==
 +
[https://etherpad.openstack.org/p/cinder-ptg-stein-friday Etherpad with Detailed Notes]
 +
[https://www.youtube.com/watch?v=-82uL1FZMl8&feature=youtu.be Friday Video Recording Part 1]
 +
 +
===Barbican with Live Migration Issue===
 +
*'''Decision:'''  This issue does need to be investigated but we need more information to be able to effectively debug this.
 +
<br>
 +
*'''Action: (eharney)''' Will work on looking at the problem when he returns from vacation.
 +
*'''Action: (Michael McAleer)''' To re-run the operation and update the bug with additional debug information.
 +
 +
===Cinder API for Storage Live Migration===
 +
*'''Decision:'''  This feature sounds like the complement to Nova's evacuate command.  As such it is worth investigating implementing it.
 +
<br>
 +
*'''Agreement:''' This could be a useful feature for the future.
 +
*'''Action: (hoangcx)''' Fujitsu to create a spec proposing this functionality.
 +
*'''Action: (hoangcx)''' Fujitsu will get an engineer to work on implementing the feature once we reach agreement on the design.
 +
 +
===Third Party CI===
 +
*'''Decision:'''  We need to be paying more attention to 3rd Party CI testing.
 +
<br>
 +
*'''Action:(jungleboyj)''' Follow up with TheJulia on how we can document the jobs for running testing of Cinder in a shared location used by everyone.
 +
*'''Action: (jungleboyj)''' Submit patches for the drivers that are not properly running 3rd Party CI as per our process.
 +
*'''Action: (jungleboyj)''' Spot check running CIs to make sure it appears they are running the right tests.
 +
*'''Action: (jungleboyj)''' Work with the team in the future to see if there is something we could do to create some containers to help people get 3rd Party CI running properly.
 +
 +
[https://www.youtube.com/watch?v=XBCe04kx93Y&feature=youtu.be Friday Video Recording Part 2]
 +
 +
===Online Extend Support on Drivers===
 +
*'''Decision:'''  We should be testing this functionality but want to make sure we implement it in such a way that doesn't cause a bunch of 3rd Party CI failures.
 +
<br>
 +
*'''Action:(team)''' Merge the following patch:  https://review.openstack.org/#/c/578463/  We need to do that first and give people time to set the appropriate variable depending on whether they support it or not.
 +
*'''Action:(erlon)''' Notify the mailing list about the above change merging.
 +
*'''Action:(team)''' Merge the change to tests:  https://review.openstack.org/#/c/572188/
 +
 +
===Reinitialize a Failed Volume Driver===
 +
*'''Decision:''' The team was OK with this idea for cases like a backend failure, where retries could result in success, but not for cases such as a bad configuration.
 +
<br>
 +
*'''Action:(lixiaoy1)''' Update the spec based on team discussion.  The team will review it.
 +
 +
===Force Deletion of Volumes===
 +
*'''Decision:''' No action needed as there is code in the new attach code created by John that should handle this situation.
 +
<br>
 +
 +
===Priority Setting for Stein===
 +
*'''Action:(jungleboyj)''' Updated the priority etherpad for Stein.

Revision as of 21:09, 18 September 2018

Introduction

This page contains a summary of the subjects covered during the Stein PTG in Denver, Colorado Wednesday September 12th, through Friday September 14th, 2018.

The full planning etherpad and all associated notes may be found here. Note that the detailed notes have been split out into etherpads for each day of our meetings.


Stein Development Priorities

For the Stein release we will continue to use an etherpad to track execution towards our development priorities.

*Update Scheduler to use Locking Methods from the Placement Service (for HA support)
*Generic Backup Implementation
*Deferred Deletion in RBD
*Update Backup's Size when Backup is Created
*Driver Capabilities Reporting
*Parallel Attach and Detach of Volumes in os-brick
*Cinder API to re-image an Attached Volume
*shared_targets improvements
*Adding Default volume_type to Avoid Untyped Volumes
*Re-initialize Failed Volume Driver
*Architect/Design Storyboard Usage with goal of Migration at the end of Stein
*cinderlib inclusion into Cinder
*Installation Documentation Improvements
*Implementation of Privsep
*Cinder API to Evacuate a Storage Backend
*Adding tests for Policy Changes
*Adding Ceph iSCSI Support

Wednesday 9/12/2018

Etherpad with Detailed Notes

Wednesday Video Recording Part 1

Rocky Retrospective

  • Decision: We got better in a number of areas and agreed that we had some actions going into Stein to continue improvement.


*Agreement: We would like to be able to respond to user comments in the user survey.
*Agreement: Team is glad that we are focusing on bug fixes rather than getting more features in place.
*Agreement: We have issues in our security bug handling that will need to be resolved in the future.
*Agreement: We should try to do more active bug triage during weekly meetings.
*Action (jungleboyj): Follow up with Kendall Waters on a good way to respond to user survey comments.
*Action (team): Review responses to the user survey, categorize the comments and respond accordingly.
*Action (whoami-rajat): Get a list of 2 to 3 bugs ready to discuss in our next weekly meeting.
*Action (team): Try to use the bug review process as an opportunity to make sure we are backporting bug fixes. 

Explore Using Placement with Cinder

  • Decision: Trying to move to the placement service would introduce more problems than it would resolve. We are instead going to pull in the locking technology from the Placement Service to our Scheduler.


*Action (jaypipes): Review the Cinder scheduler design and provide guidance on how to integrate better locking.
*Action (geguileo): Work to implement the locking improvements recommended by Jay Pipes.

Wednesday Video Recording Part 2

Storyboard Migration

  • Decision: We will have to migrate at some point and it seems that this could be an opportunity to better document our processes and fix things about the current process that we do not like.


*Agreement: Most sensible to design our processes using Storyboard via documentation and then use that to prototype things in the tool.
*Action (jungleboyj): Start writing documentation for Storyboard usage and prototyping in storyboard-dev.  This will serve as our design process.
*Action (team): Team to review patches and collaboratively help design the new processes.
*Action (jungleboyj): Figure out how we keep people from using Launchpad once we make the cut over to Storyboard.
*Goal: Migrate to Storyboard by the end of the Stein release.

Revisit the idea of having a Cinder Data Service

  • Decision: We will do some investigation into this proposal but this is not a high priority.


*Action (team): Keep the data service in mind when we have Active/Active volume services fully functional.  They may solve this problem.
*Action (jungleboyj): Propose this as a forum topic to gauge interest from users.
*Action (smcginnis): Look at the volume manager to see what functions might be able to be split out into a data service as an effort to see how feasible this idea is.

Discuss Issues Related to Privsep

  • Decision: Tommy Hu submitted a patch that resolves the threading issues with Privsep. This makes it possible for us to move forward with Privsep.


*Action (eharney and others): Continue to work to get Cinder over to using Privsep instead of rootwrap.
*Action (team): Reviewers can now approve Privsep patches.

Wednesday Video Recording Part 3

Driver Capabilities in Cinder

  • Decision: This is a weak point in Cinder and our users are asking for improvements. We should work on getting capabilities reporting from drivers to be more useful.


*Action (eharney): Write a spec that makes a more solid proposal of what the functionality should be.
*Action (eharney): Fix the current default capabilities in driver.py .  What is in there currently should be opt-in capabilities rather than default.

Revisit 'shared_targets' idea

  • Decision: Changes to the latest Open iSCSI Code make it necessary to handle 'shared_targets' differently than we currently have been.


*Action (geguileo): Propose to the Nova team that 'shared_targets' locking be handled in os-brick instead of in Nova.
*Action (geguileo): To ensure that there are no upgrade implications with the change being made to os-brick/nova.

Process for ensuring that changes make it from python-cinderclient to openstackclient and Horizon

  • Decision: We have an opportunity to improve the process of tracking changes into OSC and Horizon as we move to using Storyboard.


*Action (jungleboyj): Consider creating tags for this when creating the workflow for Storyboard.
*Action (jungleboyj): Working on getting changes into OSC and Horizon could be a work opportunity for an outreachy intern.
*Action (jungleboyj): Look at the current gap analysis for functionality and create tasks to implement missing functions.

Remove python-cinderclient API V1 support

  • Decision: We would like to remove V1 API support if possible but it will take more investigation to determine if we can.


*Action (smcginnis): Talk to the OSC team and see what their thoughts are on removing the V1 API.
*Action (e0ne): To check if the OpenStack SDK is using the V1 API.  That will also inform our decision.

Stein Release Schedule and Deadlines

  • Decision: The team is ok with sticking to our regular release cadence and deadlines for Stein.


*Action (jungleboyj): Add target driver proposal freeze date to the schedule.
*Action (jungleboyj): To propose the release deadlines to the release website.

Wednesday Video Recording Part 4

Cross Project Discussion time with Ironic

  • Decision: Cross project time with Ironic was helpful and we agreed on a few things that we could work together upon.


*Action (jungleboyj): Given Ironic's interest in having iSCSI support for Ceph, Jay will work on getting people from his Shanghai development team to help with this.
*Action (eharney): To check with the Ceph team at RedHat about any possible issues with going forward with this.
*Action (jungleboyj):  To work with TheJulia to figure out how both Ironic and Cinder can respond to questions/requests in the User Survey feedback.


Thursday 9/13/2018

Etherpad with Detailed Notes

Nova Cross Project Time

  • Decision: Nova was agreeable to the changes to shared_targets and an API to re-image an attached volume.


*Action (geguileo): Will start implementing the change to get locking for shared_targets into os-brick instead of Nova.
*Action (tommylikehu): Will start implementing a new API to enable re-imaging an attached volume.

Thursday Video Recording Part 1

Edge Computing Discussion

  • Decision: Cinder is already on track to support what is needed for edge computing purposes. Further discussion may be needed in the future once they have better defined what level of high availability is needed at the far edge.


*Action (geguileo): Implement new scheduler locking mechanism to make Active/Active HA volume services possible.
*Action (jungleboyj): To follow up with Ildiko about a forum session in Berlin to better understand use cases.

Install Documentation Discussion

  • Decision: This continues to be a place where we need to make improvements. One place to start is by getting documentation in place that explains how to do Standalone Cinder and Containerized Cinder installation.


*Action (datasundae): Start implementing documentation updates for containerized deployment, Standalone installation with NOAUTH set and installation of Cinder with Kubernetes.
*Action (jungleboyj): To get datasundae in touch with someone who can help with the Kubernetes deployment discussion.

Thursday Video Recording Part 2

Using Cinder drivers outside of OpenStack

  • Decision: Given that this is functionality that is clearly wanted by RedHat and others, the best user experience will come from keeping this integrated in Cinder's code tree.


*Agreement: The cinderlib should live in Cinder.  (possibly in cinder/contrib).
*Agreement: Cinderlib should be gate tested. Gorka will do it against LVM and Ceph.  Initially the job will be non-voting and can be made voting in the future.
*Agreement: The test won't be a completely separate job as time can be saved by not spinning up a separate devstack instance.
*Action (geguileo): The limitations of cinderlib will be clearly documented so that people think about using Standalone Cinder instead.
*Action (geguileo): In the future we will need to make sure that we verify which drivers work with cinderlib and include that in our documentation (support matrix?).
*Action (geguileo): Clearly document the fact that this is a tech preview with no commitment of future support.
*Action (geguileo): Investigate adding this to our 3rd Party CI requirements.

Thursday Video Recording Part 3

How to Handle Merging Policy Changes

  • Decision: We can take steps towards resolving this by improving our test coverage.


*Action (eharney): Need to work with lbragstad and see what can be done to test policy changes.
*Action (eharney): Check whether the test coverage person who may be joining his team could help with this.
*Action (team): Need to be careful about merging policy changes late in the release until we have test coverage improved.

Python 3 testing for 3rd Party CIs

  • Decision: To go along with the move to Python3 for OpenStack, the 3rd Party CI testing should change as well.


*Action (3rd Party CI Maintainers): Everyone should start working on moving their testing to use Python 3.
*Action (jungleboyj): Send an e-mail to the mailing list making people aware that they need to start making this change.

Core Participation

  • Decision: Team was in agreement that we had a problem but didn't have recommendations for a solution. Decided that we can just be more efficient with our reviews.


*Action (jungleboyj): Propose adding Gorka to stable core.
*Action (jungleboyj): Send an e-mail to the mailing list proposing removal of cores that have been inactive for a while.
*Action (team): Work on getting more new people involved in reviews.

How to Handle No Longer Having Mid-Cycle PTGs

  • Decision: A majority agreed that we should still meet at some point between Berlin and the Summit in Denver.


*Action (jungleboyj): Start an etherpad to get a list of who would be interested/able to do a mid-cycle PTG.
*Action (jungleboyj): Try to get the team to converge on a date that works for the most people.
*Action (team): Check with employers for possible hosts.

Untyped Volumes/Default Volume Type

  • Decision: The current approach doesn't provide a good user experience and should be improved.


*Agreement: We want to create a default type at Cinder database migration time.
*Agreement: We want an online data migration that changes all volumes with volume_type=none to use the default type created.
*Agreement: Admins can go into cinder.conf and change the default volume type to something other than the default we create.
*Agreement: The default_volume_type will be set by default now to use the default that is created if it hasn't already been set.
*Agreement: We will not enable admins to configure Cinder in a way that results in untyped volumes.
*Action (eharney): Spec to be written to propose this functionality.
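
The online data migration agreed above could be sketched roughly as follows. This is only an illustration of the idea, not the design from the spec (which had not been written at this point): the `Volume` class, the function names, the batch size and the `example-default` type name are all hypothetical, and an in-memory list stands in for the Cinder database.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical name for the default type created at database migration time.
DEFAULT_TYPE_NAME = "example-default"


@dataclass
class Volume:
    """Minimal stand-in for a volume record (not Cinder's real model)."""
    id: str
    volume_type: Optional[str] = None  # None models an untyped volume


def migrate_untyped_batch(volumes: List[Volume], default_type: str,
                          batch_size: int) -> int:
    """Give up to batch_size untyped volumes the default type.

    Returns the number of volumes updated in this batch.
    """
    updated = 0
    for vol in volumes:
        if updated >= batch_size:
            break
        if vol.volume_type is None:
            vol.volume_type = default_type
            updated += 1
    return updated


def run_online_migration(volumes: List[Volume],
                         default_type: str = DEFAULT_TYPE_NAME,
                         batch_size: int = 2) -> int:
    """Repeat small batches until no untyped volumes remain."""
    total = 0
    while True:
        done = migrate_untyped_batch(volumes, default_type, batch_size)
        if done == 0:
            return total
        total += done
```

Working in small batches is what allows a migration like this to run while the service stays online; volumes that already have a type are left untouched.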

Friday 9/14/2018

Etherpad with Detailed Notes

Friday Video Recording Part 1

Barbican with Live Migration Issue

  • Decision: This issue does need to be investigated but we need more information to be able to effectively debug this.


*Action (eharney): Will work on looking at the problem when he returns from vacation.
*Action (Michael McAleer): To re-run the operation and update the bug with additional debug information.

Cinder API for Storage Live Migration

  • Decision: This feature sounds like the complement to Nova's evacuate command. As such it is worth investigating implementing it.


*Agreement: This could be a useful feature for the future.
*Action (hoangcx): Fujitsu to create a spec proposing this functionality.
*Action (hoangcx): Fujitsu will get an engineer to work on implementing the feature once we reach agreement on the design.

Third Party CI

  • Decision: We need to be paying more attention to 3rd Party CI testing.


*Action (jungleboyj): Follow up with TheJulia on how we can document the jobs for running testing of Cinder in a shared location used by everyone.
*Action (jungleboyj): Submit patches for the drivers that are not properly running 3rd Party CI as per our process.
*Action (jungleboyj): Spot check running CIs to make sure it appears they are running the right tests.
*Action (jungleboyj): Work with the team in the future to see if there is something we could do to create some containers to help people get 3rd Party CI running properly.

Friday Video Recording Part 2

Online Extend Support on Drivers

  • Decision: We should be testing this functionality but want to make sure we implement it in such a way that doesn't cause a bunch of 3rd Party CI failures.


*Action (team): Merge the following patch:  https://review.openstack.org/#/c/578463/  We need to do that first and give people time to set the appropriate variable depending on whether they support it or not.
*Action (erlon): Notify the mailing list about the above change merging.
*Action (team): Merge the change to tests:  https://review.openstack.org/#/c/572188/

Reinitialize a Failed Volume Driver

  • Decision: The team was OK with this idea for cases like a backend failure, where retries could result in success, but not for cases such as a bad configuration.


*Action (lixiaoy1): Update the spec based on team discussion.  The team will review it.
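
The distinction drawn in this decision, retrying on a backend failure but not on a bad configuration, can be sketched as below. This is a minimal illustration only and does not reflect the spec under review: the two exception classes, `init_driver_with_retries` and its parameters are hypothetical names, not part of Cinder.

```python
import time


class TransientBackendError(Exception):
    """Hypothetical: the backend is unreachable, so a retry may succeed."""


class BadConfigError(Exception):
    """Hypothetical: the configuration is wrong, so a retry cannot help."""


def init_driver_with_retries(init_fn, max_retries=3, delay_seconds=0.0,
                             sleep=time.sleep):
    """Call init_fn until it succeeds or the retries are used up.

    Only TransientBackendError is retried. Any other exception,
    including BadConfigError, propagates on the first failure.
    Returns the number of attempts that were made.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            init_fn()
            return attempts
        except TransientBackendError:
            if attempts > max_retries:
                raise  # initial attempt plus max_retries retries all failed
            sleep(delay_seconds)
```

A driver whose backend recovers after a couple of failed attempts ends up initialized, while a misconfigured driver fails immediately and stays failed until the configuration is corrected.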

Force Deletion of Volumes

  • Decision: No action needed as there is code in the new attach code created by John that should handle this situation.


Priority Setting for Stein

*Action (jungleboyj): Updated the priority etherpad for Stein.