CinderVictoriaPTGSummary

Introduction

This page contains a summary of the subjects covered during the Cinder project sessions at the Victoria PTG, held virtually June 2-5, 2020.

(Photo: The Cinder Team at the Victoria (Virtual) PTG, June 2020.)


This document aims to give a summary of each session. More context is available on the cinder PTG etherpad.


The sessions were recorded, so to get all the details of any discussion, you can watch/listen to the recording. Links to the recordings are located at appropriate places below.

Tuesday 02 June: Current Cinder Issues

recordings

Discuss the security issue when an attached volume exposes host information to non-admins

Discussion led by whoami-rajat

This was about https://bugs.launchpad.net/cinder/+bug/1740950 and the related bug https://bugs.launchpad.net/cinder/+bug/1736773.

The compute host is leaked in the volume-detail response, and also in the REST Attachments API.

We've already agreed that the volume-detail response will only display the host when called in an administrative context (the API response is already allowed to have a null value there, so there is no API change).

For the Attachments API response, we need to do some research to see whether this field is needed and in what contexts. It doesn't look like Nova needs it, and it doesn't look like any of our os-brick connectors are using it, but on the other hand, the info has been available in the response for so long that we need to be careful, because something could be using it. We also have to be careful about what info is left out of the response, because if it's too restricted, the caller won't be able to connect to the backend.

Conclusions

  • short term, need to get policies in place to govern this
  • open issue: the proposal was to keep the default behavior, but that continues to allow the leakage of the compute host name; it would probably be better to have a more restrictive policy (since we don't believe that any services depend on it), and then operators can adjust the policies to allow users whose use cases need this info to see it
  • ACTION - whoami-rajat to continue working on this

Sizing encrypted volumes

Discussion led by eharney and enriquetaso

When you create an encrypted volume, you lose some space for the LUKS header. This has shown up in some migration bugs. If you have an unencrypted volume and migrate to an encrypted type, it fails because there isn't enough space on the target to fit the entire source volume.


The exact amount of space lost isn't clear; it may vary by LUKS format and also by backend, that is, iSCSI (cryptsetup) vs. RBD (LUKS). It's probably on the order of 1-2 MB, but we allocate space in GB. Whether you can do really fine-grained space allocation or not probably depends on the backend containing the volume. We also need to keep the gigabyte boundaries for accounting purposes (usage, quotas, charges, etc.) because that's what everyone's tooling expects.
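
As a back-of-the-envelope illustration (the 2 MB header used below is an assumption for illustration; the real overhead varies by LUKS version and backend), the arithmetic looks roughly like this:

    # Why a "same size" retype to an encrypted type can fail: the LUKS
    # header consumes part of the target, so a completely full source
    # volume no longer fits.
    import math

    GiB = 1024 ** 3
    MiB = 1024 ** 2

    source_bytes = 1 * GiB          # a full 1 GiB unencrypted source
    luks_header = 2 * MiB           # assumed overhead; LUKS2 can reserve more

    print(source_bytes <= 1 * GiB - luks_header)   # False: the data no longer fits

    # The approach agreed below: let the user opt in to a larger target,
    # keeping sizes on GiB boundaries for quota/usage accounting.
    required_gib = math.ceil((source_bytes + luks_header) / GiB)
    print(required_gib)             # 2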

Walt suggested that we change the API to allow a new size to be specified during a volume migration. For retype, Gorka suggested that we could allow a flag similar to the current migration_policy in the retype request, which gives the user the option to say it's OK to perform a migration if it's required. We could have a flag indicating that it's OK for the volume to be increased in size if it's required for the retype to succeed. This way we don't go behind the user's back -- we fail if the volume won't fit, and allow them to retype to a new size if they want the retype to succeed.

Conclusions

  • Take the Walt/Gorka approach. It will allow us to fix a bunch of bugs, and we can move to something more elaborate later if there's demand for it.
  • ACTION - Sofia - add a note to the docs explaining that retyping an unencrypted volume to the same size encrypted volume will most likely fail (so that users know not to try this in older releases)

Quotas!!!

Discussion led by rosmaita

The problem is that there are race conditions and bugs in the code, so operators see things like duplicate entries in the quotas table, incorrect usage reports, etc.

There is a spec proposed for removing the quota usage cache (database table): https://review.opendev.org/730701

The idea is that instead of trying to keep this table updated, we determine usage on the fly. Some people are suspicious that this could cause a performance problem, but Eric thinks that with appropriate indices, this would be very fast for a database to compute. It would definitely be a very small number relative to the amount of time it takes to actually build a volume.
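
A minimal sketch of what "determine usage on the fly" could look like (illustrative only, not the spec's actual implementation; the DSN is made up, and the query assumes the volumes table's project_id, size, and deleted columns):

    # Compute a project's volume usage directly from the volumes table
    # instead of maintaining the quota_usages cache.
    from sqlalchemy import create_engine, text

    engine = create_engine("mysql+pymysql://cinder:secret@db/cinder")  # hypothetical DSN

    def volume_usage(project_id):
        """Return (volume_count, total_gigabytes) for non-deleted volumes."""
        query = text(
            "SELECT COUNT(*) AS cnt, COALESCE(SUM(size), 0) AS gigabytes "
            "FROM volumes WHERE project_id = :pid AND deleted = 0"
        )
        with engine.connect() as conn:
            row = conn.execute(query, {"pid": project_id}).one()
        return row.cnt, row.gigabytes

    # An index covering (project_id, deleted) is what keeps this cheap
    # relative to the time it takes to actually build a volume.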

There may still be some race conditions we need to worry about with the new approach. Our usage code was based on Nova's, but they've rewritten theirs to eliminate the database table. But they also accepted that it's OK for users to occasionally be over quota. We may have to do something similar.
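
A rough sketch of the "check, create, recheck" pattern Nova moved to, which accepts that a race can occasionally let a project go slightly over quota (names are illustrative, not actual nova or cinder code; volume_usage() is the on-the-fly query sketched above):

    class OverQuota(Exception):
        pass

    def create_volume(db, project_id, size_gb, limit_gb):
        # First check: cheap rejection of requests that obviously don't fit.
        count, used_gb = db.volume_usage(project_id)
        if used_gb + size_gb > limit_gb:
            raise OverQuota("request would exceed the gigabytes quota")

        volume = db.insert_volume(project_id, size_gb, status="creating")

        # Recheck: a concurrent request may have raced past the first check.
        # If the project is now over the limit, undo our own creation.
        count, used_gb = db.volume_usage(project_id)
        if used_gb > limit_gb:
            db.delete_volume(volume)
            raise OverQuota("quota exceeded by a concurrent request")
        return volume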

Two related issues came up during the discussion:

  • the keystone "unified limits" work -- this has to do with where a user's limits are stored, but doesn't impact usage computations. Moving the user limits to keystone would be an independent effort
  • the REST API currently returns 413 or 500 for usages. The API SIG suggests returning 403 instead of 413, but there doesn't seem to be a point in us changing. (You'd need a new microversion to return the "correct" response code, but you'd still need to be able to handle the "incorrect" one if that microversion weren't available, so why not just stick with what we have.) The 500s should be fixed, though. (But that wouldn't require a microversion change.)

Conclusions

  • The approach in https://review.opendev.org/730701 looks promising. It would be nice if we had some performance numbers to assess the impact of the change.
  • Need to figure out what new race conditions we will see and how to deal with them. (Looking at what Nova did should help here.)
  • ACTION - continue discussion on the spec


(The team adjourned early so that the Cinder core security team could review the patches and plan for rolling out the changes for the bugfix for Launchpad Bug #1823200, which would become public the next day.)

Wednesday 03 June: Third Party Drivers

recordings

Bug: Fail to extend attached volume using generic NFS driver

Discussion led by lseki

The bug: https://bugs.launchpad.net/cinder/+bug/1870367

It affects the Quobyte and NetApp drivers that rely on the generic NFS implementation. We need to fix it in two steps:

(1) don't allow extend of an attached volume for nfs drivers: https://review.opendev.org/#/c/725805/

(2) figure out how to do the online extend safely, and then re-enable that functionality

This was working at some point, but has stopped. We suspect changes in qemu/qemu-img: newer versions (2.10, 2.11, somewhere around there) seem to be doing more locking. The hypothesis is that nova locks the file and then the cinder process can't write to it.

Need to research whether there is a qemu operation that we could ask nova to do, like,

virsh qemu-monitor-command <DOMAIN> --hmp "block_resize <DISK> 2G"

The idea would be to model this on the (nova-assisted) snapshot case: a nova API that's implemented through libvirt.
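
For reference, the libvirt python bindings already expose a block-resize call that such a nova API could wrap, similar to how assisted snapshots work. A hedged sketch (the domain and disk names here are hypothetical):

    # Resize the guest-visible block device through libvirt rather than
    # having cinder fight qemu's image locking (sketch only).
    import libvirt

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("instance-00000001")   # hypothetical nova domain
    dom.blockResize("vdb",                         # hypothetical target disk
                    2 * 1024 ** 3,                 # new size
                    libvirt.VIR_DOMAIN_BLOCK_RESIZE_BYTES)
    conn.close()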

Conclusions

  • continue the two-stage approach, that is, block the operation for nfs drivers so that you don't get failures, and research how to actually fix it
  • seems like we need to get nova cooperation on this, we shouldn't try to get around the locking somehow on our own
  • testing: we run devstack-plugin-nfs-tempest-full in the cinder gates; when the fix is ready, we should enable volume-feature-enabled.extend_attached_volume in tempest on that job (a config sketch follows this list)
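
The tempest switch mentioned above is just a config option; once online extend works again, enabling it on that job would look roughly like this in tempest.conf (illustrative snippet):

    [volume-feature-enabled]
    # Run the extend-attached-volume tests on the nfs job again.
    extend_attached_volume = True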

Improvement proposal for LVM volume driver, direct attached storage via cinder

The proposer wasn't available, but we discussed the issue anyway because it's come up several times.


The basic idea is that you can get some optimization by opening volumes as devices instead of connecting via iSCSI when they're on the same node. The problem is that this is a very special case.

The team discussed a few ways to do this:

  • introduce this as an optimization in the LVM driver, if you are on same node, skip iscsi layer
    • pro: would allow more functionality
    • con: makes the driver more complex and likely to introduce bugs
  • add an option in cinder to enable "try local connection" on the LVM driver (see the sketch after this list)
    • if it's enabled, the driver adds the local device path to the connection info
    • if os-brick receives the path, it tries to use it if it exists; if it doesn't exist, it does the normal iSCSI connection
  • do something similar in the lio target
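
A rough sketch of the connector-side behavior for the second option (illustrative pseudologic only, not actual os-brick code; the connection-info key is an assumption):

    # Proposed "try local connection" behavior: if the backend put a local
    # device path into the connection info and it exists on this host, use
    # it directly; otherwise fall back to the normal iSCSI attach.
    import os

    def connect_volume(connection_properties, iscsi_connector):
        local_path = connection_properties.get("local_device_path")  # assumed key
        if local_path and os.path.exists(local_path):
            return {"type": "block", "path": local_path}
        return iscsi_connector.connect_volume(connection_properties)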


What is the attitude of the project to this?

  • if it's a new target, could clone the lio job and run those tests (because it's a single node case)
    • need to have good documentation, especially about the limitations
  • if it's the "optional" one, it would use the same lio job, but with the "try local connection" config set
  • bottom line is that we want as little interference with other drivers as possible

Conclusions

  • we want to know more about the use case -- it really sounds like using cinder to manage nova ephemeral storage
  • proposer can bring this up again after addressing the above points

Ceph iSCSI driver

Discussion led by hemna

Walt began work on this a while back but got stalled because some of the dependencies weren't packaged for a distribution, so there wasn't a way to get infra-approved CI for it.

It came up again on the mailing list in April: http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014278.html

Walt wrote an update this morning: http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015237.html

Ubuntu Focal now has packages, so we can aim for Focal as the CI target. Packages may be coming soon for Fedora.

There's an effort to get the ceph devstack plugin updated to a modern version of ceph: https://review.opendev.org/#/c/676722/ -- would be good to help get this merged so we have a modern ceph to work with.

Conclusions

  • Walt has some time to continue with this now; he'll coordinate with the Linaro developers to work on it

Remove volume driver abc classes

Discussion led by eharney

The design of the base driver was to include a bunch of interfaces as abstract classes, for example: https://opendev.org/openstack/cinder/src/commit/be9d5f3fa54b1b7948f44a9c91cf3f4753c0a03b/cinder/volume/driver.py#L1998

It's a whole bunch of code that didn't work out well (abc didn't do exactly what we had hoped), and it would be good to get rid of it. (This has been a TODO for a few years.)

The general idea is to start removing these various "xxxVD" abstract classes and have a single volume driver class instead of all these subclasses with abc on them.
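
Roughly, the change for an in-tree driver would look like the sketch below (the class names are only illustrative of the pattern; the exact set of xxxVD mixins varies by driver, and the final shape of the single base class is what still needs to be worked out):

    from cinder.volume import driver

    # Today: a driver inherits from the base class plus several abc-based
    # interface classes (illustrative only; a real driver implements many methods).
    class FooDriverToday(driver.BaseVD, driver.ManageableVD, driver.MigrateVD):
        pass

    # Proposed: inherit from a single volume driver class with abc removed;
    # optional capabilities become plain methods a driver may or may not implement.
    class FooDriverProposed(driver.VolumeDriver):
        pass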

General observations:

  • probably phase this over a cycle or so
  • will definitely affect out-of-tree drivers, will need to publicize this
  • maybe leave the classes there, but remove abc from them and make them no-op classes
  • change in-tree drivers to inherit only from the base driver class, instead of from the base class plus the various xxxVD, yyyVD classes

We have two sets of classes that describe the interface for drivers:

  • the abc classes in cinder.volume.driver
  • the classes in cinder.interface

It would be better to have a single source of interface class that defines what you need to do to implement a driver.

Conclusions

  • doesn't seem to be conceptually difficult, but will touch a lot of code
  • e0ne is interested in seeing this happen
  • need to work out the points in "General observations" above more clearly before starting, but the general sense of the community is this is a good idea

Backup Driver issues

Discussion led by Shatadru

Well, one issue anyway. The issue is the container parameter for the backup drivers. It's handled differently depending on what driver you're using:

  • swift creates a bucket
  • nfs creates a directory
  • ceph uses it as a pool name


What happens with Ceph is that if the pool doesn't exist, the backup will eventually fail, which can be mystifying to users:

  • they expect it to act like the other drivers, where you can use it to organize your backups
  • it's not documented anywhere that for Ceph, you need to give it an existing pool name or your request will fail


Most operators using Ceph probably don't want end users to use this at all: they've likely set aside a particular pool they want used for backups and don't want users to put their backups in some other pool. Also, they probably don't want users to even know any of the pool names.

We would like a configuration setting to allow the Ceph driver to ignore the container if it's passed. The problem is that this doesn't make much sense for the other backend drivers, so we don't want to make it a global setting -- and that means it won't be the kind of thing we can detect and reject at the API layer.

It's a real problem for Ceph because an end user can probe to find out what pools they have access to, and they can put things into the "wrong" pool. But it seems perfectly OK for the other drivers.
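
To make this concrete, here is roughly what the operator side could look like in cinder.conf (the backup_ceph_pool option exists today; the commented-out 'ignore_container' line is only the proposal tracked in the bug below, and its exact name is not final):

    [DEFAULT]
    backup_driver = cinder.backup.drivers.ceph.CephBackupDriver
    # The pool the operator has set aside for backups; used today unless an
    # end user passes --container with some other pool name, e.g.
    #   cinder backup-create --container some-other-pool <volume>
    backup_ceph_pool = backups
    # Proposed (name/placement not final): always use backup_ceph_pool and
    # silently ignore any container the user supplies.
    # ignore_container = true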

Conclusions

  • should add a configuration option for the Ceph driver, something like 'ignore_container' and then the configured backup pool is used even when a user specifies something else
  • we'll track this as a Launchpad bug: https://bugs.launchpad.net/cinder/+bug/1881934
  • ACTION - Shatadru - will put up a documentation patch
  • ACTION - rosmaita - will update the bug with this discussion to point the direction the fix should take

Backporting 'supported' status

Keeping 'unsupported' drivers in-tree

Ussuri Development Cycle Retrospective for Driver Developers

Broken RBD live migration

This wasn't actually a discussion; someone left a question on the etherpad and Gorka answered it.

Thursday 04 June: Cinder Improvements

recordings

Recent Security Issues

Quick discussion of OSSN-0086, which was announced yesterday.

Reviewing our reviewing strategies

Discussion led by eharney

Interop WG interlude

Looking at type annotation

Discussion led by eharney

Cinderclient CLI usability issue

Discussion led by rosmaita

OpenStack Client tangent

Because we were talking about working on the cinderclient, the question came up: what about the openstackclient? Since Jay is on the TC, and the TC is thinking about the unified client issue, we talked about this a bit.

Community Goals

Ussuri Development Cycle Retrospective

(We adjourned early for the Virtual Happy Hour, which was not recorded.)

Friday 05 June 2020: XP & TCB (cross-project and taking care of business)

recordings

Cinder/Glance creating image from volume with Ceph

Discussion led by abhishekk

Make cinder store of glance compatible with multiple stores

Discussion led by abhishekk

Gate job for glance cinder store

Discussion led by whoami-rajat

Glance image co-location

Discussion led by jokke_

Refresh connection_info

Discussion led by eharney (lyarwood couldn't be here)

DR operations and broken volume connections

Related topic introduced by sfernand

Dynamic front-end QoS

Victoria spec review

Technical debt

Cycle Priorities and Schedule