- 1 Introduction
- 2 Tuesday 20 April
- 3 Wednesday 21 April
- 3.1 recordings
- 3.2 Removing the Block Storage API v2
- 3.3 mypy status and next steps
- 3.4 Quotas testing
- 3.5 Fix up volume driver classes
- 3.6 Cinder throttle and cgroup v2
- 3.7 Cross-project meeting with Nova
- 4 Thursday 22 April (Drivers' Day)
- 4.1 recordings
- 4.2 Using Software Factory for Cinder Third Party CI
- 4.3 NVMe-oF and MDRAID replication approach - next steps for connector and agent
- 4.4 How to handle retyping/migrating nonencrypted volumes to encrypted volumes of the same size
- 4.5 Small topics
- 4.6 Cross-project meeting with Glance
- 5 Friday 23 April
- 5.1 recordings
- 5.2 Market trends and new cinder features
- 5.3 Snapshotting attached volumes
- 5.4 Multiple volumes goes to a same backend/pool as scheduler instances (HA) use only its internal state to balance the volumes among pools
- 5.5 Making the backup process asynchronous
- 5.6 Several small topics
- 5.7 OpenStack client still doesn't support cinder microversions
- 5.8 Consistent and Secure RBAC
- 5.9 Xena Cycle Priorities
This page contains a summary of the subjects covered during the Cinder project sessions at the Project Team Gathering for the Xena development cycle, held virtually April 19-23, 2021. The Cinder project team met from Tuesday 20 April to Friday 23 April, for 3 hours each day (1300-1600 UTC), with Friday's sessions stretching to 4 hours.
This document aims to give a summary of each session. More context is available on the cinder PTG etherpad:
The sessions were recorded, so to get all the details of any discussion, you can watch/listen to the recording. Links to the recordings are located at appropriate places below.
Tuesday 20 April
Greetings and some Cinder project business
Business in no particular order:
- The final release from stable/train for all deliverables is 12 May 2021, so check for any critical bugs that should be backported and get those moving along right away.
- We need to do a release from the rbd-iscsi-client to verify that all the tooling is working. Want to do this before any critical bug is discovered that will require a release. The patch formatting rbd-iscsi-client as an OpenStack project still needs reviews: https://review.opendev.org/c/openstack/rbd-iscsi-client/+/774748/
- There's a patch up placing cinder-specific dates on the OpenStack release schedule. The only notable change from the usual dates is to move the spec freeze to 2 weeks after the R-18 midcycle. (One week didn't give people enough time to revise and resubmit, which required a bunch of spec freeze exceptions. Hopefully we can avoid that this time.) Please check over the dates and leave comments if you see any problems: https://review.opendev.org/c/openstack/releases/+/786951
- The two midcycles at weeks R-18 and R-9 have been working well, so let's do that again. Format will be 2 hours. I propose holding them in the place of the cinder weekly meeting that week; that way, everyone should be able to attend for at least the one hour they have set aside for the cinder meeting. Time would probably be 1300-1500 UTC. Please check for conflicts:
- R-18 midcycle: 2 June
- R-9 midcycle: 4 August
- Some quick links to be aware of
- general info on cinder project: tiny.cc/cinder-info
- cinder group info: tiny.cc/cinder-groups
- We've got someone interested in helping with documentation. I've asked her to start by reviewing release notes on patches for clarity, grammar, etc. These are user-facing documents so it's helpful to get them correct.
- We didn't schedule time for a Wallaby Release Cycle Retrospective, so maybe we can do that during happy hour today.
Not possible for a non-admin user to determine via API if a volume will be multi-attach
Here's the context for the discussion: https://github.com/kubernetes/cloud-provider-openstack/pull/1368#issuecomment-819442191 and the comments following that one.
Basic idea is that cloud end users want to run kubernetes in OpenStack clouds. The kubernetes installation "operator" (kubernetes term, we're talking about a program, not the human operator of the OpenStack cloud) for an end user thus runs with normal OpenStack project credentials (not admin). For this particular issue, the kubernetes installer program wants to allow multi-attach block devices if the underlying cinder in that cloud supports it. (It does this by looking at the volume types available to the end user and creating a kubernetes storage class for each volume type.) Multiattach info is in volume-type extra-specs that are only visible to admins (because they could contain backend-specific info about the deployment that shouldn't be exposed to end users). So currently the only way for the installer to determine if multi-attach is possible is to create a volume of each available volume-type and to check if the resulting volume is multiattach. This is non-optimal (some might say time-consuming and stupid).
The issue of exposing non-sensitive extra-specs info to end users has come up before in the history of cinder, and the suggestion was to either name volume-types appropriately (e.g., "Gold level with multiattach") or to include this info in the volume-type description. But this is variable from cloud to cloud and not suitable for machine consumption. It would be better to have some kind of standard field for this in the volume-type-show response.
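To make the "machine consumption" point concrete, here is a minimal sketch of what a consumer such as the kubernetes installer could do if such a standard field existed. It assumes a hypothetical volume-type-show response in which non-sensitive extra specs are visible to end users under an `extra_specs` key; that response shape, the helper name, and the example type names are all assumptions for illustration, not current API behavior.

```python
def multiattach_type_names(volume_types):
    """Return the names of volume types that advertise multiattach.

    `volume_types` is a list of dicts shaped like a hypothetical
    volume-type-show response with user-visible extra specs.
    """
    names = []
    for vtype in volume_types:
        # Non-admin users may see no extra specs at all, so default to {}.
        specs = vtype.get("extra_specs") or {}
        if specs.get("multiattach", "").strip().lower() == "<is> true":
            names.append(vtype["name"])
    return names


# Hypothetical data; the type names are made up.
example_types = [
    {"name": "gold-multiattach", "extra_specs": {"multiattach": "<is> True"}},
    {"name": "bronze", "extra_specs": {}},
    {"name": "hidden-specs", "extra_specs": None},
]
print(multiattach_type_names(example_types))  # ['gold-multiattach']
```

With something like this, the installer would not need to build a volume of each type just to find out whether it is multiattach.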
A few issues came up in the course of this discussion:
- We should probably provide some clear documentation around this. For instance, if you have a volume-type that can match multiple backends, whether a built volume is multiattach will depend on both (a) whether the volume type supports multiattach, and (b) whether the scheduler puts the volume on a backend that also supports multiattach (which in turn depends on (c) how you've configured the scheduler). So to support this use case (or, really, any use case where an end user wants to know whether a volume will be multiattach without having to build a volume first), a cinder admin needs to make sure that volume-types supporting multiattach are tied to backends that also support it (and the scheduler is configured appropriately).
- but what if the backend runs out of space? Well, that's ok, the end user won't be able to create a volume in it. But that seems better than getting a volume that doesn't have the properties you want.
- this didn't come up in discussion, but it looks like there's a tension between having very general volume types (e.g., could be multiattach, but maybe not, but as long as there's space on some backend, you'll get a volume) and very specific types (e.g., only has multiattach <is> True extra spec if all the backends that map to the type support multiattach). If we're just going to expose the multiattach extra spec, then a cinder admin can't have a general type that supports multiattach when available, because that won't meet the kubernetes use case we're discussing. Now maybe that's not a big deal, because for multiattach, it seems like if you want multiattach for your workload, you really want it, and so it would be kind of pointless to have a multiattach volume-type that sometimes didn't give you a multiattach volume. But I wonder if that's the case for other non-sensitive extra-specs. What I'm getting at is I wonder whether what we want is to introduce volume-type-metadata where a cinder admin could expose whatever info the admin deems suitable for end users. The downside is that the admin would have to configure info twice, once in extra-specs and once in volume-type-metadata, but it would also make it possible to add key/values for particular consumers (e.g., "k8s:supports_multiattach": "true" (these would have to be string values), "k8s:whatever": "something", or even metadata indicating that the volume-type should be used by kubernetes at all). The upside is that the key/values could be defined by the kubernetes-cinder-csi operator, not cinder.
- Manila currently exposes a set of backend capabilities to end users while keeping others secret: https://docs.openstack.org/manila/latest/admin/capabilities_and_extra_specs.html
- This is related to the capabilities reporting that Eric discussed at the Denver PTG and put up an initial spec for: https://review.opendev.org/c/openstack/cinder-specs/+/655939
- the basic idea is that although there's capabilities reporting defined (and implemented in some drivers), it's not specific enough to help an operator to write extra-specs (e.g., supports_qos: True doesn't help you write qos extra-specs for a backend (though it does tell you to look at the driver docs or code to figure out what to do))
- this is the init_vendor_properties function: see https://github.com/openstack/cinder/commit/0e2783360ce730beed3423bee31ad9726a51c8e1
- implementing this wouldn't help the kubernetes use case directly (because this capabilities API wouldn't be exposed to end users), but getting it defined could help in the effort to identify suitable backend-capabilities that could later be exposed to end users (or at least developing a suitable vocabulary)
- Walt pointed out that we currently expose some volume-admin-metadata to the cinder admin in the volume-show response, so that could be a model for what we do here, expose select extra-specs to end users in the volume-type-show response
- might be better to add an API to ask what volume types support a feature the end user is interested in
- useful capabilities to expose:
- availability zones
- volume replication
- online extend capability
- might want to make this configurable (sort of like resource_filters) so cinder admin can decide what to expose?
- this would be useful to Horizon or other UIs -- for example, don't offer an end user the option to extend an attached volume if the volume-type doesn't support it
- This would be a good addition to Cinder
- For the OpenShift/kubernetes use case, exposing some extra-specs in a standard format to non-admin users is sufficient.
- Matt Booth and Tom Barron will draft a spec
- the Cinder team will review and help keep the spec moving along
- Eric will resuscitate his capabilities API spec; the team expressed general support for it
The Interoperability Working Group
Part of what the Interop WG does is to determine a set of tests that cloud operators can run to demonstrate the portability of workloads across OpenStack clouds. Vendors who pass the tests can use the OpenStack logo. Currently, these are a specific subset of tempest tests defined in the interop git repository.
The idea is that the tests cover two releases back and one forward. So the 2020.11 guidelines cover ussuri, victoria, and wallaby.
You can have "required" or "advisory" capabilities. The tests for "required" capabilities must pass for a vendor to be allowed to use the logo. A capability must be "advisory" for at least one cycle before it can become required.
The Interop WG is interested in learning what new features have been added to Cinder in Wallaby that should be included as advisory capabilities. What they're looking for is anything that might impact workload portability.
If we identify such a capability, we can write a tempest test for it, and then propose it to the Interop WG for inclusion as an advisory capability. What happens is that vendors use refstack to run the tests in their clouds. Refstack allows submission of results directly to the Foundation. That way, the Interop WG can get data about advisory capabilities (for example, if those aren't being exposed in OpenStack clouds, then they may not be candidates for becoming "required").
- The current tempest tests cover at most the 3.0 microversion.
- Many of the subsequent microversions make the Block Storage API more user friendly, but may not impact workload portability.
- Cinder team: will review mv changes between 3.0 and 3.64 to see if any "advisory" tests should be added
- Arkady: will follow up with the cinder team
Wallaby cycle retrospective
We clearly have a review bandwidth problem and need to start looking for new cores to help us out.
People currently working with Cinder who are interested should reach out to the PTL (rosmaita) or any of the current cores to get some guidance both about what we'd like to see from the Cinder side as well as some guidance about how to approach your management to get yourself sufficient time to devote to the project.
Gorka asked whether we could have "specialist" cores who have expertise in specific areas. We actually have a precedent for this with volume drivers. Lucio (who unfortunately for us switched jobs) had done a lot of high-quality driver reviews, and the core team agreed that although he wasn't experienced with all of Cinder yet, he could continue to develop that knowledge and only +2 patches that he was comfortable with. This allowed us to increase review bandwidth without sacrificing review quality.
So the point is don't feel like you have to know all of Cinder before even thinking about becoming a core. Similarly, current cores should keep this in mind when looking for candidates.
- Current cores should keep a watch for people who might be helpful. Since this is a personnel matter, I'd like to keep a bit of confidentiality here, so circulate an email only to the current cores (that is, not on the mailing list) so we can discuss among ourselves and the PTL can contact the candidate directly with constructive feedback. Of course, when we reach the stage that someone's been identified as a serious core candidate, we will have the usual vote in public on the mailing list.
- Some tips for core hopefuls:
- don't leave +1 or -1 without comments on patches ... say something about what you looked at in the patch. If the code looks great, check the unit tests. If they look great too, check the test coverage report. If it's a driver, check the third-party CI results. You may not find anything negative, which is perfectly OK, but leave something more than your +1 indicating what's good about the patch
- look at -1s left by core reviewers to see what kind of issues they're looking for when reviewing patches
Wednesday 21 April
Removing the Block Storage API v2
The Block Storage API v2 was deprecated in Pike, and has been eligible for removal for quite a while. It didn't happen in Wallaby, and the lesson learned there is that it must be removed by Milestone-1, or it won't happen.
Xena Milestone-1 is week R-19 (the week of 24 May), so roughly 1 month away.
We've got two deliverables affected by this: the python-cinderclient, and cinder itself.
We proposed removing v2 support from the cinderclient in a series of emails to the ML in Fall 2020:
There were no objections raised, so we should be good to go.
This will happen in a series of patches:
- there's a client class that gives you a v2 or v3 client; change this to return a v3 client only
- remove the v2 client class completely and revise tests
- remove any v2 classes that only import v3 classes and revise tests
- for any v3 classes that simply import the v2 class, move the v2 class to v3 and revise any affected tests
- for any remaining v2 classes, replace 'v2' in the module name with 'internal' (or something else that indicates that they aren't supposed to be used directly by consumers) and revise tests
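The first step in the list above (a client class that hands back only a v3 client) could look roughly like the following. This is a stand-alone sketch, not the cinderclient code: the function name, the constant, and the exception class defined here are illustrative stand-ins.

```python
class UnsupportedVersion(Exception):
    """Raised when a caller asks for an API version that is not available."""


# After the removal only the v3 bindings remain (illustrative constant).
SUPPORTED_MAJOR_VERSIONS = ("3",)


def client_module_for(version):
    """Map a requested API version (e.g. "3" or "3.64") to a module path."""
    major = str(version).split(".")[0]
    if major not in SUPPORTED_MAJOR_VERSIONS:
        raise UnsupportedVersion(
            "Block Storage API v%s is not supported; use v3" % major)
    return "cinderclient.v%s.client" % major


print(client_module_for("3.64"))  # cinderclient.v3.client
try:
    client_module_for("2")
except UnsupportedVersion as exc:
    print(exc)
```

Failing loudly on a v2 request, rather than silently returning a v3 client, makes the backwards-incompatible change visible to consumers right away.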
Unlike the cinderclient v2 classes, which are specifically designed to be used by consumers as python language bindings for the Block Storage API v2, the cinder v2 classes aren't meant to be exposed to consumers. Thus it's not critical to remove/rename those classes (at least not right now), so we can just leave those as they are. (This will also make backports easier.)
What we need to do for cinder, then, is:
- no longer return v2 as a version option
- return 404 when the v2 API is requested (or whatever we currently do for v1)
- update tests
- the v2 section of the api-ref will need to be removed
- documentation update to remove v2 references
Impact on other teams:
- osc has code that checks for cinderclient v1 by trying to do an import from whatever cinderclient it's got available; could put up a patch with same check for v2
- openstacksdk currently only supports v2 and v3; it's branched, so could remove v2 from the xena development branch (looks like the v3 classes have no dependencies on the v2 classes)
- will need to remove the v2 endpoint from the service catalog (probably more a devstack thing than a keystone thing?)
- verify that they're using v3 (checked during the cross-project session with the nova team; they have no v2 dependency)
- looks like they use only v3: https://github.com/openstack/ironic/blob/af0e5ee096fa237290776969a37f3bced96b7456/ironic/common/cinder.py#L19
- alan has already removed v2 dependency/support
- send another general announcement to the ML saying that we really mean it this time
- cinderclient: someone posed this question: "Since this is a backwards incompatible change, how are we going to prevent deployments that still have v2 from breaking with a pip -U ?"
- I don't know that there's anything we can do other than send a message to the mailing list announcing that version 8 of the python-cinderclient will no longer support the Block Storage API v2, and if anyone requires v2 support, any occurrence of 'pip install -U python-cinderclient' in a script (or written instructions) should be replaced with "pip install -U 'python-cinderclient<8.0.0'"
- rosmaita will post the python-cinderclient patches soon-ish
- rosmaita will coordinate with enriquetaso and whoami-rajat (and anyone else interested) about the cinder-side changes
- rosmaita will check with osc/sdk team
- rosmaita will check with qa team about devstack
- rosmaita will send general announcement to the ML
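For deployments that worry about the `pip -U` question above, a deployment script could guard on the client version before upgrading. The sketch below relies only on the statement above that version 8 of the python-cinderclient drops v2 support; the helper name is made up, and it expects a plain dotted version string.

```python
def supports_block_storage_v2(client_version):
    """Return True if this python-cinderclient version still has v2 support.

    Assumes v2 support ends with major version 8, as announced above.
    """
    major = int(client_version.split(".")[0])
    return major < 8


print(supports_block_storage_v2("7.4.0"))  # True
print(supports_block_storage_v2("8.0.0"))  # False
```

A script that still needs v2 would keep the `python-cinderclient<8.0.0` constraint for as long as this check matters to it.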
mypy status and next steps
Eric reported that there are still some cinder mypy patches unreviewed, and some new os-brick patches: https://review.opendev.org/q/(project:openstack/cinder+OR+project:openstack/os-brick)+topic:%2522mypy%2522+(status:open)
The pile of patches is getting large and they depend on each other, so it's a PITA to keep them current. We decided to do a review festival so everyone can set aside some time to review them.
Eric reminded us that we're not shooting for full coverage right now, we're just trying to cover as much as possible with minimal interruptions and hit the interesting cases first. Also, these changes aren't likely to impact the way the code is running now.
There are some open issues:
- still need to figure out how to handle cinder oslo versioned objects
- need to figure out how to handle libraries that we import that don't have annotations
- Eric has a patch up to automatically add annotations to oslo libraries that we use, though he's not sure if this is a good long-term solution
- Festival of mypy Reviews, Friday 30 April 2021 from 1400-1600 UTC, https://meetpad.opendev.org/cinder-festival-of-reviews
- It would be good to land the zuul mypy experimental job: https://review.opendev.org/c/openstack/cinder/+/736857
- Walt is interested in looking at adding type annotations to the rbd-iscsi-client library
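For anyone who hasn't looked at the mypy patches yet, here is a small, made-up example of the kind of annotation being added. The function is hypothetical and not taken from cinder or os-brick. The annotations are not enforced at runtime, which is why these changes shouldn't affect how the code runs today; and because mypy by default does not check the bodies of unannotated functions, coverage can grow one function at a time.

```python
from typing import Optional


def volume_size_gib(volume: dict, default: Optional[int] = None) -> int:
    """Return the size of a volume record in GiB.

    mypy would reject a call such as volume_size_gib("not-a-dict") before
    the code ever runs; at runtime the annotations are ignored.
    """
    size = volume.get("size", default)
    if size is None:
        raise ValueError("volume has no size")
    return int(size)


print(volume_size_gib({"size": 10}))   # 10
print(volume_size_gib({}, default=1))  # 1
```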
Quotas testing
Gorka fixed a bunch of long-standing quotas bugs in Wallaby and has some ideas for simplifying the code and fixing some more. It would be good to have some testing to give the code a workout and be able to detect regressions.
Ivan had suggested that we could use Rally to test quotas, but he couldn't find a good example for using it to do what we want. Also, we probably want high concurrency, and Rally doesn't look very concurrent.
Some possible tests:
- Need multiple users doing things at the same time.
- Create/delete volumes and snapshots
- Probably want to start with artificially low quotas.
- Not sure how much space we have to test in the check ... but if we use the fake driver, that's not an issue
- Ivan isn't sure that the fake driver is still working, but it should be easy to fix if there are issues
- Manage/unmanage testing
- this functionality would have to be added to the fake driver, so we can hold off on this for now
Gorka mentioned that some of the bugs he fixed are related to temporary resources, so maybe we can add code to do a whole tempest run and validate that the quotas are correct at the end. Ivan mentioned that he's not sure that the current tempest tests clean everything properly, and we want to make sure we're testing cinder quotas, not tempest. (Gorka mentioned later that tempest cleanup doesn't matter for these tests. Cinder is in charge of creating/deleting temporary resources, so even if the "real" resources aren't cleaned up afterward by tempest, if what we're interested in is making sure temporary resources are correctly counted, we should be ok.)
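The shape of such a test (several users acting concurrently, an artificially low quota, and a check at the end that the recorded usage matches what actually exists) can be sketched without any cinder code at all. Everything below is a stand-in: `FakeQuota` is an in-memory toy, not the cinder quota engine, and the numbers are arbitrary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeQuota:
    """In-memory stand-in for a per-project volume quota (not cinder code)."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0
        self.volumes = set()
        self._lock = threading.Lock()
        self._next_id = 0

    def create(self):
        with self._lock:
            if self.in_use >= self.limit:
                return None  # over quota: request rejected
            self._next_id += 1
            self.in_use += 1
            self.volumes.add(self._next_id)
            return self._next_id

    def delete(self, volume_id):
        with self._lock:
            if volume_id in self.volumes:
                self.volumes.remove(volume_id)
                self.in_use -= 1


def worker(quota, iterations):
    """One simulated user: create volumes, deleting every other one."""
    for i in range(iterations):
        vol = quota.create()
        if vol is not None and i % 2 == 0:
            quota.delete(vol)


def run(users=8, iterations=50, limit=5):
    quota = FakeQuota(limit)  # artificially low limit
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(worker, quota, iterations)
                   for _ in range(users)]
        for future in futures:
            future.result()  # re-raise any worker exception
    return quota


quota = run()
# The invariants a regression test would check at the end of the run.
assert quota.in_use == len(quota.volumes)
assert 0 <= quota.in_use <= quota.limit
```

In a real job the final comparison would be between the usage cinder reports and the resources that actually exist in the project, including any temporary resources that should have been cleaned up.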
Someone asked about the status of Keystone Unified Limits, but no one has looked into that recently.
- start with adding quota checking to the tempest jobs
- Ivan will continue to look at Rally
Fix up volume driver classes
The situation is:
- we have some abstract driver classes, namely, BaseVD, CloneableImageVD, MigrateVD, ManageableVD, and ManageableSnapshotsVD
- we also have a concrete driver class VolumeDriver that inherits from ManageableVD, CloneableImageVD, ManageableSnapshotsVD, MigrateVD, BaseVD
- most of the in-tree third party drivers just inherit from VolumeDriver
- plus we have the interface classes that are used to test that drivers implement the required functions
Breaking out functionality into the various abstract VD classes doesn't seem to have helped much, and makes it confusing for new driver implementers. It would be better to go back to just having the VolumeDriver.
So, how can we remove the abstract VD classes? First step would be to deprecate them. We can fix any impacted in-tree drivers, but there could be out-of-tree drivers that will be impacted.
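One way the deprecation step might be sketched: warn whenever a driver inherits directly from one of the abstract classes, while staying quiet for drivers that inherit from VolumeDriver. The class names come from the list above, but the bodies are empty stand-ins and the `_DeprecationCheck` helper is hypothetical; none of this is the actual cinder implementation.

```python
import warnings

DEPRECATED_BASES = {"BaseVD", "CloneableImageVD", "MigrateVD",
                    "ManageableVD", "ManageableSnapshotsVD"}


class _DeprecationCheck:
    """Hypothetical helper: warn on direct use of a deprecated base."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.__name__ == "VolumeDriver":
            return  # the one class that is allowed to combine the bases
        direct = sorted(base.__name__ for base in cls.__bases__
                        if base.__name__ in DEPRECATED_BASES)
        if direct:
            warnings.warn(
                "%s inherits directly from %s; inherit from VolumeDriver "
                "instead" % (cls.__name__, ", ".join(direct)),
                DeprecationWarning, stacklevel=2)


class BaseVD(_DeprecationCheck): pass
class CloneableImageVD(_DeprecationCheck): pass
class MigrateVD(_DeprecationCheck): pass
class ManageableVD(_DeprecationCheck): pass
class ManageableSnapshotsVD(_DeprecationCheck): pass


class VolumeDriver(ManageableVD, CloneableImageVD, ManageableSnapshotsVD,
                   MigrateVD, BaseVD):
    pass


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    class InTreeStyleDriver(VolumeDriver):      # no warning
        pass

    class LegacyOutOfTreeDriver(MigrateVD, BaseVD):  # one warning
        pass

print(len(caught))  # 1
```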
- rosmaita will send a note to the mailing list to warn out-of-tree driver maintainers that this is coming, and see if we get a response.
Cinder throttle and cgroup v2
Cinder has been using cgroup v1 for (optional, operator-configured) throttling for a long time. cgroup v2 has been around for a long time as well, but there hasn't been a big reason to switch to v2 ... until Fall 2020, when container systems began supporting v2. So now linux distributions don't have a reason to worry about v1 any more, and many are making v2 the default and talking about removing v1 support altogether.
Given the above, it looks like we can just switch to using cgroup v2 and not worry about also supporting v1.
The libcgroup library has code merged (commits 9fe521f, 8fb91a0, da13073)--but not yet released--that adds cgroup v2 support to the cgroup tools we are currently using, so once that's released, we'll be able to use our current code (with a minor change for the different control group name used for i/o in v2).
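To show the "different control group name" concretely: cgroup v1 calls the block I/O controller `blkio`, while v2 calls it `io`. The sketch below detects a v2 (unified) hierarchy by the `cgroup.controllers` file at the cgroup root and builds a `cgexec` prefix accordingly. The helper names are made up, and the detection is there only to make the difference visible; if cinder moves to v2 only, the name could simply be hardcoded.

```python
import os


def io_controller(cgroup_root="/sys/fs/cgroup"):
    """Return the I/O controller name for the mounted cgroup hierarchy.

    A cgroup v2 (unified) mount exposes a `cgroup.controllers` file at its
    root; v1 does not. v1 names the block I/O controller "blkio", v2 "io".
    """
    if os.path.exists(os.path.join(cgroup_root, "cgroup.controllers")):
        return "io"
    return "blkio"


def cgexec_prefix(group, cgroup_root="/sys/fs/cgroup"):
    """Build the command prefix used to run a process inside `group`."""
    controller = io_controller(cgroup_root)
    return ["cgexec", "-g", "%s:%s" % (controller, group)]


# The group name is arbitrary; the result depends on the host's cgroup setup.
print(cgexec_prefix("cinder-volume-copy"))
```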
- We should move cinder to using cgroup v2 in Xena; further, there is no reason to worry about v1-compatibility
- rosmaita will check on the status of a new libcgroup release and report back at the R-18 midcycle (the current release is 0.42.0, and 2.0rc1 was just tagged, so we should be able to get this done in cinder before M-3 for sure)
Cross-project meeting with Nova
There are good notes about this session on the etherpad: https://etherpad.opendev.org/p/nova-cinder-xena-ptg
- The cinder team didn't have any objections to Lee's proposals that can't be worked out on the patches.
- Nova is OK with cinder updating os-brick requirements, though the current approach may be a tad aggressive.
Thursday 22 April (Drivers' Day)
Using Software Factory for Cinder Third Party CI
Adam Krpan from Pure Storage gave a presentation about how he used Software Factory to set up Pure's Third Party CI system. Here's a link to the recording cued up to 7:06: https://www.youtube.com/watch?v=hVLpPBldn7g&t=426 so you can see for yourself.
These are some useful links connected with this:
- software factory: https://www.softwarefactory-project.io/
- WIP 3rd-party CI guide: https://softwarefactory-project.io/r/c/software-factory/sf-docs/+/17097
- OpenStack guide to Zuul v3 job migration: https://docs.openstack.org/project-team-guide/zuulv3.html
During the discussion, Peter Penchev mentioned that he used an older version of Software Factory to set up the Storpool CI, but he hasn't tried upgrading yet.
- Adam is working on a blog post about this
- the software factory 3rd party CI guide needs reviews! https://softwarefactory-project.io/r/c/software-factory/sf-docs/+/17097
- it sounded like Peter and Adam had overall positive experiences using software factory, so the cinder team encourages other third-party CI maintainers to take a look -- having a software factory user community could have a lot of benefits to the cinder 3rd party CIs
NVMe-oF and MDRAID replication approach - next steps for connector and agent
Here are the slides from Zohar's discussion: https://docs.google.com/presentation/d/1lPU8mQ7jJmr9Tybu5gXkbE7NC1ppkMnoBS4cgSFhzWc
And here's the recording cued up to 46:42 if you want to follow along: https://www.youtube.com/watch?v=hVLpPBldn7g&t=2802
Zohar worked on upgrading the os-brick nvmeof connector during Wallaby so it can do mdraid on the hypervisor (client-side replication). In Xena, he wants to add a healing agent to increase resiliency. This agent would run on the hypervisor as an independent process configured by the operator.
Gorka suggested that the code can live in os-brick and be accessible to operators as a console script. This is a bit nonstandard, but it is pretty convenient because the agent will only be used in places where os-brick is installed, so it's nice for the code to be there for you when you install os-brick. Plus, as far as the Cinder Community goes, libraries that have a console script apparently don't bother us, because we've already done this in cinderlib: https://github.com/openstack/cinderlib/blob/3f20993c2d8e5f9d4ce0fb250e5b186d3c3fbc73/setup.cfg#L36
- Kioxia is taking a general approach to the agent so that it can be used as a basis for other nvmeof solutions using mdraid that want a healing agent
- The cinder team consensus is that having the agent code live in os-brick and be accessible to operators via a console script is preferable to having the agent code live in its own OpenStack project (reduces a lot of project administration overhead)
- There's still an open question: the healing agent will need to use the REST client code that the Kumoscale driver in cinder uses -- is there a way we can consume it in os-brick without copying it over?
How to handle retyping/migrating nonencrypted volumes to encrypted volumes of the same size
Figuring out how to do this is turning out to be a very difficult problem. Sofia presented her latest thinking. See the etherpad and recording for full details; I'll try to give a quick summary here.
- etherpad: https://etherpad.opendev.org/p/apr2021-ptg-cinder (look around line 462)
- recording: https://www.youtube.com/watch?v=xWUO4TufqEM (start right at the beginning)
The problem we have is that the encryption metadata for an encrypted volume is stored in the volume. Thus the usable size of an encrypted volume is strictly less than the usable size of a non-encrypted volume of the same nominal size. Further, while the usable size of a non-encrypted volume is clearly known by an end user (it's the number of GiB that the user has requested), the usable size of an encrypted volume is not, because the size of the encryption header varies by encryption method (1-2 MB for luks1, up to 16 MB for luks2). So if you fill up a non-encrypted volume, and then want to retype it to an encrypted volume, the operation can fail because all the data can't be fit into the available space. This isn't a big deal when you have a laptop and you buy a new drive: you know that when you encrypt it, you will lose some space. But that isn't very "cloudy", because cloud users want to be able to retype volumes (which may entail a migration to a different backend), etc., and a 2 GiB volume of type A should be retypeable to a 2 GiB volume of type B. Gorka pointed out that this is a paradigm shift in how cinder thinks about volume size (and all the current code was written using the old paradigm).
There's a similar size problem with creating a volume from an image: if the image is almost as many bytes as the requested volume size, the image won't fit into an encrypted volume even though users would expect that since, say, 1073741000 bytes < 1GiB, the image should fit in a 1GiB volume.
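The arithmetic behind both failures is easy to check. The sketch below uses the header sizes quoted above, treated as MiB and taking the upper bound for each method (both assumptions); the function names are made up.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Header sizes from the figures above, treated as MiB (an assumption).
HEADER_BYTES = {"none": 0, "luks1": 2 * MIB, "luks2": 16 * MIB}


def usable_bytes(size_gib, encryption="none"):
    """Bytes available for data once the encryption header is subtracted."""
    return size_gib * GIB - HEADER_BYTES[encryption]


def image_fits(image_bytes, size_gib, encryption="none"):
    return image_bytes <= usable_bytes(size_gib, encryption)


image = 1073741000  # just under 1 GiB (1073741824 bytes)
print(image_fits(image, 1))           # True
print(image_fits(image, 1, "luks1"))  # False
print(image_fits(image, 1, "luks2"))  # False
```

The same check explains the retype failure: a full 1 GiB non-encrypted volume holds 1073741824 bytes, which is more than either encrypted variant of the same nominal size can take.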
Sofia's proposal is that we have a "display size" and a "real size" for each volume. The display size is what users would see and any volume of display size n could be retyped to a volume of display size n regardless of whether either is of an encrypted type or not. Similarly, a 1 GiB encrypted volume could hold a glance image of 1073741824 bytes.
(Side point: on this model, when an encrypted volume is uploaded to Glance as an image, the image size will be the real size of the volume in bytes. So an encrypted volume with display_size==1GiB will have an image size strictly greater than 1GiB, which normally would mean it cannot be used to create a 1GiB volume. It would have the cinder_encryption_key_id image property, though, and presumably it would be used to create a volume of an encrypted volume type (or it wouldn't be readable), so maybe this isn't a problem. We just don't want to force an end user to have to select a volume of size 2GiB to restore an image of a 1GiB volume.)
The API and quotas/usage would use the display size, but everywhere else in cinder would use the real size. Operators might want to know the real size for accounting purposes, so this would need to be documented clearly and we'd have to include everything in the volume-create notification so accounting can decide how to charge: by display size, or by real size, or by display size + an encryption premium to make up for the extra space, or ...
Some of the problems we have to figure out:
- we need a generic way to handle retype for drivers that don't have a rich feature set
- we also have the issue that ceph and nfs-based drivers use qemu-img for encryption instead of cryptsetup
- driver differences:
- some drivers may be able to give us the requested GiB size + a few MiB for the encryption metadata
- some may only give us GiB sizes, so we'd have to request a 2 GiB volume for a display_size of 1 GiB
- some drivers just increase by 8GiB multiples
- we don't want to create a corresponding problem in the other direction. That is, suppose that the user asks for an encrypted volume of size S. We provide a volume of size S + some slack M. Suppose the encryption header uses E which is less than M. If the user is able to fill up S + (M - E), then if we retype to a non-encrypted volume of size S, we won't be able to fit the extra data into the retyped volume
- need to figure out the impact on backup/restore
- need to figure out the impact on uploading a volume as an image to Glance (and creating a volume from that image)
- need a plan for handling "legacy" volumes
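To get a feel for the driver differences and the "slack" problem listed above, here is a rough calculation of the real size a backend would have to allocate for a given display size. The 16 MiB header and the three allocation granularities are example inputs, and the function names are made up.

```python
MIB_PER_GIB = 1024


def real_size_mib(display_gib, header_mib, granularity_mib):
    """Smallest allocation, in MiB, that holds the display size plus header."""
    needed = display_gib * MIB_PER_GIB + header_mib
    units = -(-needed // granularity_mib)  # ceiling division
    return units * granularity_mib


def slack_mib(display_gib, header_mib, granularity_mib):
    """Space beyond display size + header that users must not fill up."""
    real = real_size_mib(display_gib, header_mib, granularity_mib)
    return real - display_gib * MIB_PER_GIB - header_mib


# 1 GiB display size with a 16 MiB header, for three kinds of backend.
for name, granularity in [("MiB-granular", 1),
                          ("GiB-only", 1024),
                          ("8 GiB multiples", 8 * 1024)]:
    print(name, real_size_mib(1, 16, granularity),
          slack_mib(1, 16, granularity))
# MiB-granular 1040 0
# GiB-only 2048 1008
# 8 GiB multiples 8192 7152
```

Any non-zero slack is exactly the space that would cause the reverse problem: if the user can write into it, the data will no longer fit after a retype to a non-encrypted volume of the same display size.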
We need more thorough documentation of the current failure scenarios, and then we'll need new tempest tests for these scenarios. This will have to include "filling up" a volume and then making sure that the retyped volume has all the data contained in the original. Eric has a prototype test that uses md5sum to verify the data: https://review.opendev.org/c/openstack/cinder-tempest-plugin/+/785514
- documenting the current failure scenarios (including backup/restore and create-from-image) is important; it will help us assess the proposals
- we have to distinguish the ceph/NFS case from the "real" block storage device case because they may require different solutions
What gate tests need to be done for A/A coverage of a driver?
We currently don't have any gate tests for A/A.
- General advice is to run tempest and the cinder-tempest-plugin against your A/A setup, as that can sometimes reveal problems
- A later clarification is that the particular A/A setup being talked about is DCN ("Distributed Compute Nodes", a.k.a. "Edge"). As this deployment configuration becomes more popular, it may be worth looking into setting up a gate job.
Using cinder oslo-versioned-objects (OVOs) in tests instead of dicts
I just wanted to point out a recent small refactoring to unit tests that Gorka did when fixing a driver bug.
In recent versions of cinder (where "recent" goes back farther than any of the existing stable branches), drivers are being passed Cinder OVOs (Oslo Versioned Objects), but the unit tests of many drivers are using dicts (both because dicts pre-date OVOs and because dicts are seen as easier to use in unit tests). This isn't a good practice because some subtle bugs can go undetected. Additionally, the cinder testing framework has some features that make it really easy to create and use cinder OVOs, so the "dicts are easier, it's only a unit test" excuse doesn't hold any more.
Anyway, the point is to take a look at this example to see how easy it is to convert code to using OVOs instead of dicts: https://review.opendev.org/c/openstack/cinder/+/766296/4/cinder/tests/unit/volume/drivers/netapp/fakes.py#80
- Developers and reviewers should look for this in new tests and make sure cinder objects are used where appropriate
- Driver maintainers should be aware of this issue and refactor when they modify any unit tests
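As an illustration of the kind of subtle bug a dict fixture can hide, here is a self-contained sketch. FakeVolumeObject is a simplified stand-in written for this example, not cinder's real Volume OVO, and driver_get_pool is a hypothetical driver helper.

```python
class FakeVolumeObject:
    """Stand-in for a versioned object: only declared fields exist."""

    fields = ("id", "size", "provider_location")

    def __init__(self, **values):
        unknown = set(values) - set(self.fields)
        if unknown:
            raise AttributeError("undeclared fields: %s" % sorted(unknown))
        for name in self.fields:
            setattr(self, name, values.get(name))

    def __getitem__(self, name):
        # Dict-style access is allowed, but only for declared fields.
        if name not in self.fields:
            raise KeyError(name)
        return getattr(self, name)


def driver_get_pool(volume):
    """Hypothetical driver helper with a bug: 'pool' is not a volume field."""
    return volume["pool"]


# A dict fixture lets the test author invent the missing key,
# so the buggy helper passes.
assert driver_get_pool({"id": "v1", "size": 1, "pool": "gold"}) == "gold"

# An object fixture surfaces the problem in the unit test.
try:
    driver_get_pool(FakeVolumeObject(id="v1", size=1))
except KeyError as exc:
    print("caught in unit test:", exc)
```

The point is only that an object with a fixed set of fields rejects data that a dict happily accepts; the real OVOs add typing and versioning on top of that.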
Cross-project meeting with Glance
This session was held in meetpad. There is not a recording. I had a really bad connection and this summary is mostly taken from the notes on the etherpad: https://etherpad.opendev.org/p/cinder-glance-xena-ptg
Rajat (who is a cinder core) is helping to maintain the cinder glance_store (which uses cinder as a storage backend for Glance). The cinder glance_store is fairly primitive and doesn't take advantage of many cinder features. In particular, it doesn't support multiattach, which is a serious limitation because a major Glance use case is for an operator to have a small set of public images that are used to boot instances. It's therefore very likely that there will be multiple simultaneous requests to use the same image, and without multiattach, only one of these requests can be satisfied at a time, leading to great sadness. Rajat has a glance spec up that will add multiattach support to the cinder glance_store. To do this, he will have to move glance_store over to Cinder's Attachment API (introduced in microversion 3.27, in Ocata, so it's been around for a while).
- spec: https://review.opendev.org/c/openstack/glance-specs/+/787515
- patch: https://review.opendev.org/c/openstack/glance_store/+/782200
This change won't require any work on the Cinder side, so it's fine with us!
Erno pointed out that we need to collaborate on some documentation for operators to efficiently use the cinder glance_store. For example, if the cinder backend where glance images will be stored can do multiattach, the operator should create a volume-type for that backend that supports multiattach, and that is the volume-type that should be used by the cinder glance_store. We should collect some best practices (for example, using the image-volume-cache on the cinder side, which is probably good advice even if Glance isn't using the cinder glance_store).
- Rajat will continue with the cinder glance_store multiattach spec and patch
- it would be good for someone to start a best practices document for the cinder glance_store
Friday 23 April
Market trends and new cinder features
We had a quick discussion about how feature development is driven in Cinder. Currently, our project focus is on stability and reducing technical debt (for example, removing Block Storage API v2, moving away from sqlalchemy-migrate, implementing default secure RBAC), and we are relying on people to bring feature topics to the PTG and propose specs. In other words, the project team is mostly looking for stuff to fix, not new features to implement.
Jake Thorne from NetApp mentioned that we should probably be more proactive about making sure that OpenStack doesn't get out of date with what people are expecting from clouds these days. He "volunteered" to organize a presentation or discussion or something to expose the cinder team to current market trends. The last cinder meeting of each month (which we hold in videoconference) would be a good time to do this, or something at the R-18 midcycle, or we could have a new monthly meeting devoted to this topic. There seems to be sufficient interest among the project team, so it's a matter of getting product managers and other people who pay close attention to market trends to present or lead a discussion.
- Jake will get something together; Chuck Piercey from Kioxia volunteered to help out
- The Cinder team thinks this is a great idea; once Jake and Chuck have figured out the format and date of the first discussion, we can all communicate this up to our managers, who I think will be interested in attending, and hopefully interested in helping to develop more topics
Snapshotting attached volumes
Eric noted that we allow users to snapshot attached volumes, but they must use a --force flag on the request. This seems kind of pointless, especially because he suspects that the real use case is to snapshot while attached, so it's kind of dumb to make people opt-in to the expected behavior. Additionally, nova and cinder-csi always call with the flag set.
Eric's position is that crash-consistency is what people are expecting from snapshots. Gorka suggested that maybe not all end users understand this, and the --force flag reminds them that they are only getting crash consistency. Simon mentioned that pretty much everywhere, when you ask for a snapshot, you get one, and it's up to you to know what the implications are. Eric thinks we're not gaining much by not doing what everyone else does.
Someone mentioned that when you snapshot from the nova side, the instance is quiesced before the snapshot is made. So we could enhance the documentation to explain that cinder gives you crash-consistency, and to take the snapshot from the Compute API if you want quiescence.
Walt mentioned that there's a similar situation with backups, where people want to backup attached volumes but we require the --force flag to do this. We also have this situation with the upload-volume-to-image action.
- Eric will continue to work on the spec for this: https://review.opendev.org/c/openstack/cinder-specs/+/781914
- Eric will also look into the backup and upload-volume-to-image situations and decide whether they can be dealt with in the same spec
- Documentation about snapshots/backups/images being crash-consistent should be added to the api-ref where end users will see it
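For readers who haven't looked at the request itself, this is roughly where the flag discussed above lives in a snapshot-create request body. The helper function is hypothetical and the volume ID is a placeholder; check the api-ref for the authoritative list of fields.

```python
import json


def snapshot_request_body(volume_id, name=None, force=False):
    """Build the JSON body for a snapshot-create request (sketch only)."""
    snapshot = {"volume_id": volume_id, "force": force}
    if name is not None:
        snapshot["name"] = name
    return {"snapshot": snapshot}


# Snapshotting an attached volume currently requires force=True.
body = snapshot_request_body("<volume-uuid>", name="nightly", force=True)
print(json.dumps(body, indent=2))
```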
Multiple volumes go to the same backend/pool because scheduler instances (HA) use only their internal state to balance the volumes among pools
The issue is described in one of our docs: https://docs.openstack.org/cinder/victoria/contributor/high_availability.html#cinder-scheduler
Fernando wanted to know the status of the "ongoing efforts to fix this problem" described in that doc, because he has a customer running into this issue.
At this point, the "ongoing efforts" are workarounds. One thing you can do is use the stochastic weigher in the scheduler to mitigate this.
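To see why a stochastic weigher helps, consider the following toy simulation. It is a sketch under stated assumptions, not cinder's scheduler code: the pool names and weights are made up, and the "stochastic" picker simply chooses a pool at random with probability proportional to its weight.

```python
import random
from collections import Counter


def pick_top(weights):
    """Deterministic choice: always the pool with the highest weight."""
    return max(weights, key=weights.get)


def pick_stochastic(weights, rng):
    """Random choice with probability proportional to each pool's weight."""
    pools = list(weights)
    return rng.choices(pools, weights=[weights[p] for p in pools], k=1)[0]


def simulate(picker, requests=300):
    """Every request is scheduled against the same stale weight snapshot,
    as happens when schedulers rely only on their own internal state."""
    stale_weights = {"pool-a": 50, "pool-b": 48, "pool-c": 45}
    return Counter(picker(stale_weights) for _ in range(requests))


rng = random.Random(42)
print(simulate(pick_top))
print(simulate(lambda w: pick_stochastic(w, rng)))
```

With the deterministic picker, every request lands on pool-a until the stale state is refreshed. The weighted random picker spreads requests across all three pools, which mitigates the pile-up without fixing the underlying staleness.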
Ivan pointed out that it's important to know whether the operator's goal is really A/A or whether what they want is H/A. Walt described a "semi-A/A" setup that has multiple APIs but only one scheduler (and thus the scheduler never gets out of sync). This setup addresses a problem where the API nodes were saturated with requests (for example, users and scripts polling resource status). The single scheduler runs with normal CPU and memory usage and is able to keep up with provisioning requests. (Basically, GET /volumes is free, so users make a lot of calls, but POST /volumes costs money, so users only make provisioning requests when they really mean it.)
There's a related spec: https://review.opendev.org/c/openstack/cinder-specs/+/556529 ("Moving to using the database to get information for the scheduler"). The database is probably not ideal for this, so memcached or redis may be a better fit. The key point is that we'd need common storage, because we can't count on gathering stats from the backends being fast enough that we could just ask each backend for its current state.
If what we're really after is H/A, however, we could use locks. That would effectively serialize the requests of multiple schedulers, but serialization is acceptable for H/A.
- This was a good discussion and probably worth finding on the recording if you're interested in this issue.
Making the backup process asynchronous
Walt's got a WIP patch up for this issue: https://review.opendev.org/c/openstack/cinder/+/784477
Basically, as what's considered a "normal" volume size has increased, so has the time required to perform a backup. A 2TB volume can take on the order of 10 hours, which means we're getting RPC timeouts and failed backups. Roughly, when a user requests a backup, the volume is cloned and then the clone is uploaded to the backup backend, and both steps can take a long time.
Walt's solution is to change the RPC calls to casts. The code change is simple, but getting the unit tests refactored is hard. Walt is planning to include compatibility code to handle rolling upgrades.
- Walt to continue working on this.
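The difference between a call and a cast can be sketched with a thread pool. This is a generic illustration of the two semantics, not the code in Walt's patch; the function names, durations, and status dictionary are all hypothetical.

```python
import concurrent.futures
import time

executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)
backup_status = {}


def do_backup(backup_id, duration):
    """Stand-in for the long-running work done by the backup service."""
    time.sleep(duration)
    backup_status[backup_id] = "available"


def rpc_call(backup_id, duration, timeout):
    """Call semantics: block until the worker replies or the timeout expires."""
    backup_status[backup_id] = "creating"
    future = executor.submit(do_backup, backup_id, duration)
    try:
        future.result(timeout=timeout)
        return "ok"
    except concurrent.futures.TimeoutError:
        return "timed out"


def rpc_cast(backup_id, duration):
    """Cast semantics: hand off the work and return without waiting."""
    backup_status[backup_id] = "creating"
    return executor.submit(do_backup, backup_id, duration)


print(rpc_call("b1", duration=0.3, timeout=0.1))  # caller gives up early
pending = rpc_cast("b2", duration=0.3)
print(backup_status["b2"])  # work is still in progress
pending.result()
print(backup_status["b2"])  # worker finished on its own schedule
```

With a cast, the caller can no longer learn about success or failure from a return value, so the worker has to record the outcome somewhere the caller can look it up later (here, the status dictionary).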
Several small topics
Volume list queried by error state should contain error_managing state
Need to speed up review of this patch and fix this particular bug, then follow up with a more thorough fix removing these weird states.
Gorka has a patch up to introduce a new db column for quotas that may get rid of a few of these.
We should be using user messages to get this kind of info to end users instead of trying to smuggle it into the status field.
Support revert any snapshot to the volume
One issue that's still not addressed in the spec is how the end user will see the snapshot chain (it goes from linear to being a tree).
Most modern storage can do this in an efficient way, so even if the lowest common denominator performance is bad, there are good reasons to allow this. Probably ask the driver to do it, and if it can't, use the LCD method.
- review the spec carefully so the coding can start
Update volume az after service restart when the volume backend az was changed
- This seems more appropriate as something implemented in the cinder-manage tool rather than something we try to do automatically.
OpenStack client still doesn't support cinder microversions
Currently, microversion support for cinder in osc has to be added individually for each microversion (mv). Example: https://review.opendev.org/c/openstack/python-openstackclient/+/761633
The OpenStack SDK is supposed to support microversions, but it's also reputed to be an "opinionated" interface to the OpenStack APIs. If so, it can't replace the cinderclient because we need to supply python language bindings for the entire Block Storage API.
People want to use the osc. One thing we can do on the cinder side is to add support to the osc for any new microversions. It won't solve the catch-up issue (currently cinder is at 3.64 and osc has support for one mv (3.42)), but it should help going forward. One problem is that the Block Storage API interprets a request for mv 3.64 to be a request for 3.1 through 3.64. So it's possible that a request for a recent mv could include some elements in the API response that the osc isn't ready for. But we can deal with that when it happens.
- Changes that introduce a new microversion into the Block Storage API require a cinderclient patch and osc patch before they can be merged. (We will not, however, require that the osc patch be merged in order to merge the cinder patch.)
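A small sketch of how a client might request a microversion and decide whether a feature is covered by it. The helper names are hypothetical; the header format is the one the Block Storage API uses for microversion requests.

```python
def parse_microversion(text):
    """Turn '3.64' into (3, 64) so versions compare numerically."""
    major, minor = text.split(".")
    return int(major), int(minor)


def supports(requested, introduced_in):
    """True if a feature introduced at one mv is covered by the request."""
    return parse_microversion(requested) >= parse_microversion(introduced_in)


def version_header(requested):
    """Header used to request a Block Storage API microversion."""
    return {"OpenStack-API-Version": "volume %s" % requested}


print(version_header("3.64"))
print(supports("3.64", "3.42"))  # a 3.64 request includes 3.42 behavior
print(supports("3.9", "3.42"))   # numerically, 3.9 is older than 3.42
print("3.9" >= "3.42")           # naive string comparison gets this wrong
```

Because a request for a given microversion implies everything before it, a client that sends a recent version has to be prepared for every response change introduced up to that point.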
Consistent and Secure RBAC
This is a large cooperative effort across all OpenStack projects that use policies because, basically, until everyone supports the changes that have been in keystone since Queens (!), they can't be used by anyone. These changes are a big win for operators because they facilitate a default policy configuration that operators have been requesting for years, but which is very complicated to implement using the pre-Queens keystone policy paradigm (which is the way Cinder currently handles policies). The 2020 User Survey indicates that Cinder is used in 87% of OpenStack installations, so this impacts a lot of operators, and getting this done in Cinder is a priority for this development cycle.
Lance's general summary of the current status across OpenStack with links to more info: http://lists.openstack.org/pipermail/openstack-discuss/2021-April/022117.html
Roughly, the situation is that Keystone provides two dimensions on tokens: scope and roles. Currently Cinder recognizes only roles but can be converted easily to recognizing project scope + roles. It will take a bit more work to get Cinder to recognize system scope, but it looks like we have a way forward. So here's what needs to be accomplished:
- Natural language description of the (eventual) default configuration. This will be helpful to operators, but has also located some deficiencies in the current cinder policies. It also gives us something against which to validate tests of the default policies. https://review.opendev.org/c/openstack/cinder/+/763306
- Policy update patches (adding project scope): https://review.opendev.org/q/project:openstack/cinder+topic:secure-rbac
- Testing patches. Groundwork patch is https://review.opendev.org/c/openstack/cinder-tempest-plugin/+/772915 (can safely be merged now). Initial test patches: https://review.opendev.org/q/project:openstack/cinder-tempest-plugin+topic:secure-rbac
- Client changes to support system scope: https://review.opendev.org/c/openstack/python-cinderclient/+/776469
- Relax the cinder REST API to handle system scope: https://review.opendev.org/c/openstack/cinder/+/776468
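The two token dimensions (scope and roles) can be illustrated with a toy check. This is a simplified sketch, not oslo.policy or cinder code: the token dictionaries and their keys are assumptions made for this example.

```python
def project_scoped_allowed(token, required_role, target_project_id):
    """Role must be present and the token scoped to the target project."""
    return (
        required_role in token.get("roles", [])
        and token.get("project_id") is not None
        and token.get("project_id") == target_project_id
    )


def system_scoped_allowed(token, required_role):
    """Role must be present and the token scoped to the whole system."""
    return (
        required_role in token.get("roles", [])
        and token.get("system_scope") == "all"
    )


# Hypothetical tokens: a project member and a system administrator.
member = {"roles": ["member"], "project_id": "proj-1"}
sysadmin = {"roles": ["admin"], "system_scope": "all"}

print(project_scoped_allowed(member, "member", "proj-1"))
print(project_scoped_allowed(member, "member", "proj-2"))
print(system_scoped_allowed(sysadmin, "admin"))
print(system_scoped_allowed(member, "admin"))
```

A roles-only check corresponds to dropping the scope comparison in each function, which is why adding project scope is the easier first step and system scope needs more work.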
We're still working on how to handle the Block Storage API v3 URLs, which mostly require a project_id in the path. With a pure system-scoped token as used by administrators, there's not an associated project. The current idea is to use 00000000-0000-0000-0000-000000000000 as a filler when working in system scope and augment the cinderclient to use this appropriately (and document this for admins/scripts that use curl or the requests library directly to interact with the API).
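As a sketch of what the filler idea might look like for someone building URLs by hand (the endpoint below is a made-up example, the helper is hypothetical, and the approach itself was still under discussion):

```python
SYSTEM_SCOPE_FILLER = "00000000-0000-0000-0000-000000000000"


def volumes_url(endpoint, project_id=None):
    """Build a v3 volumes URL, using the all-zero filler when there is
    no project (i.e. when working with a system-scoped token)."""
    return "%s/v3/%s/volumes" % (
        endpoint.rstrip("/"),
        project_id or SYSTEM_SCOPE_FILLER,
    )


# Hypothetical endpoint, for illustration only.
print(volumes_url("http://cinder.example.com:8776", project_id="proj-1"))
print(volumes_url("http://cinder.example.com:8776"))
```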
Though we will not be implementing personas that utilize domain scope as part of this effort, we need to make sure we don't make any design decisions that will require major refactoring when they are implemented later.
- we can work immediately on the natural language description, and the project-scoped changes (links to patches are above); aim to have this part done by milestone-1
- concurrently work on the changes needed to support system scope and have this done by milestone-2
Xena Cycle Priorities
throughout the cycle
- address stability issues in upstream CI: use gerrit topic cinder-xena-ci-stability on your patches
- improve test coverage in cinder-tempest-plugin (including new quotas tests discussed at the PTG): use gerrit topic cinder-xena-ci-coverage
- address usage of deprecated stuff anywhere in the cinder project deliverables (that is, in cinder, os-brick, cinderlib, python-cinderclient, python-brick-cinderclient-ext, rbd-iscsi-client): use topic cinder-xena-cleanup on your patches
- drivers: third party CI stability
- community building
before Xena milestone-1
- remove Block Storage API v2 support from the client
- remove the Block Storage API v2 from cinder
- secure RBAC
before Xena milestone-2
- complete secure RBAC
- replace sqlalchemy-migrate with alembic (will be discussed at R-18 midcycle meeting)
before Xena milestone-3
- patches for approved specs and driver feature blueprints