
Cinder Bobcat PTG Summary

Introduction

The seventh virtual PTG for the 2023.2 Bobcat cycle of Cinder was conducted from Tuesday, 28 March 2023 to Friday, 31 March 2023, 4 hours each day (1300-1700 UTC). This page provides a summary of all the topics discussed throughout the PTG.

Cinder Bobcat Virtual PTG 29 March 2023


This document aims to give a summary of each session. More context is available on the cinder Bobcat PTG etherpad:


The sessions were recorded, so to get all the details of any discussion, you can watch/listen to the recording. Links to the recordings are located at appropriate places below.

Tuesday 28 March

recordings

Announcements

  • 2023.1 (Antelope) is released!

ML: https://lists.openstack.org/pipermail/openstack-discuss/2023-March/032872.html

  • 2023.1 (Antelope) Project update in OpenInfra live

Link: https://www.youtube.com/watch?v=YdLTUTyJ1eU

PTL and TC Interaction Summary

We discussed the challenges faced by new contributors in OpenStack and what steps we can take to help improve the process. Some of the difficulties discussed are:

  • Gerrit interface isn't very intuitive
  • Devstack errors are not easy to debug and resolve: the Cinder team didn't mandate installing devstack for Outreachy contributions this time, which significantly improved contributions.
  • OpenStack requires more devops/linux-style knowledge than many other OSS projects


There is a crash course for learning Linux concepts

Link: https://missing.csail.mit.edu/

The Manila team is working on a guide for Outreachy applicants

Link: https://wiki.openstack.org/wiki/Outreachy_Applicants_Guide#Outreachy_Applicants_Guide

2023.1 (Antelope) Retrospective

What was good?

  • Added a new core -- Jon Bernard


What was bad?

  • Delays to RC1 and RC2 due to certain fixes


What should we stop doing?

  • Someone mentioned that they don't like the recent practice of adding reviews to the meeting agenda, but on the other hand, it does get them some attention
    • Keystone has a "reviewathon" on Fridays; they separate managing bugs/reviews from doing reviews in a meeting. We also have "festivals", but not every week.
    • It would be good to have more people joining
    • It was also mentioned that we should have a meeting where people can bring their own patches for reviews (not specifically XS)
    • There was a concern regarding driver patches waiting a long time for third-party CI to respond, with patches kept waiting even after the CI passes
      • We can discuss their CI status and, if a CI is not reporting, flag it and warn the vendor to fix it


What should we continue doing?

  • Festival of XS reviews
  • Once-a-month video team meeting


Cinder contribution Information:

Link: https://tiny.cc/cinder-info

Outreachy Overview

Sofia provided a great presentation on Outreachy, which is available at the link below.

Link: https://docs.google.com/presentation/d/e/2PACX-1vRrCWvWw6YV13LafHBBSu9EHm8deZu4WTjIebWt0AZEOkovbjhIY9ft9TIk75gL7HZa3lp2apRMQIli/pub?start=false&loop=false&delayms=3000

Quick question about NFS encryption

Are Dell and NetApp developers interested in getting encryption support for their drivers? If we enable encryption in the generic NFS driver, drivers inheriting from it automatically get the support, which is not something we want (see the sketch after the vendor responses below). Responses from driver vendors:

  • NetApp
    • NetApp already has backend encryption, but they don't enable it since they don't have any customer requests for encryption
  • Dell
    • No plan to use NFS encryption as there is no real ask from their customers
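
To illustrate the inheritance concern, here is a minimal sketch; the class layout mirrors cinder's NFS drivers, but the supports_encryption flag is hypothetical, for illustration only:

    # Why enabling encryption in the generic NFS driver would propagate:
    # vendor NFS drivers subclass it, so they inherit the behavior.
    class NfsDriver:                    # stands in for cinder.volume.drivers.nfs
        supports_encryption = True      # hypothetical flag, for illustration

    class VendorNfsDriver(NfsDriver):   # e.g. a NetApp/Dell NFS driver
        pass

    # The vendor driver gets the support without opting in:
    assert VendorNfsDriver.supports_encryption is True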

Cinder backup improvements

Christian couldn't attend the meeting, so here are the specs he mentioned that require attention.

tobias-urdin brought up a problem with cinder backup/restore and availability zones.

  • Bug: https://bugs.launchpad.net/cinder/+bug/1949313
  • Gorka thinks the source of the bug is that we don't pass the availability zone while creating the volume to restore to
  • One solution is to have a config option to allow cross-AZ volume/backup relations, for example, enable_cross_az_backups = true (default); see the sketch after this list
  • #action: Someone to take up the task to fix the bug
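
A minimal sketch of what such an option could look like using oslo.config (the option name comes from the discussion above; where it would live in the tree is an assumption):

    from oslo_config import cfg

    backup_opts = [
        cfg.BoolOpt('enable_cross_az_backups',
                    default=True,
                    help='Allow restoring a backup to a volume in a '
                         'different availability zone.'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(backup_opts)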

EM vs EOL for Rocky and Stein, and in general

We thought it would be a good idea to remove all jobs from a branch but still keep it for collaboration purposes, but there were a few points opposing that idea:

  • If there are multiple patches proposed to a branch where we aren't merging anything, we will end up with patches conflicting with each other
  • Keeping branches in EM signals that they are still maintained (based on the name "extended maintenance"), which isn't a good message from our side


There was also a mention of the idea that we mark a branch as EOL but still keep it for collaboration:

  • If we mark branches as EOL and still keep them, we will need to convince other projects about the proposal of marking branches as EOL but not deleting them


There was another discussion about which stable branches third-party CIs should report runs on:

  • There are 3 active stable branches at any point; currently, for 2023.2 development, they are 2023.1, Zed, and Yoga (Xena will move to EM)
  • We need to keep track of the Ubuntu and Python versions while doing this testing


Action Items:

Wednesday 29 March

recordings

Image Encryption - Current State

Patches in python-barbicanclient and castellan have merged, and a castellan release will be out soon.

From the cinder perspective, we will have patches for os-brick and cinder (for the create bootable volume operation).

The glance and cinder changes will be dependent upon the os-brick patch so the priority should be os-brick > glance and cinder.

The team feels tempest scenario tests covering the glance, cinder, and os-brick code paths would be good to have.

Action Items

FIPS jobs

We have Ubuntu and CentOS jobs proposed.


Since Ubuntu Focal (20.04) doesn't have a kernel supporting anything other than MD5, we can't use LVM + iSCSI target. We can probably try LVM + NVMe-TCP or LVM + NVMe-RDMA using Soft-RoCE. Also, FIPS is only enabled on Focal, so Jammy isn't qualified for FIPS testing as of now.
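
For reference, a job can verify that it is actually running on a FIPS-enabled kernel by checking the standard procfs flag; a small sketch, not taken from the proposed jobs:

    # On FIPS-enabled systems /proc/sys/crypto/fips_enabled contains "1".
    def fips_enabled():
        try:
            with open('/proc/sys/crypto/fips_enabled') as f:
                return f.read().strip() == '1'
        except FileNotFoundError:
            return False

    print('FIPS mode:', fips_enabled())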

Action Items

  • #action: review and merge proposed patches to start running jobs as non-voting but also keep an eye on failures

Operator Hour

Etherpad: https://etherpad.opendev.org/p/march2023-ptg-operator-hour-cinder

No operators joined the cinder operator hour. To make better use of the time, we discussed topics that had been proposed by an operator.

Are we ready for SQLAlchemy 2?

oslo.db 13.0.0 will be released during 2023.2 Bobcat development; it will remove sqlalchemy-migrate support and formally add support for sqlalchemy 2.x. For cinder to adapt to this change, we will need to merge the following patches.


There is also an effort to remove the abstraction in the DB code and make sqlalchemy our only DB ORM. The team agrees that we should move forward with this.
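
For context, sqlalchemy 2.x replaces the legacy Query API with explicit select() statements; a minimal sketch of the shape of the change (Volume, session, and volume_id stand in for cinder's actual DB objects):

    from sqlalchemy import select

    # Legacy 1.x style (removed in 2.x):
    #     session.query(Volume).filter_by(id=volume_id).first()

    # 2.x style: build an explicit select() and execute it on the session.
    stmt = select(Volume).where(Volume.id == volume_id)
    volume = session.execute(stmt).scalars().first()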


Action Items

Thursday 30 March

recordings

OpenStack Client update

We added missing commands to OSC in the 2023.1 Antelope release and achieved parity between cinderclient and the OpenStack client. The following are the changes we are planning for the 2023.2 Bobcat development cycle:

  • We will make OSC the default CLI and only add new commands to OSC, not cinderclient
    • We will still need to add python bindings to cinderclient
  • We will improve openstacksdk to add support for missing cinder operations (see the sketch below)
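
As an example of what the SDK parity work targets, a cinder operation goes through openstacksdk's block_storage proxy roughly like this (a sketch; the cloud name is an example):

    import openstack

    # Connect using a cloud defined in clouds.yaml.
    conn = openstack.connect(cloud='devstack')

    # List volumes through the block storage proxy; operations missing
    # from this proxy are what the parity work needs to add.
    for volume in conn.block_storage.volumes():
        print(volume.id, volume.status)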

Action Items

  • #action: go forward with the plan of working towards parity with SDK

Quotas

Partial work has been done, but unfortunately Gorka won't be able to continue working on it due to other priorities. Rajat has proposed to work on it, and an appropriate handover will be done to continue the work.

  • #action: Rajat to understand the current state and take handover from Gorka
  • #action: All cores to read the spec when it's finalized after the handover

Active/Active support with NFS

This should be doable but will require a lock to avoid two services working on the same resource.
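
Cinder already ships a tooz-based coordination helper that can provide such a lock; a minimal sketch of guarding a per-volume NFS operation (the method name and lock-name template are illustrative):

    from cinder import coordination

    class NfsDriverSketch:
        # The lock name is templated from the call arguments, so two
        # volume services acting on the same volume are serialized.
        @coordination.synchronized('nfs-{volume.id}')
        def _do_resize(self, volume, new_size):
            pass  # only one c-vol service runs this for a given volume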


Gorka has a series of posts written on Active-Active that should be helpful.


Testing:

  • A sanity test by running tempest with multiple volume services
  • More thorough testing should be done with Browbeat or Rally

Glance Cinder Cross Project

RBD deletion issues

When cinder and glance both use RBD as their backend and we create a bootable volume from an image, COW cloning is performed, which creates a dependency chain. This is also true for the clone-from-source-volume operation. This dependency causes problems when deleting the parent resource. The current work in cinder allows the deletion to happen by using RBD's trash functionality.

Currently there is a cinder patch in progress. We need similar changes in glance to allow deletion of parent images that have dependent volumes.
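
For reference, the trash functionality is exposed by the rbd Python bindings roughly as below (a sketch; the pool and image names are placeholders):

    import rados
    import rbd

    with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
        with cluster.open_ioctx('volumes') as ioctx:  # pool name is an example
            # Instead of failing the delete because clones depend on the
            # image, move it to the RBD trash; it can be reclaimed later
            # once the dependent children are gone.
            rbd.RBD().trash_move(ioctx, 'volume-<uuid>', 0)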


Action Items

  • #action: Eric to do a POC for glance RBD store and propose a spec accordingly

Glance-Cinder-Nova cross project

Glance Image Direct URL access

The work started on the glance side and the spec was merged; however, the implementation hasn't started.


The nova team requires a separate spec for the nova changes to handle their upgrade and backward compatibility scenarios.

Some requirements from nova team for nova side changes:


Action Items

  • #action: Repropose glance spec to 2023.2
  • #action: Propose a nova spec handling nova specific use cases
  • #action: Glance team to start working on implementation

Nova Cinder Cross Project

NFS encryption

This is an effort to enable encryption for the generic NFS driver. This feature will require changes on both the nova and cinder sides.

The nova team would require a blueprint to track the work. A spec wouldn't be required since there are no DB or API changes. It would be good to share as much code as possible with the nova provisioned disk encryption feature.

Action Items:

  • #action: Propose a nova blueprint
  • #action: Handle the upgrade concern about making sure we are not scheduling to an older compute (using a specific trait + a prefilter)
  • #action: For testing, cinder will enable encryption in their existing NFS job and nova could run it on our periodic jobs


Allow specifying a hardware model for a cinder volume on a per-volume basis

Currently nova allows us to select the disk model via the image using the hw_disk_bus image property, for example, hw_disk_bus=virtio or hw_disk_bus=sata. If we wanted to support this as a per-volume attribute, we can use volume metadata for it. The validation of the value will be done on the nova side.
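
A sketch of how the property could be set as volume metadata with python-cinderclient (the hw_disk_bus key mirrors the image property; nova honoring and validating it is the proposed work):

    from cinderclient import client

    # 'session' is an authenticated keystoneauth session (setup omitted).
    cinder = client.Client('3', session=session)

    volume = cinder.volumes.get('<volume-uuid>')
    # Mirror the image property as per-volume metadata; nova-side
    # validation and precedence handling are the proposed future work.
    cinder.volumes.set_metadata(volume, {'hw_disk_bus': 'scsi'})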

Action Items:

  • #action: The disk model can be set in volume metadata and nova would validate if the value is correct
  • #action: Implement a precedence order in nova to provide higher priority to volume metadata field than glance image metadata field

Friday 31 March

recordings

Release notes guidelines for SLURP/NON-SLURP cadence

We need to handle the release notes case for SLURP/non-SLURP releases. Brian has a documentation proposal up for the same.


Gorka also has a documentation patch for cinder-related changes in SLURP vs non-SLURP releases


Action Items

  • #action: Review documentation proposed by Brian and Gorka

Upload volume to image optimization for RBD

Currently the work is on hold, waiting for the service role to be available and for the RBD deletion fixes (which break the dependency chain) to be merged. We can also use the service role without keystone bootstrapping it and document that as a required prerequisite for this feature to work.


Action Items:

  • #action: Rajat to go through nova docs and add documentation regarding using this with service role and service token
  • #action: Go through the RBD patch to see if we need to include any custom changes to make RBD delete work for this

Cinder retype for migration, passing the new_volume_type_id to the drivers

The concern was regarding cinder not passing new_volume_type_id while calling the migrate volume functionality. The driver team wanted to replicate the generic migration flow, where they create a new volume on the new host and copy data from the old volume to the new volume. This doesn't seem like a reasonable approach for a driver, since migration by a driver is expected to be efficient. The driver can always rely on the generic migration by not implementing the migrate_volume method (see the sketch below).
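
For reference, the driver migration interface already allows this fallback; a minimal sketch of a driver declining the optimized path (the signature follows the cinder driver interface):

    class VendorDriverSketch:
        def migrate_volume(self, ctxt, volume, host):
            # Returning (False, None) tells cinder the driver could not
            # migrate the volume itself, so the generic flow (create a
            # new volume and copy the data) is used instead.
            return False, None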

Action Items

  • #action: work on the patch to allow drivers to return extra_specs properties that are OK for retyping (with migration) a volume