
Introduction

The Eighth virtual PTG for the 2024.1 (Caracal) cycle of Cinder was conducted from Tuesday, 24th October, 2023 to Friday, 27th October, 2023, for 4 hours each day (1300-1700 UTC). This page provides a summary of all the topics discussed throughout the PTG.

Cinder Bobcat Virtual PTG 29 March 2023


This document aims to give a summary of each session. More information is available on the cinder 2024.1 Caracal PTG etherpad.


The sessions were recorded, so to get the full details of any discussion, you can watch/listen to the recording. Links to the recordings for each day are provided below the respective day's heading.


Tuesday 24 October

recordings

Bobcat Retrospective

We categorized the discussion into the following subsections:

  • What went well?
    • Active contribution across different companies
    • We had 2 Outreachy interns in the summer internship round, who made great contributions to the api-ref sample tests
    • Releases happened on time
  • What went badly?
    • CI failure rate impacting productivity
    • Sofia leaving the team really affected review bandwidth
  • What should we continue doing?
    • Sponsoring outreachy interns
      • proposal accepted with at least one applicant
  • What should we stop doing?
    • Lack of structure around the review request section in cinder meetings
      • Too many patches discourage the reviewers from taking a look
      • Authors should add an explanation if the patch is complicated, and ask if they have any doubts
      • Authors should add only the patches relevant to the current milestone instead of adding all possible patches
      • Authors should be active in reviews, since the core team prioritizes patches from contributors who review actively
      • #action: whoami-rajat to follow up on this in a cinder meeting

Gate Issues

Continuing the discussion from the 2023.2 Bobcat midcycle, we are still seeing gate issues consisting of OOMs and timeouts. Looking at a sample gate job failure, we decided on the following points:

  • Increase the system memory of the VM if possible; 8GB is not enough for tempest tests
  • Increase swap space (make it the same size as RAM for a one-to-one mapping)
  • Change cinder to use file locks for coordination in order to get rid of etcd (see the configuration sketch after this list)
  • Reduce number of processes
    • We see a pattern of multiple services each running multiple processes:
      • neutron-server: 5
      • nova-conductor: 6
      • nova-scheduler: 2
      • swift: 3 for each of its services
  • Reduce concurrency to 2 for testing purposes to see how many VMs we end up running
  • #action: rosmaita to propose a patch for a few straightforward tasks like increasing swap space
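
A minimal configuration sketch for the file-lock coordination point above. This is only an illustration, not a settled devstack change; it assumes the standard backend_url option in the [coordination] section of cinder.conf and cinder's default state directory:

  [coordination]
  # point tooz at a file-based lock backend instead of etcd
  backend_url = file://$state_path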

Backup/Restore performance

There is a bug reported against the S3 and swift backup drivers complaining that they are very slow. Launchpad: https://bugs.launchpad.net/cinder/+bug/1918119

The general discussion was around the following points:

  • If there are issues in backup/restore, report a bug; it helps the team stay aware of all potential improvements
  • Using a stream instead of chunks (see the sketch after this list)
    • The basic infrastructure in the backup metadata should be there, as we can store the version
    • https://github.com/openstack/cinder/blob/04e11c1f773b40b16b95a8638c473972d3b42886/cinder/backup/chunkeddriver.py#L107
  • We don't have a backup or restore progress status anywhere
  • We can work on something to get data about long running operations and their current status.
  • #action: zaitcev to investigate what we have now and propose a spec for observability, in particular for restores (we already have percentage notifications, but no current percentage is exposed)
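
To illustrate the stream-vs-chunks point above, here is a rough Python sketch. It is not cinder's chunked backup driver; store.put_object, the function names, and the chunk size are hypothetical stand-ins for whatever swift/S3 client call the driver would actually use:

  CHUNK_SIZE = 64 * 1024 * 1024  # hypothetical chunk size

  def backup_chunked(volume_file, store, prefix):
      """Current pattern: split the volume into chunks, one upload per chunk."""
      index = 0
      while True:
          chunk = volume_file.read(CHUNK_SIZE)
          if not chunk:
              break
          # Every chunk becomes a separate object, so each one pays the
          # per-request overhead (auth, HTTP round trip, metadata update).
          store.put_object("%s-%05d" % (prefix, index), chunk)
          index += 1

  def backup_streamed(volume_file, store, name):
      """Discussed alternative: hand the store a file-like object and let it
      stream the whole volume in one request, avoiding per-chunk overhead."""
      store.put_object(name, volume_file)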


A few of the specs related to backup were also mentioned:

  • Encrypted backups
    • https://review.opendev.org/c/openstack/cinder-specs/+/862601
  • Introduce a new backup status field
    • https://review.opendev.org/c/openstack/cinder-specs/+/868761
  • Multiple cinder backup backends
    • https://review.opendev.org/c/openstack/cinder-specs/+/712301