
Difference between revisions of "Meetings/InfraTeamMeeting"

m (Agenda for next meeting)
(Agenda for next meeting)
 
(252 intermediate revisions by 13 users not shown)
Line 10: Line 10:
  
 
* Announcements
 
* Announcements
** Gerrit User Summit happening December 2&3 virtually.
+
** OpenStack Release next week then PTG the week after
** clarkb out next week. Should we skip the meeting November 23?
+
** Put your PTG agenda items on the etherpad: https://etherpad.opendev.org/p/apr2024-ptg-opendev
  
 
* Actions from last meeting
 
* Actions from last meeting
Line 18: Line 18:
  
 
* Topics
 
* Topics
** Improving OpenDev's CD throughput (clarkb 20211116)
+
** Upgrading Bionic servers to Focal/Jammy (clarkb 20230627)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
+
*** https://etherpad.opendev.org/p/opendev-bionic-server-upgrades
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
+
*** https://review.opendev.org/q/topic:jitsi_meet-jammy-update
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
+
*** Started looking at the wiki; there are rough notes at: https://etherpad.opendev.org/p/opendev-bionic-server-upgrades#L58
***** Example 2: Combine letsencrypt and nameserver jobs
+
** MariaDB Upgrades (clarkb 20240220)
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
+
*** Relying on the container image MARIADB_AUTO_UPGRADE flag
**** Suggest we document the known job dependencies in a human-readable format, then encode this into zuul, then we can switch to parallel runs.
+
*** Etherpad, Gitea, Gerrit, and Mailman could use upgrades.
**** https://review.opendev.org/c/opendev/system-config/+/807672
+
*** https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.11
***** should list dependencies for all jobs
+
** AFS Mirror cleanups (clarkb 20240220)
***** zuul doesn't trigger on this? Not sure of the best approach to make it mergeable
+
*** Ubuntu Xenial is next but currently busy with PTG, Release, and other tasks.
**** https://review.opendev.org/c/opendev/base-jobs/+/807807
+
*** Can follow up with webserver log processing to determine which other mirrors may be dead.
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
+
** Rebuilding Gerrit Images (clarkb 20240312)
***** this change splits this into separate jobs. However, production remains the same as both are called.
+
*** Gerrit 3.9.2 has finally been released.
**** https://review.opendev.org/c/opendev/system-config/+/807808
+
*** https://review.opendev.org/c/opendev/system-config/+/912470 Update our 3.9 image to 3.9.2
***** this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
+
**** This will also rebuild our 3.8.4 image, so we should try to restart prod gerrit on the new 3.8.4 image when available.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
+
*** Sounds like there are a number of bugfixes that a rebuild will get us. May be worth doing this just after the openstack release completes?
** Gerrit Account cleanups (clarkb 20211116)
+
** Review02 had an oops last night (clarkb 20240326)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
+
*** Found the server was shut down. After giving it a few minutes to potentially resolve itself (mostly worried about cloud action), clarkb proceeded to manually start the instance and then start the containers.
** Zuul multi scheduler setup (clarkb 20211116)
+
*** mnaser reports it may have been an OOM event on the hosting side.
*** Zuul is currently running with two schedulers (zuul01.o.o and zuul02.o.o with zuul02.o.o being "primary")
+
** Rackspace MFA Requirement (clarkb 20240312)
*** Did first rolling restart of schedulers over the weekend.
+
*** MFA is enabled. Enforcement day is today. Please look out for any issues.
*** Zuul-web should return consistent results now as it talks to ZooKeeper directly.
+
** Project Renames (clarkb 20240227)
** User management on our systems (clarkb 20211116)
+
*** https://review.opendev.org/c/opendev/system-config/+/911622 Move gerrit replication queue aside during project renames.
*** Give gerritbot and matrix-gerritbot a shared user: https://review.opendev.org/c/opendev/system-config/+/816769/
+
*** Penciled in for April 19, 2024; submit your rename requests now.
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
+
** Nodepool image delete after upload (clarkb 20240319)
** Caching openstack/openstack on our DIB images (clarkb 20211116)
+
*** Nodepool now has the ability to delete on-disk files for images after they are uploaded. We could potentially keep only small qcow2s using this functionality to save disk space.
*** There are semi-frequent errors when updating the DIB cache for openstack/openstack
 
*** Seems related to verifying or updating submodule content.
 
*** Should we simply stop caching this repo entirely? It isn't really used for much.
 
** UbuntuOne/Launchpad two-factor OpenID authentication availability (fungi 20211130)
 
*** http://lists.opendev.org/pipermail/service-discuss/2021-November/000298.html
 
** Adding a lists.openinfra.dev mailman site (fungi 20211130)
 
*** https://review.opendev.org/818826
 
** Proxying and caching Ansible Galaxy in our providers (fungi 20211130)
 
*** https://review.opendev.org/818787
 
  
 
* Open discussion
 
* Open discussion
Line 58: Line 49:
 
== Upcoming Project Renames ==
 
== Upcoming Project Renames ==
 
(any additions should mention original->new full names and link to the corresponding project-config rename change in Gerrit)
 
(any additions should mention original->new full names and link to the corresponding project-config rename change in Gerrit)
 +
Changes should have their topic set to project-rename.
  
* Rename foo/example -> bar/example: https://review.opendev.org/123456
+
* Rename vexxhost/ansible-role-frrouting -> openstack/ansible-role-frrouting: https://review.opendev.org/c/openstack/project-config/+/910018
  
 
== Previous meetings ==
 
== Previous meetings ==
 
Previous meetings, with their notes and logs, can be found at http://eavesdrop.openstack.org/meetings/infra/ and earlier at http://eavesdrop.openstack.org/meetings/ci/
 
Previous meetings, with their notes and logs, can be found at http://eavesdrop.openstack.org/meetings/infra/ and earlier at http://eavesdrop.openstack.org/meetings/ci/

Latest revision as of 16:08, 26 March 2024

Weekly Project Infrastructure team meeting

The OpenDev Team holds public weekly meetings in #opendev-meeting on OFTC, Tuesdays at 1900 UTC. Everyone interested in infrastructure and process surrounding automated testing and deployment is encouraged to attend.

Please feel free to add agenda items (and your IRC nick in parentheses).

Agenda for next meeting

  • Actions from last meeting
  • Specs Review
  • Topics
    • Upgrading Bionic servers to Focal/Jammy (clarkb 20230627)
    • MariaDB Upgrades (clarkb 20240220)
    • AFS Mirror cleanups (clarkb 20240220)
      • Ubuntu Xenial is next but currently busy with PTG, Release, and other tasks.
      • Can follow up with webserver log processing to determine which other mirrors may be dead.
    • Rebuilding Gerrit Images (clarkb 20240312)
      • Gerrit 3.9.2 has finally been released.
      • https://review.opendev.org/c/opendev/system-config/+/912470 Update our 3.9 image to 3.9.2
        • This will also rebuild our 3.8.4 image, so we should try to restart prod gerrit on the new 3.8.4 image when available.
      • Sounds like there are a number of bugfixes that a rebuild will get us. May be worth doing this just after the openstack release completes?
    • Review02 had an oops last night (clarkb 20240326)
      • Found the server was shut down. After giving it a few minutes to potentially resolve itself (mostly worried about cloud action), clarkb proceeded to manually start the instance and then start the containers.
      • mnaser reports it may have been an OOM event on the hosting side.
    • Rackspace MFA Requirement (clarkb 20240312)
      • MFA is enabled. Enforcement day is today. Please look out for any issues.
    • Project Renames (clarkb 20240227)
    • Nodepool image delete after upload (clarkb 20240319)
      • Nodepool now has the ability to delete on-disk files for images after they are uploaded. We could potentially keep only small qcow2s using this functionality to save disk space.
  • Open discussion

Upcoming Project Renames

(any additions should mention original->new full names and link to the corresponding project-config rename change in Gerrit) Changes should have their topic set to project-rename.

Previous meetings

Previous meetings, with their notes and logs, can be found at http://eavesdrop.openstack.org/meetings/infra/ and earlier at http://eavesdrop.openstack.org/meetings/ci/