Meetings/InfraTeamMeeting

Revision as of 16:49, 30 March 2026 by Clark Boylan (talk | contribs) (Agenda for next meeting)

Weekly Project Infrastructure team meeting

The OpenDev Team holds public weekly meetings in #opendev-meeting on OFTC, Tuesdays at 1900 UTC. Everyone interested in infrastructure and process surrounding automated testing and deployment is encouraged to attend.

Please feel free to add agenda items (and your IRC nick in parentheses).

Agenda for next meeting

  • Actions from last meeting
  • Specs Review
  • Topics
    • Upgrading Old Servers (clarkb 20230627)
      • https://etherpad.opendev.org/p/opendev-server-upgrade-planning Central tracking document which may link to more host specific documents
      • Next on the list are graphite and backup servers
      • Can probably spin up new backup servers alongside the old ones, migrate the old volumes from the old servers to the new ones, and finally delete the old servers. Just need to double-check the borg version support matrix and what adding new backup servers will do to our backup cron job setup.
      • mnasiadka has been working to replace some of the older mirror nodes. Please look out for changes related to this effort.
      • Remember to use launch-node's --config-drive flag when booting new Noble nodes in Rax Classic
    • Dealing with web crawlers (clarkb 20251216)
      • Turns out mod_security is not great at handling DDoS scenarios due to the size of its database, which stores only IP addresses
      • Larger servers have helped static.opendev.org keep up with the demand. One server dedicated to everything but docs.openstack.org and a single server for docs.openstack.org
      • Can probably consolidate back to a single larger server, or to multiple servers of the same size that balance requests (via either haproxy or DNS round robin)
      • https://review.opendev.org/c/opendev/system-config/+/981932 Anubis for Lists
    • Deploying a Prometheus for Server Metrics (clarkb 20260331)
      • https://review.opendev.org/c/opendev/system-config/+/980840
      • This change and its child deploy prometheus with node exporter to collect server metrics
      • Napkin math says that a 1TB volume should get us about 60 days of metrics. mnasiadka also indicates that Prometheus doesn't handle longer term metrics super well
      • Ideally we would collect at least a year's worth of data. Can we make that happen with Prometheus?
      • Do we need to look at Prometheus adjacent tools like Mimir or Thanos?
        • Both of these solutions seem to tie into Prometheus, using Prometheus as the data collection system. They then store the data in a different system that can handle long-term storage more nimbly. For queries they speak PromQL and the Prometheus APIs, allowing you to point tools like Grafana at them as if they were Prometheus.
    • Upgrade Ansible to v9 (clarkb 20260310)
    • Gerrit Account Cleanups (clarkb 20260317)
      • Since the upgrade to Gerrit NoteDb we've had account inconsistencies that prevent us from pushing to the external ids ref/table directly.
      • clarkb did a bunch of work to get the number down from hundreds to about 33 consistency errors before stalling out.
      • The tail was the most difficult, as it wasn't clear what the most appropriate fix for each account would be
      • Years have passed since then, and those accounts are likely inactive and unused. We can rerun the Gerrit consistency check, feed the results back through our audit script, then decide whether we need to be careful with any of these accounts
      • Chances are we can simply disable them all and remove the conflicting external ids.
      • If we take good notes we can reconstruct the accounts as appropriate after the fact without Gerrit downtime should one of these users show up and wonder what happened.
    • Gerrit 3.12 and 3.13 Upgrade Planning (clarkb 20260310)
    • Python 3.14 Base Images (clarkb 20260331)
      • We are currently building Python 3.11 and 3.12 images. Rather than step through 3.13 to 3.14 we can skip 3.13 entirely and avoid some unnecessary work.
      • https://review.opendev.org/c/opendev/system-config/+/982180
      • The major drawback is that none of our current test nodes have python3.14 packages, so we have to rely on pyenv or wait for functional Ubuntu Resolute images
      • clarkb did testing with the speculative 3.14 images and lodgeit and they do seem to work
    • Ubuntu Resolute Test Nodes (clarkb 20260331)
    • OpenInfra PTG Prep (clarkb 20260331)
      • The next PTG is happening April 20-24.
      • We will want to put meetpad and jvb nodes in the emergency file prior to the event to prevent unwanted upgrade disruptions.
      • Is there any other prep work that we think should be done ahead of the event?
  • Open discussion
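
For the Prometheus retention question in the topics above, the napkin math can be extended with a rough sketch. It assumes storage grows linearly with retention time, which is an assumption rather than a measured property of our deployment, and it ignores any downsampling or compaction that tools like Mimir or Thanos might provide. The function names are hypothetical and exist only for this illustration.

```python
# Rough sizing sketch: scale the "1TB volume ~ 60 days of metrics" estimate.
# Assumption: disk usage grows linearly with retention (not measured).

def volume_tb_for_retention(days, baseline_tb=1.0, baseline_days=60.0):
    """Return the estimated volume size in TB needed to keep `days` of metrics."""
    return baseline_tb * days / baseline_days


def retention_days_for_volume(volume_tb, baseline_tb=1.0, baseline_days=60.0):
    """Return the estimated number of days of metrics a volume of `volume_tb` holds."""
    return baseline_days * volume_tb / baseline_tb


if __name__ == "__main__":
    # A year of data under the linear assumption.
    print(f"{volume_tb_for_retention(365):.1f} TB for 365 days")
```

Under that linear assumption a year of data would need roughly 6TB, which is one way to frame whether a long term storage backend is worth the extra moving parts.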

Upcoming Project Renames

(any additions should mention original->new full names and link to the corresponding project-config rename change in Gerrit) Changes should have their topic set to project-rename.

Previous meetings

Previous meetings, with their notes and logs, can be found at http://eavesdrop.openstack.org/meetings/infra/ and earlier at http://eavesdrop.openstack.org/meetings/ci/