Difference between revisions of "Meetings/InfraTeamMeeting"
Clark Boylan (talk | contribs) (→Agenda for next meeting)
Revision as of 20:31, 15 November 2021
Weekly Project Infrastructure team meeting
The OpenDev Team holds public weekly meetings in #opendev-meeting on OFTC, Tuesdays at 1900 UTC. Everyone interested in infrastructure and process surrounding automated testing and deployment is encouraged to attend.
Please feel free to add agenda items (and your IRC nick in parentheses).
Agenda for next meeting
- Announcements
- Gerrit User Summit happening December 2&3 virtually.
- clarkb out next week. Should we skip the meeting November 23?
- Actions from last meeting
- Specs Review
- Topics
- Improving OpenDev's CD throughput (clarkb 20211116)
- We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
- Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
- Example 1: Combine service-gitea-lb and service-gitea jobs.
- Example 2: Combine letsencrypt and nameserver jobs
- Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
- Suggest we document the known job dependencies in a human readable format, then encode this into zuul, then we can switch to parallel runs.
- https://review.opendev.org/c/opendev/system-config/+/807672
- should list dependencies for all jobs
- Zuul doesn't trigger on this? Not sure of the best approach to make it mergeable
- https://review.opendev.org/c/opendev/base-jobs/+/807807
- currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
- this change splits those steps into separate jobs. However, production remains the same, as both are called.
- https://review.opendev.org/c/opendev/system-config/+/807808
- this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
- this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
- Gerrit Account cleanups (clarkb 20211116)
- 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
- Zuul multi scheduler setup (clarkb 20211116)
- Zuul is currently running with two schedulers (zuul01.o.o and zuul02.o.o with zuul02.o.o being "primary")
- Did first rolling restart of schedulers over the weekend.
- Zuul-web should return consistent results now as it talks to ZooKeeper directly.
- User management on our systems (clarkb 20211116)
- Be explicit about uid/gid ranges: https://review.opendev.org/c/opendev/system-config/+/816869/
- 0-999 system, 1000-1999 unallocated, 2000-2999 for infra-root users, 3000-9999 host level users, 10k - 64k container users that need uids on the host as well for bind mounts.
- Give gerritbot and matrix-gerritbot a shared user: https://review.opendev.org/c/opendev/system-config/+/816769/
- Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
- Caching openstack/openstack on our DIB images (clarkb 20211116)
- There are semi-frequent errors when updating the DIB cache for openstack/openstack
- Seems related to verifying or updating submodule content.
- One theory is that we replicate openstack/openstack's submodule updates before we push the new refs to the other repos. If DIB fetches during that window, it hits an error.
- Should we simply stop caching this repo entirely? It isn't really used for much.
- Open discussion
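As a rough illustration of the CD throughput topic above: once job dependencies are written down, they can be grouped into "waves" of jobs that are safe to run in parallel. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The dependency map and the `clone-system-config` job name are hypothetical, chosen only for illustration, and are not taken from the actual Zuul configuration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: job name -> set of jobs that must finish first.
# "clone-system-config" is an assumed name for the base job that clones
# system-config; the edges below are illustrative, not the real job graph.
DEPENDENCIES = {
    "clone-system-config": set(),
    "letsencrypt": {"clone-system-config"},
    "service-gitea-lb": {"clone-system-config"},
    "service-gitea": {"clone-system-config", "letsencrypt"},
}


def parallel_waves(deps):
    """Group jobs into waves; all jobs within one wave can run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    waves = []
    while sorter.is_active():
        # Jobs whose prerequisites have all been marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

With this assumed map, the base job runs alone first, and everything that depends only on it can then start together, which matches the "this job must run first, but then all other jobs can run in parallel" note in the agenda.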
Upcoming Project Renames
(any additions should mention original->new full names and link to the corresponding project-config rename change in Gerrit)
- Rename foo/example -> bar/example: https://review.opendev.org/123456
Previous meetings
Previous meetings, with their notes and logs, can be found at http://eavesdrop.openstack.org/meetings/infra/ and earlier at http://eavesdrop.openstack.org/meetings/ci/