
Infrastructure Status

  • 2017-10-18 14:56:18 UTC Gerrit account 8944 set to inactive to handle a duplicate account issue
  • 2017-10-18 04:45:12 UTC review.o.o hard rebooted due to failure during live migration (rax ticket: 171018-ord-0000074). manually restarted gerrit after boot, things seem ok now
  • 2017-10-18 00:33:55 UTC due to unscheduled restart of zuulv3.o.o you will need to 'recheck' your jobs that were last running. Sorry for the inconvenience.
  • 2017-10-16 15:21:53 UTC elasticsearch cluster is now green after triggering index curator early to clear out old indexes "lost" on es07
  • 2017-10-16 03:05:41 UTC elasticsearch07.o.o rebooted & elasticsearch started. data was migrated from SSD storage and "main" vg contains only one block device now
  • 2017-10-15 22:06:10 UTC Zuul v3 rollout maintenance is underway, scheduled to conclude by 23:00 UTC: http://lists.openstack.org/pipermail/openstack-dev/2017-October/123618.html
  • 2017-10-15 21:20:10 UTC Zuul v3 rollout maintenance begins at 22:00 UTC (roughly 45 minutes from now): http://lists.openstack.org/pipermail/openstack-dev/2017-October/123618.html
  • 2017-10-12 23:06:18 UTC Workarounds are in place for libcurl and similar dependency errors due to stale ubuntu mirroring, and for POST_FAILURE results stemming from runaway inode utilization on the logs site; feel free to recheck failing changes for either of these problems now
  • 2017-10-12 16:04:42 UTC removed mirror.npm volume from afs
  • 2017-10-12 14:57:16 UTC Job log uploads are failing due to lack of inodes. Jobs also fail due to mismatches in gnutls packages. Workarounds for both in progress with proper fixes to follow.
  • 2017-10-11 17:26:50 UTC moved Gerrit account 27031's openid to account 21561 and marked 27031 inactive
  • 2017-10-11 13:07:12 UTC Due to unrelated emergencies, the Zuul v3 rollout has not started yet; stay tuned for further updates
  • 2017-10-11 11:13:10 UTC deleted the errant review/andreas_jaeger/zuulv3-unbound branch from the openstack-infra/project-config repository (formerly at commit 2e8ae4da5d422df4de0b9325bd9c54e2172f79a0)
  • 2017-10-11 10:10:02 UTC The CI system will be offline starting at 11:00 UTC (in just under an hour) for Zuul v3 rollout: http://lists.openstack.org/pipermail/openstack-dev/2017-October/123337.html
  • 2017-10-11 07:46:41 UTC Lots of RETRY_LIMIT errors due to unbound usage with Zuul v3; we reverted the change. Recheck your changes.
  • 2017-10-10 01:43:02 UTC manually rotated all logs on zuulv3.openstack.org as a stop-gap to prevent a full rootfs later when scheduled log rotation kicks in; an additional 14GiB were freed as a result
  • 2017-10-10 00:43:23 UTC restart of *gerrit* complete
  • 2017-10-10 00:39:01 UTC restarting zuul after a prolonged period of high GC activity that is causing 502 errors
  • 2017-10-09 20:53:36 UTC cleared all old workspaces on signing01.ci to deal with those which had cached git remotes to some no-longer-existing zuul v2 mergers
  • 2017-10-05 00:51:41 UTC updated openids in the storyboard.openstack.org database from login.launchpad.net to login.ubuntu.com
  • 2017-10-04 06:31:26 UTC The special infra pipelines in zuul v3 have disappeared
  • 2017-10-03 03:00:20 UTC zuulv3 restarted with 508786 508787 508793 509014 509040 508955 manually applied; should fix branch matchers, use *slightly* less memory, and fix the 'base job not defined' error
  • 2017-10-02 12:50:51 UTC Restarted nodepool-launcher on nl01 and nl02 to fix zookeeper connection
  • 2017-10-02 12:45:00 UTC ran `sudo -u zookeeper ./zkCleanup.sh /var/lib/zookeeper 3` in /usr/share/zookeeper/bin on nodepool.openstack.org to free up 22GiB of space for its / filesystem
  • 2017-09-28 22:41:03 UTC zuul.openstack.org has been added to the emergency disable list so that a temporary redirect to zuulv3 can be installed by hand
  • 2017-09-28 14:44:03 UTC The infra team is now taking Zuul v2 offline and bringing Zuul v3 online. Please see https://docs.openstack.org/infra/manual/zuulv3.html for more information, and ask us in #openstack-infra if you have any questions.
  • 2017-09-26 23:40:51 UTC project-config is unable to merge changes due to problems found during zuul v3 migration. for the time being, if any emergency changes are needed (eg, nodepool config), please discuss in #openstack-infra and force-merge them.
  • 2017-09-26 18:25:58 UTC The infra team is continuing work to bring Zuul v3 online; expect service disruptions and please see https://docs.openstack.org/infra/manual/zuulv3.html for more information.
  • 2017-09-25 23:37:33 UTC project-config is frozen until further notice for the zuul v3 transition; please don't approve any changes without discussion with folks familiar with the migration in #openstack-infra
  • 2017-09-25 20:52:05 UTC The infra team is bringing Zuul v3 online; expect service disruptions and please see https://docs.openstack.org/infra/manual/zuulv3.html for more information.
  • 2017-09-25 15:50:39 UTC deleted all workspaces from release.slave.openstack.org to deal with changes to zuul v2 mergers
  • 2017-09-22 21:33:40 UTC jeepyb and gerritlib fixes for adding the project creator to new groups on Gerrit project creation are in the process of landing. Please double check group membership after the next project creation.
  • 2017-09-22 19:12:01 UTC /vicepa filesystem on afs01.ord.openstack.org has been repaired and vos release of the docs and docs.dev volumes has resumed at its normal frequency
  • 2017-09-22 17:39:22 UTC When seeding initial group members in Gerrit, remove the openstack project creator account until jeepyb is updated to do so automatically
  • 2017-09-22 11:06:09 UTC no content is currently pushed to docs.openstack.org - post jobs run successfully but docs.o.o is not updated
  • 2017-09-21 19:23:16 UTC OpenIDs for the Gerrit service have been restored from a recent backup and the service is running again; before/after table states are being analyzed now to identify any remaining cleanup needed for changes made to accounts today
  • 2017-09-21 18:25:35 UTC The Gerrit service on review.openstack.org is being taken offline briefly to perform database repair work but should be back up shortly
  • 2017-09-21 18:19:03 UTC Gerrit OpenIDs have been accidentally overwritten and are in the process of being restored
  • 2017-09-21 17:54:32 UTC nl01.o.o and nl02.o.o are both back online with site-specific nodepool.yaml files.
  • 2017-09-21 14:08:07 UTC nodepool.o.o removed from emergency file, ovh-bhs1 came back online at 03:45z.
  • 2017-09-21 13:39:00 UTC Gerrit account 8971 for "Fuel CI" has been disabled due to excessive failure comments
  • 2017-09-21 02:50:04 UTC OVH-BHS1 mirror has disappeared unexpectedly. did not respond to hard reboot. nodepool.o.o in emergency file and region max-servers set to 0
  • 2017-09-20 23:17:13 UTC Please don't merge any new project creation changes until mordred gives the go ahead. We have new puppet problems on the git backends and there are staged jeepyb changes we want to watch before opening the flood gates
  • 2017-09-20 20:21:59 UTC nb03.o.o / nb04.o.o added to emergency file
  • 2017-09-19 23:42:19 UTC Gerrit is once again part of normal puppet config management. Problems with Gerrit gitweb links and Zuul post jobs have been addressed. We currently cannot create new gerrit projects (fixes in progress) and email sending is slow (being debugged).
  • 2017-09-19 22:34:37 UTC Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart
  • 2017-09-19 20:28:23 UTC Zuul and Gerrit are being restarted to address issues discovered with the Gerrit 2.13 upgrade. review.openstack.org will be inaccessible for a few minutes while we make these changes. Currently running jobs will be restarted for you once Zuul and Gerrit are running again.
  • 2017-09-19 07:25:16 UTC Post jobs are not executed currently, do not tag any releases
  • 2017-09-19 07:13:26 UTC Zuul is not running any post jobs
  • 2017-09-19 02:42:08 UTC Gerrit is being restarted to feed its insatiable memory appetite
  • 2017-09-19 00:10:07 UTC please avoid merging new project creation changes until after we have the git backends puppeting properly
  • 2017-09-18 23:48:12 UTC review.openstack.org Gerrit 2.13 upgrade is functionally complete. The Infra team will be cleaning up bookkeeping items over the next couple days. If you have any questions please let us know
  • 2017-09-18 23:34:42 UTC review.openstack.org added to emergency file until git.o.o puppet is fixed and we can supervise a puppet run on review.o.o
  • 2017-09-18 16:40:08 UTC The Gerrit service at https://review.openstack.org/ is offline, upgrading to 2.13, for an indeterminate period of time hopefully not to exceed 23:59 UTC today: http://lists.openstack.org/pipermail/openstack-dev/2017-August/120533.html
  • 2017-09-18 15:04:04 UTC The Gerrit service at https://review.openstack.org/ is offline, upgrading to 2.13, for an indeterminate period of time hopefully not to exceed 23:59 UTC today: http://lists.openstack.org/pipermail/openstack-dev/2017-August/120533.html
  • 2017-09-18 14:33:25 UTC Gerrit will be offline for the upgrade to 2.13 starting at 15:00 UTC (in roughly 30 minutes) and is expected to probably be down/unusable for 8+ hours while an offline reindex is performed: http://lists.openstack.org/pipermail/openstack-dev/2017-August/120533.html
  • 2017-09-18 13:48:14 UTC accountPatchReviewDb database created and gerrit2 account granted access in Review-MySQL trove instance, in preparation for upcoming gerrit upgrade maintenance
  • 2017-09-18 13:38:33 UTC updatepuppetmaster cron job on puppetmaster.openstack.org has been disabled in preparation for the upcoming gerrit upgrade maintenance
  • 2017-09-18 13:38:31 UTC Gerrit will be offline for the upgrade to 2.13 starting at 15:00 UTC (in roughly 1.5 hours) and is expected to probably be down/unusable for 8+ hours while an offline reindex is performed: http://lists.openstack.org/pipermail/openstack-dev/2017-August/120533.html
  • 2017-09-18 12:07:34 UTC Gerrit will be offline for the upgrade to 2.13 starting at 15:00 UTC (in roughly 3 hours) and is expected to probably be down/unusable for 8+ hours while an offline reindex is performed: http://lists.openstack.org/pipermail/openstack-dev/2017-August/120533.html
  • 2017-09-17 15:30:17 UTC Zuul has been fixed, you can approve changes again.
  • 2017-09-17 05:52:25 UTC Zuul is currently not moving any changes into the gate queue. Please wait to approve changes until this is fixed.
  • 2017-09-17 01:06:37 UTC Zuul has been restarted to pick up a bug fix in prep for Gerrit upgrade. Changes have been reenqueued for you.
  • 2017-09-16 14:21:27 UTC OpenStack CI is fixed and fully operational again, feel free to "recheck" your jobs
  • 2017-09-16 09:12:28 UTC OpenStack CI is currently not recording any votes in gerrit. Do not recheck your changes until this is fixed.
  • 2017-09-14 23:12:24 UTC Artifact signing key for Pike has been retired; key for Queens is now in production
  • 2017-09-13 23:05:46 UTC CentOS 7.4 point release today has resulted in some mirror disruption, repair underway; expect jobs on centos7 nodes to potentially fail for a few hours longer
  • 2017-09-13 14:36:45 UTC increased ovh quotas to bhs1:80 gra1:50 as we haven't had launch errors recently according to grafana
  • 2017-09-11 22:50:48 UTC zm05.o.o - zm08.o.o now online running on ubuntu xenial
  • 2017-09-09 00:17:24 UTC nodepool.o.o added to ansible emergency file so that we can hand tune the max-servers in ovh. Using our previous numbers results in lots of 500 errors from the clouds
  • 2017-09-08 16:08:20 UTC New 1TB cinder volume attached to Rax ORD backup server and backups filesystem extended to include that space. This was done in response to a full filesystem. Backups should begin functioning again on the next pulse.
  • 2017-09-08 13:48:12 UTC nodepool issue related to bad images has been resolved, builds should be coming back online soon. Restarted gerrit due to reasons. Happy Friday.
  • 2017-09-08 10:48:41 UTC Our CI systems experienced a hiccup and no new jobs are being started. Please stay tuned and wait until this is resolved.
  • 2017-09-05 22:47:53 UTC logstash-worker16.o.o to logstash-worker20.o.o deleted in rackspace
  • 2017-09-04 19:18:46 UTC ubuntu-xenial nodepool-launcher (nl02.o.o) online
  • 2017-09-04 19:17:35 UTC logstash-worker16.o.o to logstash-worker20.o.o services stopped
  • 2017-08-29 18:00:17 UTC /etc/hosts on mirror.regionone.infracloud-vanilla.openstack.org has buildlogs.centos.org pinned to 38.110.33.4. This is temporary, to see if round robin DNS is our issue when we proxy to buildlogs.centos.org
  • 2017-08-29 16:20:39 UTC replaced myself with clarkb at https://review.openstack.org/#/admin/groups/infra-ptl
  • 2017-08-28 12:11:46 UTC restarted ptgbot service on eavesdrop at 11:29 utc; was disconnected from freenode 2017-08-26 02:29 utc due to an irc ping timeout
  • 2017-08-24 16:00:19 UTC hound service on codesearch.o.o stopped / started to pick up new projects for indexing
  • 2017-08-23 23:17:52 UTC infracloud-vanilla is offline due to the keystone certificate expiring. this has also broken puppet-run-all on puppetmaster.
  • 2017-08-22 07:43:46 UTC Gerrit has been restarted successfully
  • 2017-08-22 07:37:59 UTC Gerrit is going to be restarted due to slow performance
  • 2017-08-17 16:10:10 UTC deleted mirror.mtl01.internap.openstack.org (internap -> inap rename)
  • 2017-08-17 04:21:46 UTC all RAX mirror hosts (iad, ord and dfw) migrated to new Xenial based hosts
  • 2017-08-16 23:43:32 UTC renamed nodepool internap provider to inap. new mirror server in use.
  • 2017-08-16 19:55:08 UTC zuul v3 executors ze02, ze03, ze04 are online
  • 2017-08-16 19:54:55 UTC zuul v2 launchers zl07, zl08, zl09 have been deleted due to reduced cloud capacity and to make way for zuul v3 executors
  • 2017-08-16 13:01:36 UTC trove configuration "sanity" created in rax dfw for mysql 5.7, setting our usual default overrides (wait_timeout=28800, character_set_server=utf8, collation_server=utf8_bin)
  • 2017-08-15 20:35:51 UTC created auto hold for gate-tripleo-ci-centos-7-containers-multinode to debug docker.io issues with reverse proxy
  • 2017-08-15 18:42:16 UTC mirror.sto2.citycloud.o.o DNS updated to 46.254.11.19 TTL 60
  • 2017-08-14 15:29:31 UTC mirror.kna1.citycloud.openstack.org DNS entry updated to 91.123.202.15
  • 2017-08-11 20:39:46 UTC created mirror.mtl01.inap.openstack.org to replace mirror.mtl01.internap.openstack.org (internap -> inap rename)
  • 2017-08-11 19:20:14 UTC The apps.openstack.org server has been stopped, snapshotted one last time, and deleted.
  • 2017-08-11 05:00:49 UTC restarted mirror.ord.rax.openstack.org per investigation in https://bugs.launchpad.net/openstack-gate/+bug/1708707 which suggested apache segfaults causing pypi download failures. Will monitor
  • 2017-08-10 23:47:46 UTC removed 8.8.8.8 dns servers from both infracloud-chocolate and infracloud-vanilla provider-subnet-infracloud subnet
  • 2017-08-10 20:03:12 UTC Image builds manually queued for centos-7, debian-jessie, fedora-25, fedora-26, opensuse-423, ubuntu-trusty and ubuntu-xenial to use latest glean (1.9.2)
  • 2017-08-10 19:50:10 UTC glean 1.9.2 released to properly support vfat configdrive labels
  • 2017-08-10 12:27:50 UTC mirror.lon1.citycloud.openstack.org migrated to a new compute node by Kim from citycloud. appears up. nodepool conf restored & nodepool.o.o taken out of emergency file
  • 2017-08-10 12:13:12 UTC nodepool in emergency file and citycloud-lon1 region commented out while we investigate issues with mirror
  • 2017-08-09 20:18:19 UTC OVH ticket 8344470555 has been opened to track voucher reinstatement/refresh
  • 2017-08-08 00:07:46 UTC Gerrit on review.openstack.org restarted just now, and is no longer using contact store functionality or configuration options
  • 2017-08-07 23:34:49 UTC The Gerrit service on review.openstack.org will be offline momentarily at 00:00 utc for a quick reconfiguration-related restart
  • 2017-08-07 16:38:16 UTC temporarily blocked 59.108.63.126 in iptables on static.openstack.org due to a denial of service condition involving tarballs.o.o/kolla/images/centos-source-registry-ocata.tar.gz
  • 2017-08-04 20:37:45 UTC Gerrit is being restarted to pick up CSS changes and should be back momentarily
  • 2017-08-02 20:00:10 UTC OSIC environment is active in Nodepool and running jobs normally once more
  • 2017-08-02 17:29:57 UTC infracloud-vanilla back online
  • 2017-08-02 14:18:29 UTC mirror.regionone.infracloud-vanilla.openstack.org DNS updated to 15.184.65.187
  • 2017-08-02 13:59:00 UTC We have disabled infracloud-vanilla because the compute host running mirror.regionone.infracloud-vanilla.o.o is offline. Please recheck your failed jobs to schedule them to another cloud.
  • 2017-08-01 23:49:09 UTC osic nodes have been removed from nodepool due to a problem with the mirror host beginning around 22:20 UTC. please recheck any jobs with failures installing packages.
  • 2017-08-01 22:16:19 UTC pypi mirror manually updated and released
  • 2017-08-01 21:28:46 UTC pypi mirrors have not updated since 2:15 UTC due to issue with pypi.python.org. reported issue, since corrected. mirror updates now in progress.
  • 2017-08-01 08:09:21 UTC Yolanda has started the nodepool-launcher process because it had been stopped for more than an hour
  • 2017-07-31 07:39:25 UTC Yolanda had to restart nodepool-launcher because VMs were not being spun up and the process had looked inactive for the last 90 minutes
  • 2017-07-28 17:14:32 UTC The Gerrit service on review.openstack.org is being taken offline for roughly 5 minutes to perform a database backup and reconfiguration
  • 2017-07-23 23:23:03 UTC Job triggering events between 21:00 and 23:15 UTC were lost, and any patch sets uploaded or approved during that timeframe will need rechecking or reapproval before their jobs will run
  • 2017-07-22 00:27:10 UTC restarted logstash and jenkins-log-worker-{A,B,C,D} services on all logstash-workerNN servers to get logs processing again
  • 2017-07-22 00:26:02 UTC manually expired old elasticsearch shards to get the cluster back into a sane state
  • 2017-07-21 19:24:23 UTC docs.o.o is up again; https://review.openstack.org/486196 fixes it, but it had to be applied manually since jobs depend on accessing docs.o.o
  • 2017-07-21 18:43:07 UTC kibana on logstash.o.o is currently missing entries past 21:25 utc yesterday
  • 2017-07-21 18:42:20 UTC elasticsearch02 has been hard-rebooted via nova after it hung at roughly 21:25 utc yesterday; elasticsearch service on elasticsearch05 also had to be manually started following a spontaneous reboot from 2017-07-14 01:39..18:27 (provider ticket from that date mentions an unresponsive hypervisor host); cluster is recovering now but kibana on logstash.o.o is currently missing entries past 21:25 utc yesterday
  • 2017-07-21 18:41:02 UTC docs.o.o is currently broken, we're investigating
  • 2017-07-21 17:07:30 UTC Restarting Gerrit for our weekly memory leak cleanup.
  • 2017-07-19 23:07:08 UTC restarted nodepool-launcher which had frozen (did not respond to SIGUSR2)
  • 2017-07-19 13:24:08 UTC the lists.o.o server is temporarily in emergency disable mode pending merger of https://review.openstack.org/484989
  • 2017-07-17 20:39:01 UTC /srv/static/tarballs/trove/images/ubuntu/mysql.qcow2 has been removed from static.openstack.org again
  • 2017-07-14 13:39:41 UTC deleted duplicate mirror.la1.citycloud and forced regeneration of dynamic inventory to get it to show up
  • 2017-07-13 19:09:37 UTC docs maintenance is complete and afsdb01 puppet and vos release cronjob have been reenabled
  • 2017-07-13 18:11:47 UTC puppet updates for afsdb01 have been temporarily suspended and its vos release cronjob disabled in preparation for manually reorganizing the docs volume
  • 2017-07-13 00:17:28 UTC zl08.o.o and zl09.o.o are now online and functional.
  • 2017-07-12 16:28:32 UTC both mirrors in infracloud-chocolate and infracloud-vanilla replaced with 250GB HDD mirror flavors now.
  • 2017-07-12 14:46:27 UTC DNS for mirror.regionone.infracloud-chocolate.openstack.org changed to 15.184.69.112, 60min TTL
  • 2017-07-12 13:22:05 UTC DNS for mirror.regionone.infracloud-vanilla.openstack.org changed to 15.184.66.172, 60min TTL
  • 2017-07-12 07:59:43 UTC Gerrit has been successfully restarted
  • 2017-07-12 07:51:20 UTC Gerrit is going to be restarted, due to low performance
  • 2017-07-12 06:53:30 UTC FYI, ask.openstack.org is down, review.o.o is slow - please have patience until this is fixed
  • 2017-07-11 18:00:06 UTC small hiccup in review-dev gerrit 2.13.8 -> 2.13.9 upgrade. Will be offline temporarily while we wait on puppet to curate lib installations
  • 2017-07-10 21:03:37 UTC 100gb cinder volume added and corresponding proxycache logical volume mounted at /var/cache/apache2 on mirrors for ca-ymq-1.vexxhost, dfw.rax, iad.rax, mtl01.internap, ord.rax, regionone.osic-cloud1
  • 2017-07-10 21:01:51 UTC zuul service on zuul.openstack.org restarted to clear memory utilization from slow leak
  • 2017-07-10 19:22:40 UTC similarly reinstalled tox on all other ubuntu-based zuul_nodes tracked in hiera (centos nodes seem to have been unaffected)
  • 2017-07-10 19:04:46 UTC reinstalled tox on proposal.slave.o.o using python 2.7, as it had defaulted to 3.4 at some point in the past (possibly related to the pip vs pip3 mixup last month)
  • 2017-07-10 17:01:01 UTC old mirror lv on static.o.o reclaimed to extend the tarballs lv by 150g
  • 2017-07-06 23:45:47 UTC nb03.openstack.org has been cleaned up and rebooted, and should return to building rotation
  • 2017-07-06 12:01:55 UTC docs.openstack.org is up again.
  • 2017-07-06 11:17:42 UTC docs.openstack.org has internal error (500). Fix is underway.
  • 2017-07-03 15:40:16 UTC "docs.openstack.org is working fine again; due to the move to the new location, each repo needs to merge one change to appear on docs.o.o"
  • 2017-07-03 15:26:19 UTC rebooting files01.openstack.org to clear up defunct apache2 zombies ignoring sigkill
  • 2017-07-03 15:21:17 UTC "We're experiencing a few problems with the reorg on docs.openstack.org and are looking into these..."
  • 2017-07-03 14:39:21 UTC We have now switched all docs publishing jobs to new documentation builds. For details see dhellmann's email http://lists.openstack.org/pipermail/openstack-dev/2017-July/119221.html . For problems, join us on #openstack-doc
  • 2017-07-01 00:33:44 UTC Reissued through June 2018 and manually tested all externally issued SSL/TLS certificates for our servers/services
  • 2017-06-29 18:03:43 UTC review-dev has been upgraded to gerrit 2.13.8. Please test behavior and functionality and note any abnormalities on https://etherpad.openstack.org/p/gerrit-2.13.-upgrade-steps
  • 2017-06-23 08:05:47 UTC ok git.openstack.org is working again, you can recheck failed jobs
  • 2017-06-23 06:06:21 UTC unknown issue with the git farm, everything broken - we're investigating
  • 2017-06-20 21:19:32 UTC The Gerrit service on review-dev.openstack.org is being taken offline for an upgrade to 2.13.7.4.988b40f
  • 2017-06-20 15:41:54 UTC Restarted openstack-paste service on paste.openstack.org as the lodgeit runserver process was hung and unresponsive (required SIGTERM followed by SIGHUP before it would exit)
  • 2017-06-20 12:57:52 UTC restarting gerrit to address slowdown issues
  • 2017-06-18 21:29:58 UTC Image builds for ubuntu-trusty are paused and have been rolled back to yesterday until DNS issues can be unraveled
  • 2017-06-17 03:03:42 UTC zuulv3.o.o and ze01.o.o now using SSL/TLS for gearman operations
  • 2017-06-09 14:58:36 UTC The Gerrit service on review.openstack.org is being restarted now to clear an issue arising from an unanticipated SSH API connection flood
  • 2017-06-09 14:06:10 UTC Blocked 169.48.164.163 in iptables on review.o.o temporarily for excessive connection counts
  • 2017-06-07 20:40:18 UTC Blocked 60.251.195.198 in iptables on review.o.o temporarily for excessive connection counts
  • 2017-06-07 20:39:49 UTC Blocked 113.196.154.248 in iptables on review.o.o temporarily for excessive connection counts
  • 2017-06-07 20:07:25 UTC The Gerrit service on review.openstack.org is being restarted now to clear some excessive connection counts while we debug the intermittent request failures reported over the past few minutes
  • 2017-06-07 19:59:08 UTC Blocked 169.47.209.131, 169.47.209.133, 113.196.154.248 and 210.12.16.251 in iptables on review.o.o temporarily while debugging excessive connection counts
  • 2017-06-06 19:27:56 UTC both zuulv3.o.o and ze01.o.o are online and under puppet cfgmgmt
  • 2017-06-05 22:30:53 UTC Puppet updates are once again enabled for review-dev.openstack.org
  • 2017-06-05 14:37:25 UTC review-dev.openstack.org has been added to the emergency disable list for Puppet updates so additional trackingid entries can be tested there
  • 2017-06-01 14:35:16 UTC python-setuptools 36.0.1 has been released and now making its way into jobs. Feel free to 'recheck' your failures. If you have any problems, please join #openstack-infra
  • 2017-06-01 09:46:17 UTC There is a known issue with setuptools 36.0.0 and errors about the "six" package. For current details see https://github.com/pypa/setuptools/issues/1042 and monitor #openstack-infra
  • 2017-05-27 12:05:22 UTC The Gerrit service on review.openstack.org is restarting to clear some hung API connections and should return to service momentarily.
  • 2017-05-26 20:58:41 UTC OpenStack general mailing list archives from Launchpad (July 2010 to July 2013) have been imported into the current general archive on lists.openstack.org.
  • 2017-05-26 09:57:14 UTC Free space for logs.openstack.org reached 40GiB, so an early log expiration run (45 days) is underway in a root screen session.
  • 2017-05-25 23:18:21 UTC The nodepool-dsvm jobs are failing for now, until we reimplement zookeeper handling in our devstack plugin
  • 2017-05-24 17:46:12 UTC nb03.o.o and nb04.o.o are online (upgraded to xenial). Will be waiting a day or 2 before deleting nb01.o.o and nb02.o.o.
  • 2017-05-24 14:52:39 UTC both nb01.o.o and nb02.o.o are stopped. This is to allow nb03.o.o to build todays images
  • 2017-05-24 04:10:31 UTC Sufficient free space has been reclaimed that jobs are passing again; any POST_FAILURE results can now be rechecked.
  • 2017-05-23 21:25:01 UTC The logserver has filled up, so jobs are currently aborting with POST_FAILURE results; remediation is underway.
  • 2017-05-23 14:04:47 UTC Disabled Gerrit account 10842 (Xiexianbin) for posting unrequested third-party CI results on changes
  • 2017-05-17 10:55:41 UTC gerrit is being restarted to help stuck git replication issues
  • 2017-05-15 07:02:20 UTC eavesdrop is up again, logs from Sunday 21:36 to Monday 7:01 are missing
  • 2017-05-15 06:42:55 UTC eavesdrop is currently not getting updated
  • 2017-05-12 13:39:24 UTC The Gerrit service on http://review.openstack.org is being restarted to address hung remote replication tasks.
  • 2017-05-11 18:42:55 UTC OpenID authentication through LP/UO SSO is working again
  • 2017-05-11 17:29:50 UTC The Launchpad/UbuntuOne SSO OpenID provider is offline, preventing logins to review.openstack.org, wiki.openstack.org, et cetera; ETA for fix is unknown
  • 2017-05-03 18:54:36 UTC Gerrit on review.openstack.org is being restarted to mitigate a memory leak in Gerrit. Service should return shortly.
  • 2017-05-01 18:15:44 UTC Upgraded wiki.openstack.org from MediaWiki 1.28.0 to 1.28.2 for CVE-2017-0372
  • 2017-04-27 17:52:33 UTC DNS has been updated for the new redirects added to static.openstack.org, moving them off old-wiki.openstack.org (which is now being taken offline)
  • 2017-04-25 15:52:41 UTC Released bindep 2.4.0
  • 2017-04-21 20:38:54 UTC Gerrit is back in service and generally usable, though remote Git replicas (git.openstack.org and github.com) will be stale for the next few hours until online reindexing completes
  • 2017-04-21 20:06:20 UTC Gerrit is offline briefly for scheduled maintenance http://lists.openstack.org/pipermail/openstack-dev/2017-April/115702.html
  • 2017-04-21 19:44:12 UTC Gerrit will be offline briefly starting at 20:00 for scheduled maintenance http://lists.openstack.org/pipermail/openstack-dev/2017-April/115702.html
  • 2017-04-18 21:51:51 UTC nodepool.o.o restarted to pick up https://review.openstack.org/#/c/455466/
  • 2017-04-14 17:23:54 UTC vos release npm.mirror --localauth currently running from screen in afsdb01
  • 2017-04-14 02:01:28 UTC wiki.o.o required a hard restart due to host issues following rackspace network maintenance
  • 2017-04-13 19:53:37 UTC The Gerrit service on http://review.openstack.org is being restarted to address hung remote replication tasks.
  • 2017-04-13 08:52:57 UTC zuul was restarted due to an unrecoverable disconnect from gerrit. If your change is missing a CI result and isn't listed in the pipelines on http://status.openstack.org/zuul/ , please recheck
  • 2017-04-12 21:27:31 UTC Restarting Gerrit for our weekly memory leak cleanup.
  • 2017-04-11 14:48:58 UTC we have rolled back centos-7, fedora-25 and ubuntu-xenial images to the previous days release. Feel free to recheck your jobs now.
  • 2017-04-11 14:28:32 UTC latest base images have mistakenly put python3 in some places expecting python2 causing widespread failure of docs patches - fixes are underway
  • 2017-04-11 02:17:51 UTC bindep 2.3.0 released to fix fedora 25 image issues
  • 2017-04-09 16:23:03 UTC lists.openstack.org is back online. Thanks for your patience.
  • 2017-04-09 15:18:22 UTC We are performing unscheduled maintenance on lists.openstack.org; the service is currently down. We'll post a follow-up shortly
  • 2017-04-07 19:00:49 UTC ubuntu-precise has been removed from nodepool.o.o, thanks for the memories
  • 2017-04-06 15:00:18 UTC zuulv3 is offline awaiting a security update.
  • 2017-04-05 14:02:24 UTC git.openstack.org is synced up
  • 2017-04-05 12:53:14 UTC The Gerrit service on http://review.openstack.org is being restarted to address hung remote replication tasks, and should return to an operable state momentarily
  • 2017-04-05 11:16:06 UTC cgit.openstack.org is not up to date
  • 2017-04-04 16:13:40 UTC The openstackid-dev server has been temporarily rebuilt with a 15GB performance flavor in preparation for application load testing
  • 2017-04-01 13:29:37 UTC The http://logs.openstack.org/ site is back in operation; previous logs as well as any uploaded during the outage should be available again; jobs which failed with POST_FAILURE can also be safely rechecked.
  • 2017-03-31 21:52:06 UTC The upgrade maintenance for lists.openstack.org has been completed and it is back online.
  • 2017-03-31 20:00:04 UTC lists.openstack.org will be offline from 20:00 to 23:00 UTC for planned upgrade maintenance
  • 2017-03-31 08:27:06 UTC logs.openstack.org has corrupted disks; repairs are underway. Please avoid rechecking until this is fixed
  • 2017-03-31 07:46:38 UTC Jobs in gate are failing with POST_FAILURE. Infra roots are investigating
  • 2017-03-30 17:05:30 UTC The Gerrit service on review.openstack.org is being restarted briefly to relieve performance issues, and should return to service again momentarily.
  • 2017-03-29 18:47:18 UTC statusbot restarted since it seems to have fallen victim to a ping timeout (2017-03-26 20:55:32) and never realized it
  • 2017-03-23 19:13:06 UTC eavesdrop.o.o cinder volume rotated to avoid rackspace outage on Friday March 31 03:00-09:00 UTC
  • 2017-03-23 16:20:33 UTC Cinder volumes static.openstack.org/main08, eavesdrop.openstack.org/main01 and review-dev.openstack.org/main01 will lose connectivity Friday March 31 03:00-09:00 UTC unless replaced by Wednesday March 29.
  • 2017-03-21 08:43:22 UTC Wiki problems have been fixed; it's up and running
  • 2017-03-21 00:44:19 UTC LP bugs for monasca migrated to openstack/monasca-api in StoryBoard, defcore to openstack/defcore, refstack to openstack/refstack
  • 2017-03-16 15:59:20 UTC The Gerrit service on review.openstack.org is being restarted to address hung remote replication tasks, and should return to an operable state momentarily
  • 2017-03-16 11:49:38 UTC paste.openstack.org service is back up - turns out it was a networking issue, not a database issue. yay networks!
  • 2017-03-16 11:02:17 UTC paste.openstack.org is down, due to connectivity issues with backend database. support ticket has been created.
  • 2017-03-14 16:07:35 UTC Changes https://review.openstack.org/444323 and https://review.openstack.org/444342 have been approved, upgrading https://openstackid.org/ production to what's been running and tested on https://openstackid-dev.openstack.org/
  • 2017-03-14 13:55:27 UTC Gerrit has been successfully restarted
  • 2017-03-14 13:49:09 UTC Gerrit has been successfully restarted
  • 2017-03-14 13:42:50 UTC Gerrit is going to be restarted due to performance problems
  • 2017-03-14 04:22:30 UTC gerrit under load throwing 503 errors. Service restart fixed symptoms and appears to be running smoothly
  • 2017-03-13 17:46:25 UTC restarting gerrit to address performance problems
  • 2017-03-09 16:43:59 UTC nodepool-builder restarted on nb02.o.o after remounting /opt file system
  • 2017-03-07 15:59:57 UTC compute085.chocolate.ic.o.o back in service
  • 2017-03-07 15:46:03 UTC compute085.chocolate.ic.o.o currently disabled on controller00.chocolate.ic.o.o, investigating a failure with the neutron linuxbridge agent
  • 2017-03-06 21:33:48 UTC nova-compute for compute035.vanilla.ic.o.o has been disabled on controller.vanilla.ic.o.o. compute035.vanilla.ic.o.o appears to be having an HDD issue and is currently in read-only mode.
  • 2017-03-06 21:17:46 UTC restarting gerrit to address performance problems
  • 2017-03-04 14:36:00 UTC CORRECTION: The afs01.dfw.openstack.org/main01 volume has been successfully replaced by afs01.dfw.openstack.org/main04 and is therefore no longer impacted by the coming block storage maintenance.
  • 2017-03-04 13:35:22 UTC The afs01.dfw.openstack.org/main01 volume has been successfully replaced by review.openstack.org/main02 and is therefore no longer impacted by the coming block storage maintenance.
  • 2017-03-03 21:47:51 UTC The review.openstack.org/main01 volume has been successfully replaced by review.openstack.org/main02 and is therefore no longer impacted by the coming block storage maintenance.
  • 2017-03-03 16:39:58 UTC Upcoming provider maintenance 04:00-10:00 UTC Wednesday, March 8 impacting Cinder volumes for: afs01.dfw, nb02 and review
  • 2017-03-03 14:28:54 UTC integrated gate is blocked by a job waiting for a trusty-multinode node
  • 2017-03-01 14:26:12 UTC Provider maintenance resulted in loss of connectivity to the static.openstack.org/main06 block device taking our docs-draft logical volume offline; filesystem recovery has been completed and the volume brought back into service.
  • 2017-02-28 23:13:36 UTC manually installed paramiko 1.18.1 on nodepool.o.o and restarted nodepool (due to suspected bug related to https://github.com/paramiko/paramiko/issues/44 in 1.18.2)
  • 2017-02-28 13:45:41 UTC gerrit is back to normal and I don't know how to use the openstackstatus bot
  • 2017-02-28 13:39:11 UTC ok gerrit is back to normal
  • 2017-02-28 13:10:06 UTC restarting gerrit to address performance problems
  • 2017-02-23 14:40:37 UTC nodepool-builder (nb01.o.o / nb02.o.o) stopped again. As a result of zuulv3-dev.o.o usage of infra-chocolate, we are accumulating DIB images on disk
  • 2017-02-23 13:42:06 UTC The mirror update process has completed and resulting issue confirmed solved; any changes whose jobs failed on invalid qemu package dependencies can now be safely rechecked to obtain new results.
  • 2017-02-23 13:05:37 UTC Mirror update failures are causing some Ubuntu-based jobs to fail on invalid qemu package dependencies; the problem mirror is in the process of updating now, so this condition should clear shortly.
  • 2017-02-22 14:55:51 UTC Created Continuous Integration Tools Development in All-Projects.git (UI), added zuul gerrit user to the group.
  • 2017-02-17 19:05:17 UTC Restarting gerrit due to performance problems
  • 2017-02-17 07:48:00 UTC osic-cloud disabled again, see https://review.openstack.org/435250 for some background
  • 2017-02-16 21:37:58 UTC zuulv3-dev.o.o is now online. Zuul services are currently stopped.
  • 2017-02-16 18:19:17 UTC osic-cloud1 temporarily disabled. Currently waiting for the root cause of the networking issues.
  • 2017-02-15 23:18:25 UTC nl01.openstack.org (nodepool-launcher) is now online. Nodepool services are disabled.
  • 2017-02-15 20:58:25 UTC We're currently battling an increase in log volume which isn't leaving sufficient space for new jobs to upload logs and results in POST_FAILURE in those cases; recheck if necessary but keep spurious rebasing and rechecking to a minimum until we're in the clear.
  • 2017-02-14 23:08:17 UTC Hard rebooted mirror.ca-ymq-1.vexxhost.openstack.org because vgs was hanging indefinitely, impacting our ansible/puppet automation
  • 2017-02-13 17:20:54 UTC AFS replication issue has been addressed. Mirrors are currently re-syncing and coming back online.
  • 2017-02-13 15:51:28 UTC We are currently investigating an issue with our AFS mirrors which is causing some projects jobs to fail. We are working to correct the issue.
  • 2017-02-10 14:14:43 UTC The afs02.dfw.openstack.org/main02 volume in Rackspace DFW is expected to become unreachable between 04:00-10:00 UTC Sunday and may require corrective action on afs02.dfw.o.o as a result
  • 2017-02-10 14:12:44 UTC Rackspace will be performing Cinder maintenance in DFW from 04:00 UTC Saturday through 10:00 Sunday (two windows scheduled)
  • 2017-02-09 20:21:48 UTC Restarting gerrit due to performance problems
  • 2017-02-09 20:18:23 UTC Restarting gerrit due to performance problems
  • 2017-02-08 11:36:51 UTC The proposal node had disconnected from the static zuul-launcher. Restarting the launcher has restored connection and proposal jobs are running again
  • 2017-02-08 10:37:14 UTC post and periodic jobs are not running; it seems the proposal node is down
  • 2017-02-07 16:36:10 UTC restarted gerritbot since messages seemed to be going into a black hole
  • 2017-02-06 18:15:12 UTC rax notified us that the host groups.o.o is on was rebooted
  • 2017-02-04 17:44:18 UTC zuul-launchers restarted to pick up 428740
  • 2017-02-03 19:46:54 UTC elastic search delay (elastic-recheck) appears to have recovered. logstash daemon was stopped on logstash-workers, then started. Our logprocessors were also restarted
  • 2017-02-03 14:13:27 UTC static.o.o root partition at 100%, deleted apache2 logs older than 5 days in /var/log/apache2 to free up space
  • 2017-02-02 22:53:06 UTC Restarting gerrit due to performance problems
  • 2017-01-30 21:12:09 UTC increased quota on afs volume mirror.pypi from 500G to 1T
  • 2017-01-25 12:51:30 UTC Gerrit has been successfully restarted
  • 2017-01-25 12:48:18 UTC Gerrit is going to be restarted due to slow performance
  • 2017-01-24 18:16:30 UTC HTTPS cert and chain for zuul.openstack.org has been renewed and replaced.
  • 2017-01-24 18:16:22 UTC HTTPS cert and chain for ask.openstack.org has been renewed and replaced.
  • 2017-01-14 08:34:53 UTC OSIC cloud has been taken down temporarily, see https://review.openstack.org/420275
  • 2017-01-12 20:36:29 UTC Updated: Gerrit will be offline until 20:45 for scheduled maintenance (running longer than anticipated): http://lists.openstack.org/pipermail/openstack-dev/2017-January/109910.html
  • 2017-01-12 20:11:24 UTC Gerrit will be offline between now and 20:30 for scheduled maintenance: http://lists.openstack.org/pipermail/openstack-dev/2017-January/109910.html
  • 2017-01-12 17:41:11 UTC fedora (25) AFS mirror now online.
  • 2017-01-11 02:09:00 UTC manually disabled puppet ansible runs from puppetmaster.openstack.org in crontab due to CVE-2016-9587
  • 2017-01-11 02:08:10 UTC upgraded ansible on all zuul launchers due to CVE-2016-9587. see https://bugzilla.redhat.com/show_bug.cgi?id=1404378 and https://review.openstack.org/418636
  • 2017-01-10 20:14:26 UTC docs.openstack.org served from afs via files01.openstack.org
  • 2017-01-09 19:23:20 UTC Using "ironic node-set-maintenance $node off && ironic node-set-power-state $node reboot", infracloud hypervisors that had disappeared were brought back to life. The mirror VM was then re-enabled with "openstack server set $vm_name active".
  • 2017-01-09 15:09:23 UTC Nodepool use of Infra-cloud's chocolate region has been disabled with https://review.openstack.org/417904 while nova host issues impacting its mirror instance are investigated.
  • 2017-01-09 15:08:02 UTC All zuul-launcher services have been emergency restarted so that zuul.conf change https://review.openstack.org/417679 will take effect.
  • 2017-01-08 09:43:24 UTC AFS doc publishing is broken, we have read-only file systems.
  • 2017-01-07 01:03:27 UTC docs and docs.dev (developer.openstack.org) afs volumes now have read-only replicas in dfw and ord, and they are being served by files01.openstack.org. a script runs on afsdb01 every 5 minutes to release them if there are any changes.
  • 2017-01-04 22:18:51 UTC elasticsearch rolling upgrade to version 1.7.6 is complete and cluster is recovered
  • 2017-01-02 21:30:44 UTC logstash daemons were 'stuck' and have been restarted on logstash-worker0X.o.o hosts. Events are being processed and indexed again as a result. Should probably look into upgrading the logstash install (and possibly elasticsearch)
  • 2016-12-29 11:11:50 UTC logs.openstack.org is up again. Feel free to recheck any failures.
  • 2016-12-29 08:20:50 UTC All CI tests are currently broken since logs.openstack.org is down. Refrain from recheck or approval until this is fixed.
  • 2016-12-29 03:00:42 UTC review.o.o (gerrit) restarted
  • 2016-12-21 18:00:07 UTC Gerrit is being restarted to update its OpenID SSO configuration
  • 2016-12-16 00:17:36 UTC nova services restarted on controller00.chocolate.ic.openstack.org to fix nodes failing to launch; unsure why this fixed the issue
  • 2016-12-14 23:06:05 UTC nb01.o.o and nb02.o.o added to emergency file on puppetmaster. To manually apply https://review.openstack.org/#/c/410988/
  • 2016-12-14 17:00:06 UTC nb01.o.o and nb02.o.o builders restarted and running from master again. nodepool.o.o did not restart, but /opt/nodepool is pointing to master branch
  • 2016-12-13 17:04:17 UTC Canonical admins have resolved the issue with login.launchpad.net, so authentication should be restored now.
  • 2016-12-13 16:27:33 UTC Launchpad SSO is not currently working, so logins to our services like review.openstack.org and wiki.openstack.org are failing; the admins at Canonical are looking into the issue but there is no estimated time for a fix yet.
  • 2016-12-12 15:08:04 UTC The Gerrit service on review.openstack.org is restarting now to address acute performance issues, and will be back online momentarily.
  • 2016-12-09 23:11:43 UTC manually ran "pip uninstall pyopenssl" on refstack.openstack.org to resolve a problem with requests/cryptography/pyopenssl/mod_wsgi
  • 2016-12-09 22:00:09 UTC elasticsearch has finished shard recovery and relocation. Cluster is now green
  • 2016-12-09 19:03:15 UTC launcher/deleter on nodepool.o.o are now running the zuulv3 branch. zookeeper based nodepool builders (nb01, nb02) are in production
  • 2016-12-09 18:57:39 UTC performed full elasticsearch cluster restart in an attempt to get it to fully recover and go green. Previously was yellow for days unable to initialize some replica shards. Recovery of shards in progress now.
  • 2016-12-08 19:48:07 UTC nb01.o.o / nb02.o.o removed from emergency file
  • 2016-12-08 19:16:13 UTC nb01.o.o / nb02.o.o added to emergency file on puppetmaster
  • 2016-12-07 19:00:56 UTC The zuul-launcher service on zlstatic01 has been restarted following application of fix https://review.openstack.org/408194
  • 2016-12-05 18:55:57 UTC Further project-config changes temporarily frozen for approval until xenial job cut-over changes merge, in an effort to avoid unnecessary merge conflicts.
  • 2016-11-30 16:43:16 UTC afs01.dfw.o.o / afs02.dfw.o.o /dev/mapper/main-vicepa increased to 3TB
  • 2016-11-24 14:49:29 UTC OpenStack CI is processing jobs again. Thanks to the Canadian admin "team" that had their Thanksgiving holiday already ;) Jobs are all enqueued, no need to recheck.
  • 2016-11-24 13:40:03 UTC OpenStack CI has taken a Thanksgiving break; no new jobs are currently launched. We're currently hoping for a friendly admin to come out of Thanksgiving and fix the system.
  • 2016-11-24 05:40:46 UTC The affected filesystems on the log server are repaired. Please leave 'recheck' comments on any changes which failed with POST_FAILURE.
  • 2016-11-24 00:14:50 UTC Due to a problem with the cinder volume backing the log server, jobs are failing with POST_FAILURE. Please avoid issuing 'recheck' commands until the issue is resolved.
  • 2016-11-23 22:56:05 UTC Configuration management updates are temporarily disabled for openstackid.org in preparation for validating change 399253.
  • 2016-11-23 22:56:01 UTC The affected filesystems on the log server are repaired. Please leave 'recheck' comments on any changes which failed with POST_FAILURE.
  • 2016-11-23 22:45:15 UTC This message is to inform you that your Cloud Block Storage device static.openstack.org/main05 has been returned to service.
  • 2016-11-23 21:11:19 UTC Due to a problem with the cinder volume backing the log server, jobs are failing with POST_FAILURE. Please avoid issuing 'recheck' commands until the issue is resolved.
  • 2016-11-23 20:57:14 UTC received at 20:41:09 UTC: This message is to inform you that our monitoring systems have detected a problem with the server which hosts your Cloud Block Storage device 'static.openstack.org/main05' at 20:41 UTC. We are currently investigating the issue and will update you as soon as we have additional information regarding the alert. Please do not access or modify 'static.openstack.org/main05' during this process.
  • 2016-11-22 21:12:27 UTC Gerrit is offline until 21:30 UTC for scheduled maintenance: http://lists.openstack.org/pipermail/openstack-dev/2016-November/107379.html
  • 2016-11-22 14:29:16 UTC rebooted ask.openstack.org for a kernel update
  • 2016-11-21 12:20:56 UTC We are currently having capacity issues with our ubuntu-xenial nodes. We have addressed the issue but it will be another few hours before new images have been uploaded to all cloud providers.
  • 2016-11-17 19:18:55 UTC zl04 is restarted now as well. This concludes the zuul launcher restarts for ansible synchronize logging workaround
  • 2016-11-17 19:06:28 UTC all zuul launchers except for zl04 restarted to pick up error logging fix for synchronize tasks. zl04 failed to stop and is being held aside for debugging purposes
  • 2016-11-15 18:58:00 UTC developer.openstack.org is now served from files.openstack.org
  • 2016-11-14 19:32:12 UTC Correction, https://review.openstack.org/396428 changes logs-DEV.openstack.org behavior, rewriting nonexistent files to their .gz compressed counterparts if available.
  • 2016-11-14 19:30:38 UTC https://review.openstack.org/396428 changes logs.openstack.org behavior, rewriting nonexistent files to their .gz compressed counterparts if available.
  • 2016-11-14 17:54:20 UTC Gerrit on review.o.o restarted to deal with GarbageCollection eating all the cpu. Previous restart was November 7th, so we lasted one week.
  • 2016-11-11 18:43:11 UTC This message is to inform you that our monitoring systems have detected a problem with the server which hosts your Cloud Block Storage device 'wiki-dev.openstack.org/main01' at 18:27 UTC. We are currently investigating the issue and will update you as soon as we have additional information regarding the alert. Please do not access or modify 'wiki-dev.openstack.org/main01' during this process.
  • 2016-11-11 13:01:03 UTC Our OpenStack CI system is coming back online again. Thanks for your patience.
  • 2016-11-11 12:02:09 UTC Our OpenStack CI systems are stuck and no new jobs are submitted. Please do not recheck - and do not approve changes until this is fixed.
  • 2016-11-11 11:50:51 UTC nodepool/zuul look currently stuck, looks like no new jobs are started
  • 2016-11-10 17:09:24 UTC restarted all zuul-launchers to pick up https://review.openstack.org/394658
  • 2016-11-07 23:09:54 UTC removed the grafana keynote demo dashboard using curl -X DELETE http://grafyamlcreds@localhost:8080/api/dashboards/db/nodepool-new-clouds
  • 2016-11-07 08:47:58 UTC Gerrit is going to be restarted due to slowness and proxy errors
  • 2016-11-04 20:05:04 UTC The old phabricator demo server has been deleted.
  • 2016-11-04 20:04:39 UTC The old (smaller) review-dev server which was replaced in August has now been deleted.
  • 2016-11-02 14:47:47 UTC All hidden Gerrit groups owned by Administrators with no members or inclusions have been prefixed with "Unused-" for possible future (manual) deletion.
  • 2016-10-28 08:57:35 UTC restart apache2 on etherpad.o.o to clear out stale connections
  • 2016-10-27 11:23:46 UTC The nodepool-builder service on nodepool.o.o has been started again now that our keynote demo is complete.
  • 2016-10-26 05:42:32 UTC The Gerrit service on review.openstack.org is being restarted now to guard against potential performance issues later this week.
  • 2016-10-25 13:51:11 UTC The nodepool-builder process is intentionally stopped on nodepool.openstack.org and will be started again tomorrow after noon UTC.
  • 2016-10-21 20:44:36 UTC nodepool is in emergency file so that nodepool config can be more directly managed temporarily
  • 2016-10-20 18:10:09 UTC The Gerrit service on review.openstack.org is being restarted now in an attempt to resolve some mismatched merge states on a few changes, but should return momentarily.
  • 2016-10-20 17:26:37 UTC restarted ansible launchers with 2.5.2.dev31
  • 2016-10-18 23:42:50 UTC restarted logstash daemons as well to get logstash pipeline moving again. Appears they all went out to lunch for some reason (logstash logs not so great but they stopped reading from the tcp connection with log workers according to strace)
  • 2016-10-18 17:44:47 UTC logstash worker daemons restarted as they have all deadlocked. Proper fix in https://review.openstack.org/388122
  • 2016-10-18 16:12:40 UTC pycparser 2.16 released to fix assertion error from today.
  • 2016-10-18 14:06:54 UTC We are aware of pycparser failures in the gate and working to address the issue.
  • 2016-10-12 21:33:19 UTC bandersnatch manually synced and mirror.pypi vos released to get around timeout on cron. Mirror appears to have reached steady state and should sync properly again.
  • 2016-10-11 02:49:34 UTC Jobs running on osic nodes are failing due to network issues with the mirror. We are temporarily disabling the cloud.
  • 2016-10-10 07:11:12 UTC Nodepool images can now be built for Gentoo as well - https://review.openstack.org/#/c/310865
  • 2016-10-07 16:46:26 UTC full sync of bandersnatch started, to pickup missing packages from AFS quota issue this morning
  • 2016-10-07 12:30:07 UTC mirror.pypi quota (AFS) bumped to 500GB (up from 400GB)
  • 2016-10-07 12:28:59 UTC mirror.pypi quota (AFS) bumped to 500MB (up from 400MB)
  • 2016-10-06 18:56:31 UTC nodepool now running 3 separate daemons with configuration managed by puppet. If you can, always make sure there is a deleter running before starting a launcher, to avoid leaking nodes.
  • 2016-10-05 03:15:53 UTC X.509 certificate renewed and updated in private hiera for openstackid.org
  • 2016-10-04 14:02:29 UTC The Gerrit service on review.openstack.org is being restarted to address performance degradation and should return momentarily
  • 2016-09-29 15:01:26 UTC manually running log_archive_maintenance.sh to make room for logs on static.o.o
  • 2016-09-26 16:12:24 UTC Launchpad SSO logins are confirmed working correctly again
  • 2016-09-26 15:50:16 UTC gerrit login manually set to error page in apache config to avoid accidental account creation while lp sso is offline
  • 2016-09-26 15:50:13 UTC Launchpad SSO is offline, preventing login to https://review.openstack.org/, https://wiki.openstack.org/ and many other sites; no ETA has been provided by the LP admin team
  • 2016-09-26 15:44:08 UTC Earlier job failures for "zuul-cloner: error: too few arguments" should now be solved, and can safely be rechecked
  • 2016-09-26 15:37:34 UTC added review.openstack.org to emergency disabled file
  • 2016-09-26 15:28:35 UTC A 4gb swapfile has been added on cacti.openstack.org at /swap while we try to work out what flavor its replacement should run
  • 2016-09-23 22:40:31 UTC mirror.iad.rax.openstack.org has been rebooted to restore sanity following connectivity issues to its cinder volume
  • 2016-09-22 14:50:38 UTC Rebooted wheel-mirror-centos-7-amd64.slave.openstack.org to clear persistent PAG creation error
  • 2016-09-22 04:44:55 UTC A bandersnatch update is running under a root screen session on mirror-update.openstack.org
  • 2016-09-21 13:44:26 UTC disabled apache2/puppetmaster processes on puppetmaster.openstack.org
  • 2016-09-20 14:44:06 UTC infra-cloud has been enabled again.
  • 2016-09-20 13:45:20 UTC OpenStack Infra now has a Twitter bot, follow it at https://twitter.com/openstackinfra
  • 2016-09-20 13:38:56 UTC infra-cloud temporarily taken off to debug some glance issues.
  • 2016-09-20 13:37:49 UTC openstack infra now has a twitter bot, follow it at https://twitter.com/openstackinfra
  • 2016-09-18 16:35:31 UTC The /srv/mediawiki filesystem for the production wiki site had communication errors, so has been manually put through an offline fsck and remounted again
  • 2016-09-13 17:12:12 UTC The Gerrit service on review.openstack.org is being restarted now to address current performance problems, but should return to a working state within a few minutes
  • 2016-09-09 16:59:50 UTC setuptools 27.1.2 addresses the circular import
  • 2016-09-09 15:56:05 UTC New setuptools release appears to have a circular import which is breaking many jobs - check for ImportError: cannot import name monkey.
  • 2016-09-08 01:26:02 UTC restarted nodepoold and nodepool-builder to pick up a change that should prevent leaking images when we hit the 8 hour image timeout.
  • 2016-09-07 20:21:28 UTC controller00 of infracloud is put on emergency hosts, as neutron debugging has been tweaked to investigate sporadic connect timeouts; please leave as-is until we get more errors in the logs
  • 2016-09-02 19:16:43 UTC Gerrit is completing an online re-index, you may encounter slowness until it is complete
  • 2016-09-02 18:07:50 UTC Gerrit is now going offline for maintenance, reserving a maintenance window through 22:00 UTC.
  • 2016-09-02 17:39:48 UTC The infrastructure team is taking Gerrit offline for maintenance, beginning shortly after 18:00 UTC for a potentially 4 hour maintenance window.
  • 2016-09-02 15:23:22 UTC The Gerrit service on review.openstack.org is restarting quickly to relieve resource pressure and restore normal performance
  • 2016-09-02 12:24:51 UTC restarted nodepool with the latest shade and nodepool changes. all looks well - floating-ips, images and flavors are not being hammered
  • 2016-09-02 05:38:24 UTC Space has been freed up on the log server. If you have POST_FAILURE results it is now safe to issue a 'recheck'
  • 2016-09-02 05:12:19 UTC The logs volume is full causing jobs to fail with POST_FAILURE. This is being worked on, please do not recheck until notified.
  • 2016-08-31 22:29:18 UTC that way the cloud8 people can work on getting the ips sorted in parallel
  • 2016-08-31 22:29:06 UTC in the mean time, it was suggested as a workaround to just use the cloud1 mirror since they're in the same data center by pointing the dns record there
  • 2016-08-31 22:28:50 UTC the networking in cloud8 is such that our mirror is behind a double NAT - so our automation has no idea what the actual ip of the server is ... the cloud8 people are looking into fixing this, but there are things outside of their immediate control
  • 2016-08-29 17:43:00 UTC email sent to rackspace about rax-iad networking issue. The region is still disabled in nodepool
  • 2016-08-26 19:19:03 UTC restarted apache2 on health.o.o to remove a runaway apache process using all the cpu and memory. Looked like it may be related to mysql connections issues. DB currently looks happy.
  • 2016-08-25 23:20:30 UTC mirror.mtl01.internap.openstack.org now online
  • 2016-08-25 19:47:45 UTC The Gerrit service on review.openstack.org is restarting to implement some performance tuning adjustments, and should return to working order momentarily.
  • 2016-08-23 20:07:55 UTC mirror.regionone.osic-cloud1.openstack.org upgraded to support both ipv4 / ipv6. DNS has also been updated.
  • 2016-08-23 16:53:58 UTC The https://wiki.openstack.org/ site (temporarily hosted from wiki-upgrade-test.o.o) has been updated from Mediawiki 1.27.0 to 1.27.1 per https://lists.wikimedia.org/pipermail/mediawiki-announce/2016-August/000195.html
  • 2016-08-20 15:39:13 UTC The its-storyboard plugin has been enabled on review.openstack.org per http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-08-16-19.02.log.html#l-90
  • 2016-08-19 19:28:55 UTC nodepool.o.o added to emergency file on puppetmaster.o.o so we can remove the ubuntu-xenial label from osic-cloud1
  • 2016-08-19 11:51:08 UTC OSIC has burned through the problematic IP range with failures, things should be back to normal now.
  • 2016-08-19 11:23:21 UTC DSVM jobs on OSIC currently failing because of IP collisions, fix is in the gate - https://review.openstack.org/#/c/357764/ - please hold rechecks until merged
  • 2016-08-19 11:18:22 UTC Precise tests on OSIC provider are currently failing, please stop your checks until the issue is resolved.
  • 2016-08-18 20:08:15 UTC mirror.nyj01.internap.openstack.org replacement server now online, DNS has been updated to 74.217.28.58
  • 2016-08-17 23:04:47 UTC osic-cloud8 credentials added to hieradata
  • 2016-08-17 19:46:43 UTC The volume for logs.openstack.org filled up rather suddenly, causing a number of jobs to fail with a POST_FAILURE result and no logs; we're manually expiring some logs now to buy breathing room, but any changes which hit that in the past few minutes will need to be rechecked and/or approved again
  • 2016-08-17 16:54:30 UTC tripleo-test-cloud-rh1 credentials updated on nodepool.o.o to use the opentackzuul project
  • 2016-08-17 02:37:29 UTC DNS for wiki.openstack.org currently goes to the wiki-upgrade-test.openstack.org server, as the former suffered a compromise due to missing iptables rules
  • 2016-08-15 22:45:15 UTC mirror.ord.rax.openstack.org upgraded to performance1-4 to address network bandwidth cap.
  • 2016-08-15 20:49:59 UTC gracefully restarting all zuul-launchers
  • 2016-08-15 20:34:14 UTC Installed ansible stable-2.1 branch on zuul launchers to pick up https://github.com/ansible/ansible/commit/d35377dac78a8fcc6e8acf0ffd92f47f44d70946
  • 2016-08-13 16:16:54 UTC The Gerrit service on review.openstack.org is online again
  • 2016-08-13 12:26:24 UTC gerrit is having issues ... it is being worked on, no ETA at the moment
  • 2016-08-12 23:09:05 UTC https://wiki.openstack.org/ is now running Mediawiki 1.27.0; please let us know in #openstack-infra if anything seems wrong
  • 2016-08-12 23:03:06 UTC ok https://wiki.openstack.org/ is now running Mediawiki 1.27.0; please let us know in #openstack-infra if anything seems wrong
  • 2016-08-12 21:01:01 UTC The Mediawiki service at wiki.openstack.org will be offline from 21:00 UTC until approximately 23:00 UTC for a planned upgrade http://lists.openstack.org/pipermail/openstack-dev/2016-August/101395.html
  • 2016-08-12 20:51:18 UTC The Gerrit service on review.openstack.org is restarting for a scheduled upgrade, but should return to service momentarily: http://lists.openstack.org/pipermail/openstack-dev/2016-August/101394.html
  • 2016-08-12 18:36:06 UTC Added wiki.openstack.org to /etc/ansible/hosts/emergency on puppetmaster.openstack.org in preparation for 21:00 UTC upgrade maintenance
  • 2016-08-10 16:51:12 UTC nodepool-builder restarted on nodepool.o.o to pickup nodepool.yaml changes for bluebox-sjc1
  • 2016-08-10 05:26:14 UTC zuul is being restarted to reload configuration. Jobs should be re-enqueued but if you're missing anything (and it's not on http://status.openstack.org/zuul/) please issue a recheck in 30min.
  • 2016-08-08 08:40:29 UTC Gerrit is going to be restarted
  • 2016-08-02 23:50:13 UTC restarted zuul to clear geard function registration to fix inaccuracies with nodepool demand calculations
  • 2016-07-30 16:59:01 UTC Emergency filesystem repairs are complete; any changes which failed jobs with POST_FAILURE status or due to lack of access to tarballs can be safely rechecked now
  • 2016-07-30 14:25:39 UTC Cinder connectivity was lost to the volumes for sites served from static.openstack.org (logs, docs-draft, tarballs) and so they will remain offline until repairs are complete
  • 2016-07-30 10:00:23 UTC All jobs currently fail with POST_FAILURE
  • 2016-07-30 05:00:49 UTC zuul-launcher release ran on zl04-zl07; I've left the first 4 zuul-launchers untouched so we can debug the "too many ready node online" issue
  • 2016-07-29 16:47:09 UTC Our PyPI mirrors should be current again as of 16:10 UTC today
  • 2016-07-28 22:50:11 UTC performed full restart of elasticsearch cluster to get it indexing logs again.
  • 2016-07-27 21:26:43 UTC more carefully restarted logstash daemons again. Bigdesk reports significantly higher data transport rates indicating maybe it is happy now.
  • 2016-07-27 14:31:01 UTC auto-hold added to nodepool.o.o for gate-project-config-layout while we debug pypi mirror failures
  • 2016-07-27 13:54:13 UTC Gerrit is being restarted now to relieve performance degradation
  • 2016-07-27 04:19:26 UTC gate-tempest-dsvm-platform-fedora24 added to nodepool auto-hold to debug ansible failures
  • 2016-07-26 20:03:46 UTC restarted logstash worker and logstash indexer daemons to get logstash data flowing again.
  • 2016-07-22 15:29:01 UTC Up to one hour outage expected for static.openstack.org/main04 cinder volume on Saturday, July 30, starting at 08:00 UTC; log uploads issues will probably break all ci jobs and need filesystem remediation after the maintenance concludes
  • 2016-07-22 00:02:34 UTC gerrit/git gc change merged; gerrit and git.o.o repos should be gc'd at 04:07 UTC
  • 2016-07-21 00:00:31 UTC All file uploads are disabled on wiki.openstack.org by https://review.openstack.org/345100
  • 2016-07-20 20:07:42 UTC Wiki admins should watch https://wiki.openstack.org/w/index.php?title=Special%3AListUsers&username=&group=&creationSort=1&desc=1&limit=50 for signs of new accounts spamming (spot check linked "contribs" for them)
  • 2016-07-20 20:07:02 UTC New user account creation has been reenabled for the wiki by https://review.openstack.org/344502
  • 2016-07-19 20:20:36 UTC Puppet is reenabled on wiki.openstack.org, and is updating the page edit captcha from questy to recaptcha
  • 2016-07-16 17:34:08 UTC disabled "Microsoft Manila CI", account id 18128 because it was in a comment loop on change 294830
  • 2016-07-15 14:19:47 UTC Gerrit is restarting to correct memory/performance issues.
  • 2016-07-12 01:11:05 UTC zlstatic01.o.o back online
  • 2016-07-11 23:51:57 UTC zlstatic01 in graceful mode
  • 2016-07-08 22:26:21 UTC manually downgraded elasticsearch-curator and ran it to clean out old indexes that were making cluster very slow and unhappy
  • 2016-07-08 21:51:39 UTC restarted logstash on logstash workers with some help from kill. The daemons were not processing events leading to the crazy logstash queue graphs and refused to restart normally.
  • 2016-07-08 16:38:05 UTC ran puppet on codesearch.openstack.org and manually restarted hound
  • 2016-07-06 06:29:08 UTC All Python 3.5 jobs are failing today; we need to build new xenial images first.
  • 2016-07-05 18:15:59 UTC Job instability resulting from a block storage connectivity error on mirror.iad.rax.openstack.org has been corrected; jobs running in rax-iad should be more reliable again.
  • 2016-07-05 10:37:26 UTC we now have python35 jobs enabled
  • 2016-07-04 08:16:19 UTC setuptools 24.0.0 broke dsvm tests, we've gone back to old images, it's safe to recheck now if you had a failure related to setuptools 24.0.0 (processor_architecture) - see bug 1598525
  • 2016-07-04 00:56:10 UTC To work around the periodic group expansion issue causing puppet to run on hosts disabled in our groups.txt file in git, i have added the list of disabled hosts from it to the emergency disabled group on the puppetmaster for now
  • 2016-07-02 00:06:39 UTC Gerrit, Zuul and static.openstack.org now available following the scheduled maintenance window.
  • 2016-07-01 20:08:28 UTC Gerrit is offline for maintenance until approximately 22:00 UTC
  • 2016-07-01 19:54:58 UTC The infrastructure team is taking Gerrit offline for maintenance beginning shortly after 20:00 UTC to upgrade the Zuul and static.openstack.org servers. We aim to have it back online around 22:00 UTC.
  • 2016-06-30 16:22:04 UTC zlstatic01.o.o restarted to pick up zuul.NodeWorker.wheel-mirror-ubuntu-xenial-amd64.slave.openstack.org
  • 2016-06-29 21:30:29 UTC bindep 2.0.0 release and firefox/xvfb removal from bindep-fallback.txt should take effect in our next image update
  • 2016-06-29 18:59:30 UTC UCA AFS mirror online
  • 2016-06-29 18:29:58 UTC bindep 2.0.0 released
  • 2016-06-23 23:23:13 UTC https://github.com/Shrews/ansible-modules-core/commit/d11cb0d9a1c768735d9cb4b7acc32b971b524f13
  • 2016-06-23 23:22:23 UTC zuul launchers are all running locally patched ansible (source in ~root/ansible) to correct and/or further debug async timeout issue
  • 2016-06-22 22:09:48 UTC nodepool also supports auto-holding nodes for specific failed jobs (it will set the reason appropriately)
  • 2016-06-22 22:09:14 UTC nodepool now supports adding a reason when holding a node ("--reason <foo>"); please use it so that we can remember why nodes are held :)
  • 2016-06-21 16:07:09 UTC Gerrit is being restarted now to apply an emergency security-related configuration change
  • 2016-06-20 13:14:52 UTC OpenID logins are back to normal
  • 2016-06-20 13:01:26 UTC OpenID login from review.o.o is experiencing difficulties, possibly due to transatlantic network performance issues. Things are being investigated
  • 2016-06-20 10:40:50 UTC static.openstack.org is back up. If you have POST_FAILURE and are missing logs from your CI jobs, please leave a 'recheck'.
  • 2016-06-20 05:24:05 UTC static.openstack.org (which hosts logs.openstack.org and tarballs.openstack.org among others) is currently being rebuilt. As jobs can not upload logs they are failing with POST_FAILURE. This should be resolved soon. Please do not recheck until then.
  • 2016-06-20 03:11:54 UTC static.openstack.org (which hosts logs.openstack.org) is currently migrating due to a hardware failure. It should be back up shortly.
  • 2016-06-18 17:44:10 UTC zl01 restarted properly
  • 2016-06-18 17:21:20 UTC zl01 currently gracefully restarting via 330184
  • 2016-06-18 16:38:42 UTC Gerrit is restarting now to relieve memory pressure and restore responsiveness
  • 2016-06-17 16:34:44 UTC zuul was restarted for a software upgrade; events between 16:08 and 16:30 were missed, please recheck any changes uploaded during that time
  • 2016-06-17 01:14:35 UTC follow-up mail about zuul-related changes: http://lists.openstack.org/pipermail/openstack-dev/2016-June/097595.html
  • 2016-06-16 23:56:49 UTC all jenkins servers have been deleted
  • 2016-06-16 22:43:06 UTC Jenkins is retired: http://lists.openstack.org/pipermail/openstack-dev/2016-June/097584.html
  • 2016-06-16 20:20:36 UTC zl05 - zl07 are in production; jenkins05 - jenkins07 are in prepare-for-shutdown mode pending decommissioning
  • 2016-06-15 18:52:04 UTC jenkins07 back online. Will manually clean up used nodes moving forward
  • 2016-06-15 18:40:21 UTC jenkins03 and jenkins04 are in prepare-for-shutdown mode in preparation for decommissioning
  • 2016-06-13 19:50:30 UTC zuul has been restarted with registration checks disabled -- we should no longer see NOT_REGISTERED errors after zuul restarts.
  • 2016-06-13 16:24:44 UTC jenkins02.openstack.org has been deleted
  • 2016-06-10 22:19:31 UTC jenkins02 is in prepare-for-shutdown mode in preparation for decommissioning
  • 2016-06-10 06:31:03 UTC All translation imports have broken UTF-8 encoding.
  • 2016-06-09 20:07:08 UTC jenkins.o.o is in prepare-for-shutdown mode in preparation for decommissioning. zlstatic01.openstack.org is running and attached to its workers instead.
  • 2016-06-09 17:42:26 UTC deleted jenkins01.openstack.org
  • 2016-06-08 18:12:10 UTC Zuul has been restarted to correct an error condition. Events since 17:30 may have been missed; please 'recheck' your changes if they were uploaded since then, or have "NOT_REGISTERED" errors.
  • 2016-06-08 00:24:27 UTC nodepool.o.o restarted to pick up review 326114
  • 2016-06-07 23:25:57 UTC jenkins01 is in prepare-for-shutdown mode in preparation for decommissioning.
  • 2016-06-07 08:13:44 UTC dib gate for project-config is fixed again with https://review.openstack.org/326273 merged.
  • 2016-06-07 07:12:13 UTC All project-config jobs fail - the dib gate is broken.
  • 2016-06-06 18:09:46 UTC zl01.openstack.org in production
  • 2016-06-04 01:23:46 UTC Gerrit maintenance concluded successfully
  • 2016-06-04 00:08:07 UTC Gerrit is offline for maintenance until 01:45 UTC (new ETA)
  • 2016-06-03 20:12:32 UTC Gerrit is offline for maintenance until 00:00 UTC
  • 2016-06-03 20:00:59 UTC The infrastructure team is taking Gerrit offline for maintenance this afternoon, beginning shortly after 20:00 UTC. We aim to have it back online around 00:00 UTC.
  • 2016-06-03 14:02:43 UTC Cleanup from earlier block storage disruption on static.openstack.org has been repaired, and any jobs which reported an "UNSTABLE" result or linked to missing logs between 08:00-14:00 UTC can be retriggered by leaving a "recheck" comment.
  • 2016-06-03 11:44:18 UTC CI is experiencing issues with test logs, all jobs are currently UNSTABLE as a result. No need to recheck until this is fixed! Thanks for your patience.
  • 2016-06-03 10:11:14 UTC CI is experiencing issues with test logs, all jobs are currently UNSTABLE as a result. No need to recheck until this is fixed! Thanks for your patience.
  • 2016-06-03 09:38:30 UTC CI is experiencing issues with test logs, all jobs are currently UNSTABLE as a result. No need to recheck until this is fixed! Thanks for your patience.
  • 2016-06-02 01:09:39 UTC nodepool.o.o restarted to fix jenkins01.o.o (wasn't launching jobs)
  • 2016-06-01 23:08:46 UTC zl01.openstack.org is back in production handling a portion of the job load
  • 2016-05-30 14:18:17 UTC openstack-meetbot back online, there was an issue with DNS.
  • 2016-05-30 13:13:53 UTC Statusbot has been restarted (no activity since 27/05)
  • 2016-05-27 23:00:57 UTC eavesdrop.o.o upgraded to ubuntu-trusty and online!
  • 2016-05-27 22:23:22 UTC statusbot back online
  • 2016-05-27 19:33:52 UTC elasticsearch07.o.o upgraded to ubuntu-trusty and cluster is green
  • 2016-05-27 18:59:58 UTC logstash.openstack.org upgraded to ubuntu trusty
  • 2016-05-27 18:51:40 UTC elasticsearch06.o.o upgraded to ubuntu-trusty and cluster is green
  • 2016-05-27 18:06:09 UTC jenkins06.o.o back online
  • 2016-05-27 17:58:01 UTC jenkins05.o.o back online
  • 2016-05-27 17:47:38 UTC elasticsearch05.o.o upgraded to ubuntu-trusty and cluster is green
  • 2016-05-27 17:24:17 UTC elasticsearch04.o.o upgraded to ubuntu-trusty and cluster is green
  • 2016-05-27 16:43:54 UTC elasticsearch03.o.o upgraded to ubuntu-trusty and cluster is green
  • 2016-05-27 16:20:16 UTC elasticsearch02.o.o upgraded to ubuntu-trusty and cluster is green
  • 2016-05-27 13:32:30 UTC nodepoold restarted to address zmq issue with jenkins02 and jenkins06
  • 2016-05-27 07:15:08 UTC zuul required a restart due to network outages. If your change is not listed on http://status.openstack.org/zuul/ and is missing results, please issue a 'recheck'.
  • 2016-05-27 03:23:11 UTC after a quick check, gerrit and its filesystem have been brought back online and should be working again
  • 2016-05-27 03:03:41 UTC Gerrit is going offline briefly to check possible filesystem corruption
  • 2016-05-27 00:48:13 UTC logstash-worker20.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-27 00:32:59 UTC logstash-worker19.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-27 00:18:55 UTC logstash-worker18.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-27 00:10:33 UTC puppetmaster.o.o removed from the emergency file since OSIC is now back online
  • 2016-05-27 00:01:23 UTC logstash-worker17.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 23:29:01 UTC logstash-worker16.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 23:01:18 UTC logstash-worker15.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 22:33:27 UTC logstash-worker14.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 22:17:25 UTC zl01 removed from production
  • 2016-05-26 22:12:26 UTC logstash-worker13.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 21:59:00 UTC paste.openstack.org now running ubuntu-trusty and successfully responding to requests
  • 2016-05-26 21:43:25 UTC zuul launcher zl01.openstack.org is in production (handling load in parallel with jenkins)
  • 2016-05-26 21:05:10 UTC logstash-worker12.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 20:57:15 UTC puppet disabled on puppetmaster (for the puppetmaster host itself -- not globally) and OSIC manually removed from clouds.yaml because OSIC is down, which is causing the ansible openstack inventory to fail
  • 2016-05-26 20:21:28 UTC osic appears down at the moment. Following up with #osic for information
  • 2016-05-26 19:45:23 UTC logstash-worker11.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 18:36:29 UTC logstash-worker10.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 18:23:09 UTC logstash-worker09.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 18:11:40 UTC logstash-worker08.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 18:00:29 UTC logstash-worker07.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 17:47:15 UTC logstash-worker06.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 17:00:59 UTC logstash-worker05.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 16:26:21 UTC logstash-worker04.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 16:11:10 UTC logstash-worker03.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 15:50:12 UTC logstash-worker02.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-26 15:28:49 UTC logstash-worker01.openstack.org now running ubuntu-trusty and processing requests
  • 2016-05-25 21:05:16 UTC zuul has been restarted with a change that records and reports estimated job durations internally. job times will be under-estimated until zuul builds up its internal database
  • 2016-05-25 20:35:42 UTC status.o.o has been upgraded to ubuntu trusty
  • 2016-05-25 18:42:28 UTC storyboard.o.o has been upgraded to ubuntu trusty
  • 2016-05-25 18:42:06 UTC graphite.o.o has been upgraded to ubuntu trusty
  • 2016-05-24 22:28:54 UTC graphite.o.o is currently down, we have an open ticket with RAX regarding the detaching of cinder volumes. 160524-dfw-0003689
  • 2016-05-24 20:23:55 UTC zuul-dev.openstack.org now running on ubuntu-trusty
  • 2016-05-24 19:34:00 UTC zm08.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 19:19:27 UTC zm07.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 19:09:49 UTC zm06.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 18:53:11 UTC zm05.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 18:30:51 UTC zm04.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 18:12:01 UTC zm03.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 17:52:34 UTC zm02.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 17:32:53 UTC zm01.openstack.org now running on ubuntu-trusty and processing gearman requests
  • 2016-05-24 13:21:46 UTC nodepoold restarted to pick up new version of shade / clean-floating-ips
  • 2016-05-23 17:46:37 UTC changed cacti.openstack.org IP address (for upgrade to trusty); gap in data around this time while iptables updates everywhere to allow snmp
  • 2016-05-20 13:40:03 UTC I've stopped jenkins01.o.o, it doesn't appear to be working properly. Nodes attach to jenkins but are not launched by nodepool. I believe zl01 might be the issue
  • 2016-05-18 20:12:03 UTC ran restart_jenkins_masters.yaml on jenkins02.o.o
  • 2016-05-18 01:47:59 UTC Gerrit is about to be restarted to help with page timeouts
  • 2016-05-18 01:28:06 UTC ovh-bhs1 has been down for the better part of the last 12 hours. See http://paste.openstack.org/show/497434/ for info about the exception
  • 2016-05-18 00:55:21 UTC nodepool restarted to pickup clean-floating-ips patch
  • 2016-05-13 09:04:38 UTC tripleo-f22 nodes slowly coming online now in nodepool
  • 2016-05-13 08:32:35 UTC tripleo-test-cloud-rh1 added back to nodepool.o.o, however we are currently having issues launching tripleo-f22 nodes. The TripleO CI team should be looking into it
  • 2016-05-13 07:03:46 UTC Removed nodepool.o.o from emergency file on puppetmaster.o.o
  • 2016-05-11 21:56:35 UTC nodepool restarted to pickup https://review.openstack.org/#/c/294339/
  • 2016-05-11 18:47:20 UTC npm mirror sync finished; lock is released
  • 2016-05-11 16:16:27 UTC all afs mirror volumes have been moved to afs01.dfw and afs02.dfw (so they are no longer in ord) to speed up vos release times. all are in regular service using read-only replicas except for npm.
  • 2016-05-11 12:00:27 UTC We have a workaround for our mirrors to attempt to translate package names if a match isn't immediately obvious. A more complete fix is yet to come. It is now safe to 'recheck' any jobs that failed due to "No matching distribution found". Please join #openstack-infra if you discover more problems.
  • 2016-05-11 07:08:56 UTC pip 8.1.2 broke our local python mirror, some jobs will fail with "No matching distribution found". We're investigating. Do not "recheck" until the issue is solved
  • 2016-05-10 17:11:59 UTC created afs02.dfw.openstack.org fileserver
  • 2016-05-10 16:14:42 UTC afs update: the vos release -force completed in just under 59 hours, so i followed up with a normal vos release (no -force) thereafter to make sure it will complete without error now. it's been running for ~4.5 hours so far
  • 2016-05-09 12:54:07 UTC released bandersnatch lock on mirror-update.o.o to resume bandersnatch updates
  • 2016-05-07 23:21:13 UTC vos release of mirror.pypi is running with -force this time, under the usual root screen session on afs01.dfw.openstack.org
  • 2016-05-06 23:57:19 UTC the Review-MySQL trove instance has now been expanded to 50gb (19% full) and /home/gerrit2 on review.openstack.org increased to 200gb (47% full)
  • 2016-05-06 19:06:56 UTC opened support ticket 160506-iad-0001201 for Review-MySQL trove instance taking >3 hours (so far) to resize its backing volume
  • 2016-05-06 16:56:58 UTC osic-cloud1 is coming back online. Thanks for the help #osic
  • 2016-05-06 16:46:35 UTC osic-cloud1 is down at the moment, #osic is looking into the issue. Will update shortly.
  • 2016-05-06 16:02:59 UTC OSIC leaked 21 FIPs; they have been deleted manually.
  • 2016-05-06 15:43:14 UTC the current 100gb /home/gerrit2 on review.openstack.org is 95% full, so i've added a new 200gb ssd volume to review.o.o as a replacement for the current 100gb ssd volume. once i'm comfortable that things are still stable after the trove volume resize, i'll pvmove the extents from the old cinder volume to the new one and then extend the lv/fs to 200gb
  • 2016-05-06 15:42:37 UTC the trove instance for review.openstack.org was 10gb and 90% full, so i'm upping it to 50gb (which is supposed to be a non-impacting online operation)
  • 2016-05-06 14:47:31 UTC Zuul has been restarted. As a result, only patches in the gate queue were preserved. Be sure to recheck your patches in Gerrit if needed.
  • 2016-05-06 14:17:04 UTC Zuul is currently recovering from a large number of changes, it will take a few hours until your job is processed. Please have patience and enjoy a great weekend!
  • 2016-05-05 20:30:54 UTC Gerrit is restarting to revert incorrect changes to test result displays
  • 2016-05-05 19:22:43 UTC Gerrit is restarting to address performance issues related to a suspected memory leak
  • 2016-05-03 20:38:56 UTC through some careful scripting (which involved apache reconfiguration to stop holding an open file lock) i offlined the tarballs volume on static.openstack.org to repair its filesystem so it could be remounted read-write
  • 2016-05-03 20:28:58 UTC restarting apache on review.openstack.org to pick up security patches. Gerrit web ui may disappear for a short time.
  • 2016-05-03 09:24:59 UTC Docs-draft filesystem has been restored. Please check your affected jobs again
  • 2016-05-03 08:36:36 UTC Filesystem on docs-draft.openstack.org is broken; we are in the process of repairing it. Please hold off on rechecking jobs that use this filesystem until further notice
  • 2016-05-03 08:27:24 UTC Logs filesystem has been successfully restored, please recheck your jobs
  • 2016-05-03 06:47:23 UTC Filesystem on logs.openstack.org is broken; we are in the process of repairing it. Please hold off on rechecking your jobs until further notice
  • 2016-05-03 00:37:42 UTC gerrit configuration update blocked on failing beaker tests due to missing bouncycastle releases; job being made nonvoting in https://review.openstack.org/311898
  • 2016-05-02 23:47:45 UTC due to an error in https://review.openstack.org/295530 which will be corrected in https://review.openstack.org/311888 gerrit should not be restarted until the second change lands
  • 2016-05-02 21:51:56 UTC manual vos release of pypi mirror started in screen on fileserver; see https://etherpad.openstack.org/p/fix-afs
  • 2016-05-02 15:19:44 UTC steps to fix the pypi mirror problem in progress: https://etherpad.openstack.org/p/fix-afs
  • 2016-05-02 06:53:53 UTC AFS mirrors not publishing; they have been stuck on vos release since 29 April
  • 2016-04-22 15:03:19 UTC Log server was repaired as of 10:50 UTC and jobs have been stable since. If necessary, please recheck changes that have 'UNSTABLE' results.
  • 2016-04-22 10:54:56 UTC Log server has been repaired and jobs are stable again. If necessary please recheck changes that have 'UNSTABLE' results.
  • 2016-04-22 07:32:05 UTC Logs are failing to be uploaded causing jobs to be marked as UNSTABLE. We are working on repairing the log filesystem and will update when ready. Please do not recheck before then.
  • 2016-04-21 12:49:48 UTC OVH provider is enabled again, please wait for the job queue to be processed
  • 2016-04-21 10:38:33 UTC OVH servers are down; we are working to resolve it. The job queue will be processed slowly as a result, please be patient.
  • 2016-04-19 13:41:32 UTC We have recovered one of our cloud providers, but there is a huge backlog of jobs to process. Please have patience until your jobs are processed
  • 2016-04-15 09:51:47 UTC Zuul and gerrit are working normally now. Please recheck any jobs that may have been affected by this failure.
  • 2016-04-15 09:23:40 UTC No jobs are being processed by Gerrit and Zuul. We are working to solve the problem; please be aware that no changes have been sent to the queue in the last hour, so you will need to recheck jobs from that period.
  • 2016-04-15 09:06:29 UTC Gerrit is going to be restarted because it is not processing new changes
  • 2016-04-11 21:08:40 UTC Gerrit move maintenance completed successfully; note that DNS has been updated to new IP addresses as indicated in http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-11 20:08:57 UTC Gerrit is offline until 21:00 UTC for a server replacement http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-11 19:51:50 UTC Gerrit will be offline from 20:00 to 21:00 UTC (starting 10 minutes from now) for a server replacement http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-11 16:20:17 UTC Reminder, Gerrit will be offline from 20:00 to 21:00 UTC for a server replacement http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-07 08:36:04 UTC jobs depending on npm are now working again
  • 2016-04-06 10:20:39 UTC npm lint jobs are failing due to a problem with npm registry. The problem is under investigation, and we will update once the issue is solved.
  • 2016-04-05 20:01:57 UTC ubuntu xenial mirrors now online.
  • 2016-04-05 14:51:52 UTC dns for openstackid.org has been changed from 2001:4800:7817:102:be76:4eff:fe05:d9cd and 23.253.97.70 (openstackid 1.0.17 on ubuntu precise) to 2001:4800:7815:101:be76:4eff:fe04:7741 and 23.253.243.97 (openstackid 1.0.18 on ubuntu trusty). record ttls remain 300s for now
  • 2016-04-05 13:04:10 UTC jenkins06.o.o back online, appears to have run out of RAM
  • 2016-04-04 07:15:37 UTC Gerrit is going to be restarted due to bad performance
  • 2016-03-31 19:56:01 UTC Any jobs which erroneously failed on missing traceroute packages should be safe to recheck now
  • 2016-03-31 17:49:51 UTC Job failures for missing traceroute packages are in the process of being fixed now, ETA 30 minutes to effectiveness for new jobs
  • 2016-03-30 11:15:35 UTC Gate on project-config is currently broken due to IRC tests. The problem has been detected and we are working to fix the issue as soon as possible.
  • 2016-03-28 15:22:43 UTC Gerrit is restarting on review.openstack.org in an attempt to address an issue reading an object from the ec2-api repository
  • 2016-03-24 17:08:05 UTC restarted gerrit to address GC issue
  • 2016-03-21 14:59:32 UTC Rackspace has opened support tickets warning of disruptive maintenance March 22 05:00-07:00 UTC, March 24 03:00 to 07:00 UTC, and March 25 02:00 to 06:00 UTC which could impact network connectivity including disconnecting from Trove databases and Cinder block devices
  • 2016-03-19 22:25:25 UTC Gerrit is restarting to address performance issues
  • 2016-03-15 15:33:38 UTC Launchpad SSO is back to normal - happy hacking
  • 2016-03-15 15:00:29 UTC Launchpad OpenID SSO is currently experiencing issues preventing login. The Launchpad team is working on the issue
  • 2016-03-15 11:37:22 UTC Gerrit had to be restarted because it was not responsive. As a consequence, some test results have been lost, from approximately 09:30 UTC to 11:30 UTC. Please recheck any jobs affected by this problem.
  • 2016-03-15 11:34:39 UTC Gerrit had to be restarted because it was not responsive. As a consequence, some test results have been lost, from approximately 08:30 UTC to 10:30 UTC. Please recheck any jobs affected by this problem.
  • 2016-03-15 11:15:09 UTC Gerrit is going to be restarted
  • 2016-03-11 11:01:42 UTC Gerrit has been restarted successfully
  • 2016-03-11 10:56:07 UTC Gerrit is going to be restarted due to bad performance
  • 2016-03-07 07:25:45 UTC gerrit is going to be restarted due to bad performance
  • 2016-03-04 11:25:20 UTC testing status bot
  • 2016-03-01 10:45:18 UTC gerrit finished restarting
  • 2016-03-01 10:39:09 UTC Gerrit is going to be restarted due to poor performance
  • 2016-02-29 12:07:53 UTC Infra currently has a long backlog. Please be patient and where possible avoid rechecks while it catches up.
  • 2016-02-19 08:35:19 UTC Gerrit is going to be restarted due to performance problems
  • 2016-02-17 06:50:39 UTC A problem with the mirror used for CI jobs in the rax-iad region has been corrected. Please recheck changes that recently failed jobs on nodes in rax-iad.
  • 2016-02-13 17:42:02 UTC Gerrit is back up
  • 2016-02-13 15:11:57 UTC Gerrit is offline for filesystem repair
  • 2016-02-13 00:23:22 UTC Gerrit is offline for maintenance, ETA updated to 01:00 utc
  • 2016-02-12 23:43:30 UTC Gerrit is offline for maintenance, ETA updated to 23:59 utc
  • 2016-02-12 23:08:44 UTC Gerrit is offline for maintenance, ETA updated to 23:30 utc
  • 2016-02-12 22:07:37 UTC Gerrit is offline for maintenance until 23:00 utc
  • 2016-02-12 21:47:47 UTC The infrastructure team is taking gerrit offline for maintenance this afternoon, beginning at 22:00 utc. We should have it back online around 23:00 utc. http://lists.openstack.org/pipermail/openstack-dev/2016-February/086195.html
  • 2016-02-09 17:25:39 UTC Gerrit is restarting now, to alleviate current performance impact and WebUI errors.
  • 2016-02-03 12:41:39 UTC Infra running with lower capacity now, due to a temporary problem affecting one of our nodepool providers. Please expect some delays in your jobs. Apologies for any inconvenience caused.
  • 2016-01-30 09:23:17 UTC Testing status command
  • 2016-01-22 17:52:01 UTC Restarting zuul due to a memory leak
  • 2016-01-20 11:56:15 UTC Restart done, review.openstack.org is available
  • 2016-01-20 11:45:12 UTC review.openstack.org is being restarted to apply patches
  • 2016-01-18 16:50:38 UTC Gerrit is restarting quickly as a workaround for performance degradation
  • 2016-01-11 22:06:57 UTC Gerrit is restarting to resolve java memory issues
  • 2015-12-17 16:43:53 UTC Zuul is moving in very slow motion since roughly 13:30 UTC; the Infra team is investigating.
  • 2015-12-16 21:02:59 UTC Gerrit has been upgraded to 2.11. Please report any issues in #openstack-infra as soon as possible.
  • 2015-12-16 17:07:00 UTC Gerrit is offline for a software upgrade from 17:00 to 21:00 UTC. See: http://lists.openstack.org/pipermail/openstack-dev/2015-December/081037.html
  • 2015-12-16 16:21:49 UTC Gerrit will be offline for a software upgrade from 17:00 to 21:00 UTC. See: http://lists.openstack.org/pipermail/openstack-dev/2015-December/081037.html
  • 2015-12-04 16:55:08 UTC The earlier JJB bug which disrupted tox-based job configurations has been reverted and applied; jobs seem to be running successfully for the past two hours.
  • 2015-12-04 09:32:24 UTC Tox tests are broken at the moment. The openstack-infra team is working to fix them. Please don't approve changes until we announce that tox tests work again.
  • 2015-11-06 20:04:47 UTC Gerrit is offline until 20:15 UTC today for scheduled project rename maintenance
  • 2015-11-06 19:41:20 UTC Gerrit will be offline at 20:00-20:15 UTC today (starting 20 minutes from now) for scheduled project rename maintenance
  • 2015-10-27 06:32:40 UTC CI will be disrupted for an indeterminate period while our service provider reboots systems for a security fix
  • 2015-10-17 18:40:01 UTC Gerrit is back online. Github transfers are in progress and should be complete by 1900 UTC.
  • 2015-10-17 18:03:25 UTC Gerrit is offline for project renames.
  • 2015-10-17 17:11:10 UTC Gerrit will be offline for project renames starting at 1800 UTC.
  • 2015-10-13 11:19:47 UTC Gerrit has been restarted and is responding to normal load again.
  • 2015-10-13 09:44:48 UTC gerrit is undergoing an emergency restart to investigate load issues
  • 2015-10-05 14:03:13 UTC Gerrit was restarted to temporarily address performance problems
  • 2015-09-17 10:16:42 UTC Gate back to normal, thanks to the blacklisting of the problematic version
  • 2015-09-17 08:02:50 UTC Gate is currently stuck, failing grenade upgrade tests due to the release of oslo.utils 1.4.1 for Juno.
  • 2015-09-11 23:04:39 UTC Gerrit is offline from 23:00 to 23:30 UTC while some projects are renamed. http://lists.openstack.org/pipermail/openstack-dev/2015-September/074235.html
  • 2015-09-11 22:32:57 UTC 30 minute warning, Gerrit will be offline from 23:00 to 23:30 UTC while some projects are renamed http://lists.openstack.org/pipermail/openstack-dev/2015-September/074235.html
  • 2015-08-31 20:27:18 UTC puppet agent temporarily disabled on nodepool.openstack.org to avoid accidental upgrade to python-glanceclient 1.0.0
  • 2015-08-26 15:45:47 UTC restarting gerrit due to a slow memory leak
  • 2015-08-17 10:50:24 UTC Gerrit restart has resolved the issue and systems are back up and functioning
  • 2015-08-17 10:23:42 UTC review.openstack.org (aka gerrit) is going down for an emergency restart
  • 2015-08-17 07:07:38 UTC Gerrit is currently under very high load and may be unresponsive. infra are looking into the issue.
  • 2015-08-12 00:06:30 UTC Zuul was restarted due to an error; events (such as approvals or new patchsets) since 23:01 UTC have been lost and affected changes will need to be rechecked
  • 2015-08-05 21:11:30 UTC Correction: change events between 20:50-20:54 UTC (during the restart only) have been lost and will need to be rechecked or their approvals reapplied to trigger testing.
  • 2015-08-05 21:06:19 UTC Zuul has been restarted to resolve a reconfiguration failure: previously running jobs have been reenqueued but change events between 19:50-20:54 UTC have been lost and will need to be rechecked or their approvals reapplied to trigger testing.
  • 2015-08-03 13:41:37 UTC The Gerrit service on review.openstack.org has been restarted in an attempt to improve performance.
  • 2015-07-30 09:01:49 UTC CI is back online but has a huge backlog. Please be patient and if possible delay approving changes until it has caught up.
  • 2015-07-30 07:52:49 UTC CI system is broken and very far behind. Please do not approve any changes for a while.
  • 2015-07-30 07:43:12 UTC Our CI system is broken again today, jobs are not getting processed at all.
  • 2015-07-29 13:27:42 UTC zuul jobs after about 07:00 UTC may need a 'recheck' to enter the queue. Check whether your change appears on http://status.openstack.org/zuul/ and recheck if not.
  • 2015-07-29 12:52:20 UTC zuul's disks were at capacity. Space has been freed up and jobs are being re-queued.
  • 2015-07-29 09:30:59 UTC Currently our CI system is broken, jobs are not getting processed at all.
  • 2015-07-28 08:04:50 UTC zuul has been restarted and queues restored. It may take some time to work through the backlog.
  • 2015-07-28 06:48:20 UTC zuul is stuck and about to undergo an emergency restart, please be patient as job results may take a long time
  • 2015-07-22 14:35:43 UTC CI is slowly recovering, please be patient while the backlog is worked through.
  • 2015-07-22 14:17:30 UTC CI is currently recovering from an outage overnight. It is safe to recheck results with NOT_REGISTERED errors. It may take some time for zuul to work through the backlog.
  • 2015-07-22 08:16:50 UTC zuul jobs are currently stuck while problems with gearman are debugged
  • 2015-07-22 07:24:43 UTC zuul is undergoing an emergency restart. Jobs will be re-queued but some events may be lost.
  • 2015-07-10 22:00:47 UTC Gerrit is unavailable from approximately 22:00 to 22:30 UTC for project renames
  • 2015-07-10 21:04:01 UTC Gerrit will be unavailable from 22:00 to 22:30 UTC for project renames
  • 2015-07-03 19:33:46 UTC etherpad.openstack.org is still offline for scheduled database maintenance, ETA 19:45 UTC
  • 2015-07-03 19:05:45 UTC etherpad.openstack.org is offline for scheduled database maintenance, ETA 19:30 UTC
  • 2015-06-30 14:56:00 UTC The log volume was repaired and brought back online at 14:00 UTC. Log links today from before that time may be missing, and changes should be rechecked if fresh job logs are desired for them.
  • 2015-06-30 08:50:29 UTC OpenStack CI is down due to hard drive failures
  • 2015-06-12 22:45:07 UTC Gerrit is back online. Zuul reconfiguration for renamed projects is still in progress, ETA 23:30.
  • 2015-06-12 22:10:50 UTC Gerrit is offline for project renames. ETA 22:40
  • 2015-06-12 22:06:20 UTC Gerrit is offline for project renames. ETA 22:30
  • 2015-06-12 21:45:26 UTC Gerrit will be offline for project renames between 22:00 and 22:30 UTC
  • 2015-06-11 21:08:10 UTC Gerrit has been restarted to terminate a persistent looping third-party CI bot
  • 2015-06-04 18:43:17 UTC Gerrit has been restarted to clear an issue with its event stream. Any change events between 17:25 and 18:38 UTC should be rechecked or have their approvals reapplied to initiate testing.
  • 2015-05-13 23:00:05 UTC Gerrit and Zuul are back online.
  • 2015-05-13 22:42:09 UTC Gerrit and Zuul are going offline for reboots to fix a security vulnerability.
  • 2015-05-12 00:58:04 UTC Gerrit has been downgraded to version 2.8 due to the issues observed today. Please report further problems in #openstack-infra.
  • 2015-05-11 23:56:14 UTC Gerrit is going offline while we perform an emergency downgrade to version 2.8.
  • 2015-05-11 17:40:47 UTC We have discovered post-upgrade issues with Gerrit affecting nova (and potentially other projects). Some changes will not appear and some actions, such as queries, may return an error. We are continuing to investigate.
  • 2015-05-09 18:32:43 UTC Gerrit upgrade completed; please report problems in #openstack-infra
  • 2015-05-09 16:03:24 UTC Gerrit is offline from 16:00-20:00 UTC to upgrade to version 2.10.
  • 2015-05-09 15:18:16 UTC Gerrit will be offline from 1600-2000 UTC while it is upgraded to version 2.10
  • 2015-05-06 00:43:52 UTC Restarted gerrit due to stuck stream-events connections. Events since 23:49 were missed and changes uploaded since then will need to be rechecked.
  • 2015-05-05 17:05:25 UTC zuul has been restarted to troubleshoot an issue, gerrit events between 15:00-17:00 utc were lost and changes updated or approved during that time will need to be rechecked or have their approval votes readded to trigger testing
  • 2015-04-29 14:06:55 UTC gerrit has been restarted to clear a stuck events queue. any change events between 13:29-14:05 utc should be rechecked or have their approval votes reapplied to trigger jobs
  • 2015-04-28 15:38:04 UTC gerrit has been restarted to clear an issue with its event stream. any change events between 14:43-15:30 utc should be rechecked or have their approval votes reapplied to trigger jobs
  • 2015-04-28 12:43:46 UTC Gate is experiencing epic failures due to issues with mirrors, work is underway to mitigate and return to normal levels of sanity
  • 2015-04-27 13:48:14 UTC gerrit has been restarted to clear a problem with its event stream. change events between 13:09 and 13:36 utc should be rechecked or have approval votes reapplied as needed to trigger jobs
  • 2015-04-27 08:11:05 UTC Restarting gerrit because it stopped sending events (ETA 15 mins)
  • 2015-04-22 17:33:33 UTC gerrit is restarting to clear hung stream-events tasks. any review events between 16:48 and 17:32 utc will need to be rechecked or have their approval votes reapplied to trigger testing in zuul
  • 2015-04-18 15:11:25 UTC Gerrit is offline for emergency maintenance, ETA 15:30 UTC to completion
  • 2015-04-18 14:32:11 UTC Gerrit will be offline between 15:00-15:30 UTC today for emergency maintenance (starting half an hour from now)
  • 2015-04-18 14:02:07 UTC Gerrit will be offline between 15:00-15:30 UTC today for emergency maintenance (starting an hour from now)
  • 2015-04-18 02:29:15 UTC gerrit is undergoing a quick-ish restart to implement a debugging patch. should be back up in ~10 minutes. apologies for any inconvenience
  • 2015-04-17 23:07:06 UTC Gerrit is available again.
  • 2015-04-17 22:09:51 UTC Gerrit is unavailable until 23:59 UTC for project renames and a database update.
  • 2015-04-17 22:05:40 UTC Gerrit is unavailable until 23:59 UTC for project renames and a database update.
  • 2015-04-17 21:05:41 UTC Gerrit will be unavailable between 22:00 and 23:59 UTC for project renames and a database update.
  • 2015-04-16 19:48:11 UTC gerrit has been restarted to clear a problem with its event stream. any gerrit changes updated or approved between 19:14 and 19:46 utc will need to be rechecked or have their approval reapplied for zuul to pick them up
  • 2015-04-15 18:27:55 UTC Gerrit has been restarted. New patches, approvals, and rechecks between 17:30 and 18:20 UTC may have been missed by Zuul and will need rechecks or new approvals added.
  • 2015-04-15 18:05:15 UTC Gerrit has stopped emitting events so Zuul is not alerted to changes. We will restart Gerrit shortly to correct the problem.
  • 2015-04-10 15:45:54 UTC gerrit has been restarted to address a hung event stream. change events between 15:00 and 15:43 utc which were lost will need to be rechecked or have approval workflow votes reapplied for zuul to act on them
  • 2015-04-06 11:40:08 UTC gerrit has been restarted to restore event streaming. any change events missed by zuul (between 10:56 and 11:37 utc) will need to be rechecked or have new approval votes set
  • 2015-04-01 13:29:44 UTC gerrit has been restarted to restore event streaming. any change events missed by zuul (between 12:48 and 13:28 utc) will need to be rechecked or have new approval votes set
  • 2015-03-31 11:51:33 UTC Check/Gate unstuck, feel free to recheck your abusively-failed changes.
  • 2015-03-31 08:55:59 UTC CI Check/Gate pipelines currently stuck due to a bad dependency creeping in the system. No need to recheck your patches at the moment.
  • 2015-03-27 22:06:32 UTC Gerrit is offline for maintenance, ETA 22:30 UTC http://lists.openstack.org/pipermail/openstack-dev/2015-March/059948.html
  • 2015-03-27 21:02:04 UTC Gerrit maintenance commences in 1 hour at 22:00 UTC http://lists.openstack.org/pipermail/openstack-dev/2015-March/059948.html
  • 2015-03-26 13:13:33 UTC gerrit stopped emitting stream events around 11:30 utc and has now been restarted. please recheck any changes currently missing results from jenkins
  • 2015-03-21 16:07:02 UTC Gerrit is back online
  • 2015-03-21 15:08:01 UTC Gerrit is offline for scheduled maintenance || http://lists.openstack.org/pipermail/openstack-infra/2015-March/002540.html
  • 2015-03-21 14:54:23 UTC Gerrit will be offline starting at 1500 UTC for scheduled maintenance
  • 2015-03-04 17:17:49 UTC Issue solved, gate slowly digesting accumulated changes
  • 2015-03-04 08:32:42 UTC Zuul check queue stuck due to reboot maintenance window at one of our cloud providers - no need to recheck changes at the moment, they won't move forward.
  • 2015-01-30 19:32:23 UTC Gerrit is back online
  • 2015-01-30 19:10:04 UTC Gerrit and Zuul are offline until 1930 UTC for project renames
  • 2015-01-30 18:43:57 UTC Gerrit and Zuul will be offline from 1900 to 1930 UTC for project renames
  • 2015-01-30 16:15:03 UTC zuul is running again and changes have been re-enqueued. See http://status.openstack.org/zuul/ before rechecking if in doubt
  • 2015-01-30 14:26:56 UTC zuul isn't running jobs since ~10:30 utc, investigation underway
  • 2015-01-27 17:54:45 UTC Gerrit and Zuul will be offline for a few minutes for a security update
  • 2015-01-20 19:54:47 UTC Gerrit restarted to address likely memory leak leading to server slowness. Sorry if you were caught in the restart
  • 2015-01-09 18:59:29 UTC paste.openstack.org is going offline for a database migration (duration: ~2 minutes)
  • 2014-12-06 16:06:03 UTC gerrit will be offline for 30 minutes while we rename a few projects. eta 16:30 utc
  • 2014-12-06 15:21:31 UTC [reminder] gerrit will be offline for 30 minutes starting at 16:00 utc for project renames
  • 2014-11-22 00:33:53 UTC Gating and log storage offline due to block device error. Recovery in progress, ETA unknown.
  • 2014-11-21 21:46:58 UTC gating is going offline while we deal with a broken block device, eta unknown
  • 2014-10-29 20:58:17 UTC Restarting gerrit to get fixed CI javascript
  • 2014-10-20 21:22:38 UTC Zuul erroneously marked some changes as having merge conflicts. Those changes have been added to the check queue to be rechecked and will be automatically updated when complete.
  • 2014-10-17 21:27:06 UTC Gerrit is back online
  • 2014-10-17 21:04:39 UTC Gerrit is offline from 2100-2130 for project renames
  • 2014-10-17 20:35:12 UTC Gerrit will be offline from 2100-2130 for project renames
  • 2014-10-17 17:04:01 UTC upgraded wiki.openstack.org from Mediawiki 1.24wmf19 to 1.25wmf4 per http://ci.openstack.org/wiki.html
  • 2014-10-16 16:20:43 UTC An error in a configuration change to mitigate the poodle vulnerability caused a brief outage of git.openstack.org from 16:06-16:12. The problem has been corrected and git.openstack.org is working again.
  • 2014-09-24 21:59:06 UTC The openstack-infra/config repo will be frozen for project-configuration changes starting at 00:01 UTC. If you have a pending configuration change that has not merged or is not in the queue, please see us in #openstack-infra.
  • 2014-09-24 13:43:48 UTC removed 79 disassociated floating ips in hpcloud
  • 2014-09-22 15:52:51 UTC removed 431 disassociated floating ips in hpcloud
  • 2014-09-22 15:52:23 UTC killed bandersnatch process on pypi.region-b.geo-1.openstack.org, hung since 2014-09-18 22:45 due to https://bitbucket.org/pypa/bandersnatch/issue/52
  • 2014-09-22 15:51:21 UTC restarted gerritbot to get it to rejoin channels
  • 2014-09-19 20:53:18 UTC Gerrit is back online
  • 2014-09-19 20:17:08 UTC Gerrit will be offline from 20:30 to 20:50 UTC for project renames
  • 2014-09-16 13:38:01 UTC jenkins ran out of jvm memory on jenkins06 at 01:42:20 http://paste.openstack.org/show/112155/
  • 2014-09-14 18:13:14 UTC all our pypi mirrors failed to update urllib3 properly, full mirror refresh underway now to correct, eta 20:00 utc
  • 2014-09-13 15:10:34 UTC shutting down all irc bots now to change their passwords (per the wallops a few minutes ago, everyone should do the same)
  • 2014-09-13 14:54:19 UTC rebooted puppetmaster.openstack.org due to out-of-memory condition
  • 2014-08-30 16:08:43 UTC Gerrit is offline for project renaming maintenance, ETA 1630
  • 2014-08-25 17:12:51 UTC restarted gerritbot
  • 2014-08-16 16:30:38 UTC Gerrit is offline for project renames. ETA 1645.
  • 2014-07-26 18:28:21 UTC Zuul has been restarted to move it beyond a change it was failing to report on
  • 2014-07-23 22:08:12 UTC zuul is working through a backlog of jobs due to an earlier problem with nodepool
  • 2014-07-23 20:42:47 UTC nodepool is unable to build test nodes so check and gate tests are delayed
  • 2014-07-15 18:23:58 UTC python2.6 jobs are failing due to bug 1342262 "virtualenv>=1.9.1 not found" A fix is out but there are still nodes built on the old stale images
  • 2014-06-28 14:40:16 UTC Gerrit will be offline from 1500-1515 UTC for project renames
  • 2014-06-15 15:30:13 UTC Launchpad is OK - statusbot lost the old channel statuses. They will need to be manually restored
  • 2014-06-15 02:32:57 UTC launchpad openid is down. login to openstack services will fail until launchpad openid is happy again
  • 2014-06-02 14:17:51 UTC setuptools issue was fixed upstream in 3.7.1 and 4.0.1; please recheck on bug 1325514
  • 2014-06-02 08:33:19 UTC setuptools upstream has broken the world. it's a known issue. we're hoping that a solution materializes soon
  • 2014-05-29 20:41:04 UTC Gerrit is back online
  • 2014-05-29 20:22:30 UTC Gerrit is going offline to correct an issue with a recent project rename. ETA 20:45 UTC.
  • 2014-05-28 00:08:31 UTC zuul is using a manually installed "gear" library with the timeout and logging changes
  • 2014-05-27 22:11:41 UTC Zuul is started and processing changes that were in the queue when it was stopped. Changes uploaded or approved since then will need to be re-approved or rechecked.
  • 2014-05-27 21:34:45 UTC Zuul is offline due to an operational issue; ETA 2200 UTC.
  • 2014-05-26 22:31:12 UTC stopping gerrit briefly to rebuild its search index in an attempt to fix post-rename oddities (will update with notices every 10 minutes until completed)
  • 2014-05-23 21:36:49 UTC Gerrit is offline in order to rename some projects. ETA: 22:00.
  • 2014-05-23 20:34:36 UTC Gerrit will be offline for about 20 minutes in order to rename some projects starting at 21:00 UTC.
  • 2014-05-09 16:44:31 UTC New contributors can't complete enrollment due to https://launchpad.net/bugs/1317957 (Gerrit is having trouble reaching the Foundation Member system)
  • 2014-05-07 13:12:58 UTC Zuul is processing changes now; some results were lost. Use "recheck bug 1317089" if needed.
  • 2014-05-07 13:04:11 UTC Zuul is stuck due to earlier networking issues with Gerrit server, work in progress.
  • 2014-05-02 23:27:29 UTC paste.openstack.org is going down for a short database upgrade
  • 2014-05-02 22:00:08 UTC Zuul is being restarted with some dependency upgrades and configuration changes; ETA 2215
  • 2014-05-01 00:06:18 UTC the gate is still fairly backed up, though nodepool is back on track and chipping away at remaining changes. some py3k/pypy node starvation is slowing recovery
  • 2014-04-30 20:26:57 UTC the gate is backed up due to broken nodepool images, fix in progress (eta 22:00 utc)
  • 2014-04-28 19:33:21 UTC Gerrit upgrade to 2.8 complete. See: https://wiki.openstack.org/wiki/GerritUpgrade Some cleanup tasks still ongoing; join #openstack-infra if you have any questions.
  • 2014-04-28 16:38:31 UTC Gerrit is unavailable until further notice for a major upgrade. See: https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-28 15:31:50 UTC Gerrit downtime for upgrade begins in 30 minutes. See: https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-28 14:31:51 UTC Gerrit downtime for upgrade begins in 90 minutes. See: https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-25 20:59:57 UTC Gerrit will be unavailable for a few hours starting at 1600 UTC on Monday April 28th for an upgrade. See https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-25 17:17:55 UTC Gerrit will be unavailable for a few hours starting at 1600 UTC on Monday April 28th for an upgrade. See https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-16 00:00:14 UTC Restarting gerrit really quick to fix replication issue
  • 2014-04-08 01:33:50 UTC All services should be back up
  • 2014-04-08 00:22:30 UTC All of the project infrastructure hosts are being restarted for security updates.
  • 2014-03-25 13:30:44 UTC the issue with gerrit cleared on its own before any corrective action was taken
  • 2014-03-25 13:22:16 UTC the gerrit event stream is currently hung, blocking all testing. troubleshooting is in progress (next update at 14:00 utc)
  • 2014-03-12 12:24:44 UTC gerrit on review.openstack.org is down for maintenance (revised eta to resume is 13:00 utc)
  • 2014-03-12 12:07:18 UTC gerrit on review.openstack.org is down for maintenance (eta to resume is 12:30 utc)
  • 2014-03-12 11:28:08 UTC test/gate jobs are queuing now in preparation for gerrit maintenance at 12:00 utc (eta to resume is 12:30 utc)
  • 2014-02-26 22:25:55 UTC gerrit service on review.openstack.org will be down momentarily for another brief restart--apologies for the disruption
  • 2014-02-26 22:13:11 UTC gerrit service on review.openstack.org will be down momentarily for a restart to add an additional git server
  • 2014-02-21 17:36:50 UTC Git-related build issues should be resolved. If your job failed with no build output, use "recheck bug 1282876".
  • 2014-02-21 16:34:23 UTC Some builds are failing due to errors in worker images; fix eta 1700 UTC.
  • 2014-02-20 23:41:09 UTC A transient error caused Zuul to report jobs as LOST; if you were affected, leave a comment with "recheck no bug"
  • 2014-02-18 23:33:18 UTC Gerrit login issues should be resolved.
  • 2014-02-13 22:35:01 UTC restarting zuul for a configuration change
  • 2014-02-10 16:21:11 UTC jobs are running for changes again, but there's a bit of a backlog so it will still probably take a few hours for everything to catch up
  • 2014-02-10 15:16:33 UTC the gate is experiencing delays due to nodepool resource issues (fix in progress, eta 16:00 utc)
  • 2014-02-07 20:10:08 UTC Gerrit and Zuul are offline for project renames. ETA 20:30 UTC.
  • 2014-02-07 18:59:03 UTC Zuul is now in queue-only mode preparing for project renames at 20:00 UTC
  • 2014-02-07 17:35:36 UTC Gerrit and Zuul going offline at 20:00 UTC for ~15mins for project renames
  • 2014-02-07 17:34:07 UTC Gerrit and Zuul going offline at 20:00 UTC for ~15mins for project renames
  • 2014-01-29 17:09:18 UTC the gate is merging changes again... issues with tox/virtualenv versions can be rechecked or reverified against bug 1274135
  • 2014-01-29 14:37:42 UTC most tests are failing as a result of new tox and testtools releases (bug 1274135, in progress)
  • 2014-01-29 14:25:35 UTC most tests are failing as a result of new tox and testtools releases--investigation in progress
  • 2014-01-24 21:55:40 UTC Zuul is restarting to pick up a bug fix
  • 2014-01-24 21:39:11 UTC Zuul is ignoring some enqueue events; fix in progress
  • 2014-01-24 16:13:31 UTC restarted gerritbot because it seemed to be on the wrong side of a netsplit
  • 2014-01-23 23:51:14 UTC Zuul is being restarted for an upgrade
  • 2014-01-22 20:51:44 UTC Zuul is about to restart for an upgrade; changes will be re-enqueued
  • 2014-01-17 19:13:32 UTC zuul.openstack.org underwent maintenance today from 16:50 to 19:00 UTC, so any changes approved during that timeframe should be reapproved so as to be added to the gate. new patchsets uploaded for those two hours should be rechecked (no bug) if test results are desired
  • 2014-01-14 12:29:06 UTC Gate currently blocked due to slave node exhaustion
  • 2014-01-07 16:47:29 UTC unit tests seem to be passing consistently after the upgrade. use bug 1266711 for related rechecks
  • 2014-01-07 14:51:19 UTC working on undoing the accidental libvirt upgrade which is causing nova and keystone unit test failures (ETA 15:30 UTC)
  • 2014-01-06 21:20:09 UTC gracefully stopping jenkins01 now. it has many nodes in offline status and only a handful online, while nodepool thinks it has ~90 available to run jobs
  • 2014-01-06 19:37:28 UTC gracefully stopping jenkins02 now. it has many nodes in offline status and only a handful online, while nodepool thinks it has ~75 available to run jobs
  • 2014-01-06 19:36:12 UTC gating is operating at reduced capacity while we work through a systems problem (ETA 21:00 UTC)
  • 2014-01-03 00:13:32 UTC see: https://etherpad.openstack.org/p/pip1.5Upgrade
  • 2014-01-02 17:07:54 UTC gating is severely hampered while we attempt to sort out the impact of today's pip 1.5/virtualenv 1.11 releases... no ETA for solution yet
  • 2014-01-02 16:58:35 UTC gating is severely hampered while we attempt to sort out the impact of the pip 1.5 release... no ETA for solution yet
  • 2013-12-24 06:11:50 UTC fix for grenade euca/bundle failures is in the gate. changes failing on those issues in the past 7 hours should be rechecked or reverified against bug 1263824
  • 2013-12-24 05:31:47 UTC gating is currently wedged by consistent grenade job failures--proposed fix is being confirmed now--eta 06:00 utc
  • 2013-12-13 17:21:56 UTC restarted gerritbot
  • 2013-12-11 21:35:29 UTC test
  • 2013-12-11 21:34:09 UTC test
  • 2013-12-11 21:20:28 UTC test
  • 2013-12-11 18:03:36 UTC Grenade gate infra issues: use "reverify bug 1259911"
  • 2013-12-06 17:05:12 UTC i'm running statusbot in screen to try to catch why it dies after a while.
  • 2013-12-04 18:34:41 UTC gate failures due to django incompatibility, pip bugs, and node performance problems
  • 2013-12-03 16:56:59 UTC docs jobs are failing due to a full filesystem; fix eta 1750 UTC
  • 2013-11-26 14:25:11 UTC Gate should be unwedged now, thanks for your patience
  • 2013-11-26 11:29:13 UTC Gate wedged - Most Py26 jobs fail currently (https://bugs.launchpad.net/openstack-ci/+bug/1255041)
  • 2013-11-20 22:45:24 UTC Please refrain from approving changes that don't fix gate-blocking issues -- http://lists.openstack.org/pipermail/openstack-dev/2013-November/019941.html
  • 2013-11-06 00:03:44 UTC filesystem resize complete, logs uploading successfully again in the past few minutes--feel free to 'recheck no bug' or 'reverify no bug' if your change failed jobs with an "unstable" result
  • 2013-11-05 23:31:13 UTC Out of disk space on log server, blocking test result uploads--fix in progress, eta 2400 utc
  • 2013-10-13 16:25:59 UTC etherpad migration complete
  • 2013-10-13 16:05:03 UTC etherpad is down for software upgrade and migration
  • 2013-10-11 16:36:06 UTC the gate is moving again for the past half hour or so--thanks for your collective patience while we worked through the issue
  • 2013-10-11 14:14:17 UTC The Infrastructure team is working through some devstack node starvation issues which are currently holding up gating and slowing checks. ETA 1600 UTC
  • 2013-10-11 12:48:07 UTC Gate is currently stuck (probably due to networking issues preventing new test nodes from being spun)
  • 2013-10-05 17:01:06 UTC puppet disabled on nodepool due to manually reverting gearman change
  • 2013-10-05 16:03:13 UTC Gerrit will be down for maintenance from 1600-1630 UTC
  • 2013-10-05 15:34:37 UTC Zuul is shutting down for Gerrit downtime from 1600-1630 UTC
  • 2013-10-02 09:54:09 UTC Jenkins01 is not failing, it's just very slow at the moment... so the gate is not completely stuck.
  • 2013-10-02 09:46:39 UTC One of our Jenkins masters is failing to return results, so the gate is currently stuck.
  • 2013-09-24 15:48:07 UTC changes seem to be making it through the gate once more, and so it should be safe to "recheck bug 1229797" or "reverify bug 1229797" on affected changes as needed
  • 2013-09-24 13:30:07 UTC dependency problems in gating, currently under investigation... more news as it unfolds
  • 2013-08-27 20:35:24 UTC Zuul has been restarted
  • 2013-08-27 20:10:38 UTC zuul is offline because of a pbr-related installation issue
  • 2013-08-24 22:40:08 UTC Zuul and nodepool are running again; rechecks have been issued (but double check your patch in case it was missed)
  • 2013-08-24 22:04:36 UTC Zuul and nodepool are being restarted
  • 2013-08-23 17:53:29 UTC recent UNSTABLE jobs were due to maintenance to expand capacity which is complete; recheck or reverify as needed
  • 2013-08-22 18:12:24 UTC stopping gerrit to correct a stackforge project rename error
  • 2013-08-22 17:55:56 UTC restarting gerrit to pick up a configuration change
  • 2013-08-22 15:06:03 UTC Zuul has been restarted; leave 'recheck no bug' or 'reverify no bug' comments to re-enqueue.
  • 2013-08-22 01:38:31 UTC Zuul is running again
  • 2013-08-22 01:02:06 UTC Zuul is offline for troubleshooting
  • 2013-08-21 21:10:59 UTC Restarting zuul, changes should be automatically re-enqueued
  • 2013-08-21 16:32:30 UTC LOST jobs are due to a known bug; use "recheck no bug"
  • 2013-08-19 20:27:37 UTC gate-grenade-devstack-vm is currently failing preventing merges. Proposed fix: https://review.openstack.org/#/c/42720/
  • 2013-08-16 13:53:35 UTC the gate seems to be properly moving now, but some changes which were in limbo earlier are probably going to come back with negative votes now. rechecking/reverifying those too
  • 2013-08-16 13:34:05 UTC the earlier log server issues seem to have put one of the jenkins servers in a bad state, blocking the gate--working on that, ETA 14:00 UTC
  • 2013-08-16 12:41:10 UTC still rechecking/reverifying false negative results on changes, but the gate is moving again
  • 2013-08-16 12:00:34 UTC log server has a larger filesystem now--rechecking/reverifying jobs, ETA 12:30 UTC
  • 2013-08-16 11:21:47 UTC the log server has filled up, disrupting job completion--working on it now, ETA 12:30 UTC
  • 2013-08-16 11:07:34 UTC some sort of gating disruption has been identified--looking into it now
  • 2013-07-28 15:30:29 UTC restarted zuul to upgrade
  • 2013-07-28 00:25:57 UTC restarted jenkins to update scp plugin
  • 2013-07-26 14:19:34 UTC Performing maintenance on docs-draft site, unstable docs jobs expected for the next few minutes; use "recheck no bug"
  • 2013-07-20 18:38:03 UTC devstack gate should be back to normal
  • 2013-07-20 17:02:31 UTC devstack-gate jobs broken due to setuptools brokenness; fix in progress.
  • 2013-07-20 01:41:30 UTC replaced ssl certs for jenkins, review, wiki, and etherpad
  • 2013-07-19 23:47:31 UTC Projects affected by the xattr cffi dependency issue should be able to run tests and have them pass. xattr has been fixed and the new version is on our mirror.
  • 2013-07-19 22:23:27 UTC Projects with a dependency on xattr are failing tests due to unresolved xattr dependencies. Fix should be in shortly
  • 2013-07-17 20:33:39 UTC Jenkins is running jobs again, some jobs are marked as UNSTABLE; fix in progress
  • 2013-07-17 18:43:20 UTC Zuul is queueing jobs while Jenkins is restarted for a security update
  • 2013-07-17 18:32:50 UTC Gerrit security updates have been applied
  • 2013-07-17 17:38:19 UTC Gerrit is being restarted to apply a security update
  • 2013-07-16 01:30:52 UTC Zuul is back up and outstanding changes have been re-enqueued in the gate queue.
  • 2013-07-16 00:23:27 UTC Zuul is down for an emergency load-related server upgrade. ETA 01:30 UTC.
  • 2013-07-06 16:29:49 UTC Neutron project rename in progress; see https://wiki.openstack.org/wiki/Network/neutron-renaming
  • 2013-07-06 16:29:32 UTC Gerrit and Zuul are back online, neutron rename still in progress
  • 2013-07-06 16:02:38 UTC Gerrit and Zuul are offline for neutron project rename; ETA 1630 UTC; see https://wiki.openstack.org/wiki/Network/neutron-renaming
  • 2013-06-14 23:28:41 UTC Zuul and Jenkins are back up (but somewhat backlogged). See http://status.openstack.org/zuul/
  • 2013-06-14 20:42:30 UTC Gerrit is back in service. Zuul and Jenkins are offline for further maintenance (ETA 22:00 UTC)
  • 2013-06-14 20:36:49 UTC Gerrit is back in service. Zuul and Jenkins are offline for further maintenance (ETA 22:00)
  • 2013-06-14 20:00:58 UTC Gerrit, Zuul and Jenkins are offline for maintenance (ETA 30 minutes)
  • 2013-06-14 18:29:37 UTC Zuul/Jenkins are gracefully shutting down in preparation for today's 20:00 UTC maintenance
  • 2013-06-11 17:32:14 UTC pbr 0.5.16 has been released and the gate should be back in business
  • 2013-06-11 16:00:10 UTC pbr change broke the gate, a fix is forthcoming
  • 2013-06-06 21:00:45 UTC jenkins log server is fixed; new builds should complete, old logs are being copied over slowly (you may encounter 404 errors following older links to logs.openstack.org until this completes)
  • 2013-06-06 19:38:01 UTC gating is currently broken due to a full log server (ETA 30 minutes)
  • 2013-05-16 20:02:47 UTC Gerrit, Zuul, and Jenkins are back online.
  • 2013-05-16 18:57:28 UTC Gerrit, Zuul, and Jenkins will all be shutting down for reboots at approximately 19:10 UTC.
  • 2013-05-16 18:46:38 UTC wiki.openstack.org and lists.openstack.org are back online
  • 2013-05-16 18:37:52 UTC wiki.openstack.org and lists.openstack.org are being rebooted. downtime should be < 5 min.
  • 2013-05-16 18:36:23 UTC eavesdrop.openstack.org is back online
  • 2013-05-16 18:31:14 UTC eavesdrop.openstack.org is being rebooted. downtime should be less than 5 minutes.
  • 2013-05-15 05:32:26 UTC upgraded gerrit to gerrit-2.4.2-17 to address a security issue: http://gerrit-documentation.googlecode.com/svn/ReleaseNotes/ReleaseNotes-2.5.3.html#_security_fixes
  • 2013-05-14 18:32:07 UTC gating is catching up queued jobs now and should be back to normal shortly (eta 30 minutes)
  • 2013-05-14 17:55:44 UTC gating is broken for a bit while we replace jenkins slaves (eta 30 minutes)
  • 2013-05-14 17:06:56 UTC gating is broken for a bit while we replace jenkins slaves (eta 30 minutes)
  • 2013-05-04 16:31:22 UTC lists.openstack.org and eavesdrop.openstack.org are back in service
  • 2013-05-04 16:19:45 UTC test
  • 2013-05-04 15:58:36 UTC eavesdrop and lists.openstack.org are offline for server upgrades and moves. ETA 1700 UTC.
  • 2013-05-02 20:20:45 UTC Jenkins is in shutdown mode so that we may perform an upgrade; builds will be delayed but should not be lost.
  • 2013-04-26 18:04:19 UTC We just added AAAA records (IPv6 addresses) to review.openstack.org and jenkins.openstack.org.
  • 2013-04-25 18:25:41 UTC meetbot is back on and confirmed to be working properly again... apologies for the disruption
  • 2013-04-25 17:40:34 UTC meetbot is on the wrong side of a netsplit; infra is working on getting it back
  • 2013-04-08 18:09:34 UTC A review.o.o repo needed to be reseeded for security reasons. To ensure that a force push did not miss anything a nuke from orbit approach was taken instead. Gerrit was stopped, old bad repo was removed, new good repo was added, and Gerrit was started again.
  • 2013-04-08 17:50:57 UTC The infra team is restarting Gerrit for git repo maintenance. If Gerrit is not responding please try again in a few minutes.
  • 2013-04-03 01:07:50 UTC https://review.openstack.org/#/c/25939/ should fix the prettytable dependency problem when merged (https://bugs.launchpad.net/nova/+bug/1163631)
  • 2013-04-03 00:48:01 UTC Restarting gerrit to try to correct an error condition in the stackforge/diskimage-builder repo
  • 2013-03-29 23:01:04 UTC Testing alert status
  • 2013-03-29 22:58:24 UTC Testing statusbot
  • 2013-03-28 13:32:02 UTC Everything is okay now.