
Infrastructure Status


Revision as of 09:25, 3 May 2016

  • 2016-05-03 09:24:59 UTC Docs-draft filesystem has been restored. Please check your affected jobs again
  • 2016-05-03 08:36:36 UTC Filesystem on docs-draft.openstack.org is broken, we are in the process of repairing it. Please stop checking jobs using this filesystem until further notice
  • 2016-05-03 08:27:24 UTC Logs filesystem has been successfully restored, please recheck your jobs
  • 2016-05-03 06:47:23 UTC Filesystem on logs.openstack.org is broken, we are in the process of repairing it. Please stop checking your jobs until further notice
  • 2016-05-03 00:37:42 UTC gerrit configuration update blocked on failing beaker tests due to missing bouncycastle releases; job being made nonvoting in https://review.openstack.org/311898
  • 2016-05-02 23:47:45 UTC due to an error in https://review.openstack.org/295530 which will be corrected in https://review.openstack.org/311888 gerrit should not be restarted until the second change lands
  • 2016-05-02 21:51:56 UTC manual vos release of pypi mirror started in screen on fileserver; see https://etherpad.openstack.org/p/fix-afs
  • 2016-05-02 15:19:44 UTC steps to fix the pypi mirror problem in progress: https://etherpad.openstack.org/p/fix-afs
  • 2016-05-02 06:53:53 UTC AFS mirrors not publishing, they have been stuck on vos release since 29th April
  • 2016-04-22 15:03:19 UTC Log server was repaired as of 10:50 UTC and jobs have been stable since. If necessary, please recheck changes that have 'UNSTABLE' results.
  • 2016-04-22 10:54:56 UTC Log server has been repaired and jobs are stable again. If necessary please recheck changes that have 'UNSTABLE' results.
  • 2016-04-22 07:32:05 UTC Logs are failing to be uploaded causing jobs to be marked as UNSTABLE. We are working on repairing the log filesystem and will update when ready. Please do not recheck before then.
  • 2016-04-21 12:49:48 UTC OVH provider is enabled again, please wait for the job queue to be processed
  • 2016-04-21 10:38:33 UTC OVH servers are down, we are working to solve it. This will cause the job queue to be processed slowly, please be patient.
  • 2016-04-19 13:41:32 UTC We have recovered one of our cloud providers, but there is a huge backlog of jobs to process. Please have patience until your jobs are processed
  • 2016-04-15 09:51:47 UTC Zuul and gerrit are working normally now. Please recheck any jobs that may have been affected by this failure.
  • 2016-04-15 09:23:40 UTC No jobs are being processed by gerrit and zuul. We are working to solve the problem, please be aware that no changes have been sent to the queue in the last hour, so you will need to recheck jobs for that period.
  • 2016-04-15 09:06:29 UTC Gerrit is going to be restarted because it is not processing new changes
  • 2016-04-11 21:08:40 UTC Gerrit move maintenance completed successfully; note that DNS has been updated to new IP addresses as indicated in http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-11 20:08:57 UTC Gerrit is offline until 21:00 UTC for a server replacement http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-11 19:51:50 UTC Gerrit will be offline from 20:00 to 21:00 UTC (starting 10 minutes from now) for a server replacement http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-11 16:20:17 UTC Reminder, Gerrit will be offline from 20:00 to 21:00 UTC for a server replacement http://lists.openstack.org/pipermail/openstack-dev/2016-April/091274.html
  • 2016-04-07 08:36:04 UTC jobs depending on npm are now working again
  • 2016-04-06 10:20:39 UTC npm lint jobs are failing due to a problem with npm registry. The problem is under investigation, and we will update once the issue is solved.
  • 2016-04-05 20:01:57 UTC ubuntu xenial mirrors now online.
  • 2016-04-05 14:51:52 UTC dns for openstackid.org has been changed from 2001:4800:7817:102:be76:4eff:fe05:d9cd and 23.253.97.70 (openstackid 1.0.17 on ubuntu precise) to 2001:4800:7815:101:be76:4eff:fe04:7741 and 23.253.243.97 (openstackid 1.0.18 on ubuntu trusty). record ttls remain 300s for now
  • 2016-04-05 13:04:10 UTC jenkins06.o.o back online, appears to have run out of RAM
  • 2016-04-04 07:15:37 UTC Gerrit is going to be restarted due to bad performance
  • 2016-03-31 19:56:01 UTC Any jobs which erroneously failed on missing traceroute packages should be safe to recheck now
  • 2016-03-31 17:49:51 UTC Job failures for missing traceroute packages are in the process of being fixed now, ETA 30 minutes to effectiveness for new jobs
  • 2016-03-30 11:15:35 UTC Gate on project-config is currently broken due to IRC tests. The problem has been detected and we are working to fix the issue as soon as possible.
  • 2016-03-28 15:22:43 UTC Gerrit is restarting on review.openstack.org in an attempt to address an issue reading an object from the ec2-api repository
  • 2016-03-24 17:08:05 UTC restarted gerrit to address GC issue
  • 2016-03-21 14:59:32 UTC Rackspace has opened support tickets warning of disruptive maintenance March 22 05:00-07:00 UTC, March 24 03:00 to 07:00 UTC, and March 25 02:00 to 06:00 UTC which could impact network connectivity including disconnecting from Trove databases and Cinder block devices
  • 2016-03-19 22:25:25 UTC Gerrit is restarting to address performance issues
  • 2016-03-15 15:33:38 UTC Launchpad SSO is back to normal - happy hacking
  • 2016-03-15 15:00:29 UTC Launchpad OpenID SSO is currently experiencing issues preventing login. The Launchpad team is working on the issue
  • 2016-03-15 11:37:22 UTC Gerrit had to be restarted because it was not responsive. As a consequence, some of the test results have been lost, from 09:30 UTC to 11:30 UTC approximately. Please recheck any jobs affected by this problem.
  • 2016-03-15 11:34:39 UTC Gerrit had to be restarted because it was not responsive. As a consequence, some of the test results have been lost, from 08:30 UTC to 10:30 UTC approximately. Please recheck any jobs affected by this problem.
  • 2016-03-15 11:15:09 UTC Gerrit is going to be restarted
  • 2016-03-11 11:01:42 UTC Gerrit has been restarted successfully
  • 2016-03-11 10:56:07 UTC Gerrit is going to be restarted due to bad performance
  • 2016-03-07 07:25:45 UTC gerrit is going to be restarted due to bad performance
  • 2016-03-04 11:25:20 UTC testing status bot
  • 2016-03-01 10:45:18 UTC gerrit finished restarting
  • 2016-03-01 10:39:09 UTC Gerrit is going to be restarted due to poor performance
  • 2016-02-29 12:07:53 UTC Infra currently has a long backlog. Please be patient and where possible avoid rechecks while it catches up.
  • 2016-02-19 08:35:19 UTC Gerrit is going to be restarted due to performance problems
  • 2016-02-17 06:50:39 UTC A problem with the mirror used for CI jobs in the rax-iad region has been corrected. Please recheck changes that recently failed jobs on nodes in rax-iad.
  • 2016-02-13 17:42:02 UTC Gerrit is back up
  • 2016-02-13 15:11:57 UTC Gerrit is offline for filesystem repair
  • 2016-02-13 00:23:22 UTC Gerrit is offline for maintenance, ETA updated to 01:00 utc
  • 2016-02-12 23:43:30 UTC Gerrit is offline for maintenance, ETA updated to 23:59 utc
  • 2016-02-12 23:08:44 UTC Gerrit is offline for maintenance, ETA updated to 23:30 utc
  • 2016-02-12 22:07:37 UTC Gerrit is offline for maintenance until 23:00 utc
  • 2016-02-12 21:47:47 UTC The infrastructure team is taking gerrit offline for maintenance this afternoon, beginning at 22:00 utc. We should have it back online around 23:00 utc. http://lists.openstack.org/pipermail/openstack-dev/2016-February/086195.html
  • 2016-02-09 17:25:39 UTC Gerrit is restarting now, to alleviate current performance impact and WebUI errors.
  • 2016-02-03 12:41:39 UTC Infra running with lower capacity now, due to a temporary problem affecting one of our nodepool providers. Please expect some delays in your jobs. Apologies for any inconvenience caused.
  • 2016-01-30 09:23:17 UTC Testing status command
  • 2016-01-22 17:52:01 UTC Restarting zuul due to a memory leak
  • 2016-01-20 11:56:15 UTC Restart done, review.openstack.org is available
  • 2016-01-20 11:45:12 UTC review.openstack.org is being restarted to apply patches
  • 2016-01-18 16:50:38 UTC Gerrit is restarting quickly as a workaround for performance degradation
  • 2016-01-11 22:06:57 UTC Gerrit is restarting to resolve java memory issues
  • 2015-12-17 16:43:53 UTC Zuul is moving in very slow motion since roughly 13:30 UTC; the Infra team is investigating.
  • 2015-12-16 21:02:59 UTC Gerrit has been upgraded to 2.11. Please report any issues in #openstack-infra as soon as possible.
  • 2015-12-16 17:07:00 UTC Gerrit is offline for a software upgrade from 17:00 to 21:00 UTC. See: http://lists.openstack.org/pipermail/openstack-dev/2015-December/081037.html
  • 2015-12-16 16:21:49 UTC Gerrit will be offline for a software upgrade from 17:00 to 21:00 UTC. See: http://lists.openstack.org/pipermail/openstack-dev/2015-December/081037.html
  • 2015-12-04 16:55:08 UTC The earlier JJB bug which disrupted tox-based job configurations has been reverted and applied; jobs seem to be running successfully for the past two hours.
  • 2015-12-04 09:32:24 UTC Tox tests are broken at the moment. The openstack-infra team is working to fix them. Please don't approve changes until we announce that tox tests work again.
  • 2015-11-06 20:04:47 UTC Gerrit is offline until 20:15 UTC today for scheduled project rename maintenance
  • 2015-11-06 19:41:20 UTC Gerrit will be offline at 20:00-20:15 UTC today (starting 20 minutes from now) for scheduled project rename maintenance
  • 2015-10-27 06:32:40 UTC CI will be disrupted for an indeterminate period while our service provider reboots systems for a security fix
  • 2015-10-17 18:40:01 UTC Gerrit is back online. Github transfers are in progress and should be complete by 1900 UTC.
  • 2015-10-17 18:03:25 UTC Gerrit is offline for project renames.
  • 2015-10-17 17:11:10 UTC Gerrit will be offline for project renames starting at 1800 UTC.
  • 2015-10-13 11:19:47 UTC Gerrit has been restarted and is responding to normal load again.
  • 2015-10-13 09:44:48 UTC gerrit is undergoing an emergency restart to investigate load issues
  • 2015-10-05 14:03:13 UTC Gerrit was restarted to temporarily address performance problems
  • 2015-09-17 10:16:42 UTC Gate back to normal, thanks to the blacklisting of the problematic version
  • 2015-09-17 08:02:50 UTC Gate is currently stuck, failing grenade upgrade tests due to the release of oslo.utils 1.4.1 for Juno.
  • 2015-09-11 23:04:39 UTC Gerrit is offline from 23:00 to 23:30 UTC while some projects are renamed. http://lists.openstack.org/pipermail/openstack-dev/2015-September/074235.html
  • 2015-09-11 22:32:57 UTC 30 minute warning, Gerrit will be offline from 23:00 to 23:30 UTC while some projects are renamed http://lists.openstack.org/pipermail/openstack-dev/2015-September/074235.html
  • 2015-08-31 20:27:18 UTC puppet agent temporarily disabled on nodepool.openstack.org to avoid accidental upgrade to python-glanceclient 1.0.0
  • 2015-08-26 15:45:47 UTC restarting gerrit due to a slow memory leak
  • 2015-08-17 10:50:24 UTC Gerrit restart has resolved the issue and systems are back up and functioning
  • 2015-08-17 10:23:42 UTC review.openstack.org (aka gerrit) is going down for an emergency restart
  • 2015-08-17 07:07:38 UTC Gerrit is currently under very high load and may be unresponsive. infra are looking into the issue.
  • 2015-08-12 00:06:30 UTC Zuul was restarted due to an error; events (such as approvals or new patchsets) since 23:01 UTC have been lost and affected changes will need to be rechecked
  • 2015-08-05 21:11:30 UTC Correction: change events between 20:50-20:54 UTC (during the restart only) have been lost and will need to be rechecked or their approvals reapplied to trigger testing.
  • 2015-08-05 21:06:19 UTC Zuul has been restarted to resolve a reconfiguration failure: previously running jobs have been reenqueued but change events between 19:50-20:54 UTC have been lost and will need to be rechecked or their approvals reapplied to trigger testing.
  • 2015-08-03 13:41:37 UTC The Gerrit service on review.openstack.org has been restarted in an attempt to improve performance.
  • 2015-07-30 09:01:49 UTC CI is back online but has a huge backlog. Please be patient and if possible delay approving changes until it has caught up.
  • 2015-07-30 07:52:49 UTC CI system is broken and very far behind. Please do not approve any changes for a while.
  • 2015-07-30 07:43:12 UTC Our CI system is broken again today, jobs are not getting processed at all.
  • 2015-07-29 13:27:42 UTC zuul jobs after about 07:00 UTC may need a 'recheck' to enter the queue. Look if your change is in http://status.openstack.org/zuul/ and recheck if not.
  • 2015-07-29 12:52:20 UTC zuul's disks were at capacity. Space has been freed up and jobs are being re-queued.
  • 2015-07-29 09:30:59 UTC Currently our CI system is broken, jobs are not getting processed at all.
  • 2015-07-28 08:04:50 UTC zuul has been restarted and queues restored. It may take some time to work through the backlog.
  • 2015-07-28 06:48:20 UTC zuul is stuck and about to undergo an emergency restart, please be patient as job results may take a long time
  • 2015-07-22 14:35:43 UTC CI is slowly recovering, please be patient while the backlog is worked through.
  • 2015-07-22 14:17:30 UTC CI is currently recovering from an outage overnight. It is safe to recheck results with NOT_REGISTERED errors. It may take some time for zuul to work through the backlog.
  • 2015-07-22 08:16:50 UTC zuul jobs are currently stuck while problems with gearman are debugged
  • 2015-07-22 07:24:43 UTC zuul is undergoing an emergency restart. Jobs will be re-queued but some events may be lost.
  • 2015-07-10 22:00:47 UTC Gerrit is unavailable from approximately 22:00 to 22:30 UTC for project renames
  • 2015-07-10 21:04:01 UTC Gerrit will be unavailable from 22:00 to 22:30 UTC for project renames
  • 2015-07-03 19:33:46 UTC etherpad.openstack.org is still offline for scheduled database maintenance, ETA 19:45 UTC
  • 2015-07-03 19:05:45 UTC etherpad.openstack.org is offline for scheduled database maintenance, ETA 19:30 UTC
  • 2015-06-30 14:56:00 UTC The log volume was repaired and brought back online at 14:00 UTC. Log links today from before that time may be missing, and changes should be rechecked if fresh job logs are desired for them.
  • 2015-06-30 08:50:29 UTC OpenStack CI is down due to hard drive failures
  • 2015-06-12 22:45:07 UTC Gerrit is back online. Zuul reconfiguration for renamed projects is still in progress, ETA 23:30.
  • 2015-06-12 22:10:50 UTC Gerrit is offline for project renames. ETA 22:40
  • 2015-06-12 22:06:20 UTC Gerrit is offline for project renames. ETA 20:30
  • 2015-06-12 21:45:26 UTC Gerrit will be offline for project renames between 22:00 and 22:30 UTC
  • 2015-06-11 21:08:10 UTC Gerrit has been restarted to terminate a persistent looping third-party CI bot
  • 2015-06-04 18:43:17 UTC Gerrit has been restarted to clear an issue with its event stream. Any change events between 17:25 and 18:38 UTC should be rechecked or have their approvals reapplied to initiate testing.
  • 2015-05-13 23:00:05 UTC Gerrit and Zuul are back online.
  • 2015-05-13 22:42:09 UTC Gerrit and Zuul are going offline for reboots to fix a security vulnerability.
  • 2015-05-12 00:58:04 UTC Gerrit has been downgraded to version 2.8 due to the issues observed today. Please report further problems in #openstack-infra.
  • 2015-05-11 23:56:14 UTC Gerrit is going offline while we perform an emergency downgrade to version 2.8.
  • 2015-05-11 17:40:47 UTC We have discovered post-upgrade issues with Gerrit affecting nova (and potentially other projects). Some changes will not appear and some actions, such as queries, may return an error. We are continuing to investigate.
  • 2015-05-09 18:32:43 UTC Gerrit upgrade completed; please report problems in #openstack-infra
  • 2015-05-09 16:03:24 UTC Gerrit is offline from 16:00-20:00 UTC to upgrade to version 2.10.
  • 2015-05-09 15:18:16 UTC Gerrit will be offline from 1600-2000 UTC while it is upgraded to version 2.10
  • 2015-05-06 00:43:52 UTC Restarted gerrit due to stuck stream-events connections. Events since 23:49 were missed and changes uploaded since then will need to be rechecked.
  • 2015-05-05 17:05:25 UTC zuul has been restarted to troubleshoot an issue, gerrit events between 15:00-17:00 utc were lost and changes updated or approved during that time will need to be rechecked or have their approval votes readded to trigger testing
  • 2015-04-29 14:06:55 UTC gerrit has been restarted to clear a stuck events queue. any change events between 13:29-14:05 utc should be rechecked or have their approval votes reapplied to trigger jobs
  • 2015-04-28 15:38:04 UTC gerrit has been restarted to clear an issue with its event stream. any change events between 14:43-15:30 utc should be rechecked or have their approval votes reapplied to trigger jobs
  • 2015-04-28 12:43:46 UTC Gate is experiencing epic failures due to issues with mirrors, work is underway to mitigate and return to normal levels of sanity
  • 2015-04-27 13:48:14 UTC gerrit has been restarted to clear a problem with its event stream. change events between 13:09 and 13:36 utc should be rechecked or have approval votes reapplied as needed to trigger jobs
  • 2015-04-27 08:11:05 UTC Restarting gerrit because it stopped sending events (ETA 15 mins)
  • 2015-04-22 17:33:33 UTC gerrit is restarting to clear hung stream-events tasks. any review events between 16:48 and 17:32 utc will need to be rechecked or have their approval votes reapplied to trigger testing in zuul
  • 2015-04-18 15:11:25 UTC Gerrit is offline for emergency maintenance, ETA 15:30 UTC to completion
  • 2015-04-18 14:32:11 UTC Gerrit will be offline between 15:00-15:30 UTC today for emergency maintenance (starting half an hour from now)
  • 2015-04-18 14:02:07 UTC Gerrit will be offline between 15:00-15:30 UTC today for emergency maintenance (starting an hour from now)
  • 2015-04-18 02:29:15 UTC gerrit is undergoing a quick-ish restart to implement a debugging patch. should be back up in ~10 minutes. apologies for any inconvenience
  • 2015-04-17 23:07:06 UTC Gerrit is available again.
  • 2015-04-17 22:09:51 UTC Gerrit is unavailable until 23:59 UTC for project renames and a database update.
  • 2015-04-17 22:05:40 UTC Gerrit is unavailable until 23:59 UTC for project renames and a database update.
  • 2015-04-17 21:05:41 UTC Gerrit will be unavailable between 22:00 and 23:59 UTC for project renames and a database update.
  • 2015-04-16 19:48:11 UTC gerrit has been restarted to clear a problem with its event stream. any gerrit changes updated or approved between 19:14 and 19:46 utc will need to be rechecked or have their approval reapplied for zuul to pick them up
  • 2015-04-15 18:27:55 UTC Gerrit has been restarted. New patches, approvals, and rechecks between 17:30 and 18:20 UTC may have been missed by Zuul and will need rechecks or new approvals added.
  • 2015-04-15 18:05:15 UTC Gerrit has stopped emitting events so Zuul is not alerted to changes. We will restart Gerrit shortly to correct the problem.
  • 2015-04-10 15:45:54 UTC gerrit has been restarted to address a hung event stream. change events between 15:00 and 15:43 utc which were lost will need to be rechecked or have approval workflow votes reapplied for zuul to act on them
  • 2015-04-06 11:40:08 UTC gerrit has been restarted to restore event streaming. any change events missed by zuul (between 10:56 and 11:37 utc) will need to be rechecked or have new approval votes set
  • 2015-04-01 13:29:44 UTC gerrit has been restarted to restore event streaming. any change events missed by zuul (between 12:48 and 13:28 utc) will need to be rechecked or have new approval votes set
  • 2015-03-31 11:51:33 UTC Check/Gate unstuck, feel free to recheck your abusively-failed changes.
  • 2015-03-31 08:55:59 UTC CI Check/Gate pipelines currently stuck due to a bad dependency creeping in the system. No need to recheck your patches at the moment.
  • 2015-03-27 22:06:32 UTC Gerrit is offline for maintenance, ETA 22:30 UTC http://lists.openstack.org/pipermail/openstack-dev/2015-March/059948.html
  • 2015-03-27 21:02:04 UTC Gerrit maintenance commences in 1 hour at 22:00 UTC http://lists.openstack.org/pipermail/openstack-dev/2015-March/059948.html
  • 2015-03-26 13:13:33 UTC gerrit stopped emitting stream events around 11:30 utc and has now been restarted. please recheck any changes currently missing results from jenkins
  • 2015-03-21 16:07:02 UTC Gerrit is back online
  • 2015-03-21 15:08:01 UTC Gerrit is offline for scheduled maintenance || http://lists.openstack.org/pipermail/openstack-infra/2015-March/002540.html
  • 2015-03-21 14:54:23 UTC Gerrit will be offline starting at 1500 UTC for scheduled maintenance
  • 2015-03-04 17:17:49 UTC Issue solved, gate slowly digesting accumulated changes
  • 2015-03-04 08:32:42 UTC Zuul check queue stuck due to reboot maintenance window at one of our cloud providers - no need to recheck changes at the moment, they won't move forward.
  • 2015-01-30 19:32:23 UTC Gerrit is back online
  • 2015-01-30 19:10:04 UTC Gerrit and Zuul are offline until 1930 UTC for project renames
  • 2015-01-30 18:43:57 UTC Gerrit and Zuul will be offline from 1900 to 1930 UTC for project renames
  • 2015-01-30 16:15:03 UTC zuul is running again and changes have been reenqueued. see http://status.openstack.org/zuul/ before rechecking if in doubt
  • 2015-01-30 14:26:56 UTC zuul isn't running jobs since ~10:30 utc, investigation underway
  • 2015-01-27 17:54:45 UTC Gerrit and Zuul will be offline for a few minutes for a security update
  • 2015-01-20 19:54:47 UTC Gerrit restarted to address likely memory leak leading to server slowness. Sorry if you were caught in the restart
  • 2015-01-09 18:59:29 UTC paste.openstack.org is going offline for a database migration (duration: ~2 minutes)
  • 2014-12-06 16:06:03 UTC gerrit will be offline for 30 minutes while we rename a few projects. eta 16:30 utc
  • 2014-12-06 15:21:31 UTC [reminder] gerrit will be offline for 30 minutes starting at 16:00 utc for project renames
  • 2014-11-22 00:33:53 UTC Gating and log storage offline due to block device error. Recovery in progress, ETA unknown.
  • 2014-11-21 21:46:58 UTC gating is going offline while we deal with a broken block device, eta unknown
  • 2014-10-29 20:58:17 UTC Restarting gerrit to get fixed CI javascript
  • 2014-10-20 21:22:38 UTC Zuul erroneously marked some changes as having merge conflicts. Those changes have been added to the check queue to be rechecked and will be automatically updated when complete.
  • 2014-10-17 21:27:06 UTC Gerrit is back online
  • 2014-10-17 21:04:39 UTC Gerrit is offline from 2100-2130 for project renames
  • 2014-10-17 20:35:12 UTC Gerrit will be offline from 2100-2130 for project renames
  • 2014-10-17 17:04:01 UTC upgraded wiki.openstack.org from Mediawiki 1.24wmf19 to 1.25wmf4 per http://ci.openstack.org/wiki.html
  • 2014-10-16 16:20:43 UTC An error in a configuration change to mitigate the poodle vulnerability caused a brief outage of git.openstack.org from 16:06-16:12. The problem has been corrected and git.openstack.org is working again.
  • 2014-09-24 21:59:06 UTC The openstack-infra/config repo will be frozen for project-configuration changes starting at 00:01 UTC. If you have a pending configuration change that has not merged or is not in the queue, please see us in #openstack-infra.
  • 2014-09-24 13:43:48 UTC removed 79 disassociated floating ips in hpcloud
  • 2014-09-22 15:52:51 UTC removed 431 disassociated floating ips in hpcloud
  • 2014-09-22 15:52:23 UTC killed bandersnatch process on pypi.region-b.geo-1.openstack.org, hung since 2014-09-18 22:45 due to https://bitbucket.org/pypa/bandersnatch/issue/52
  • 2014-09-22 15:51:21 UTC restarted gerritbot to get it to rejoin channels
  • 2014-09-19 20:53:18 UTC Gerrit is back online
  • 2014-09-19 20:17:08 UTC Gerrit will be offline from 20:30 to 20:50 UTC for project renames
  • 2014-09-16 13:38:01 UTC jenkins ran out of jvm memory on jenkins06 at 01:42:20 http://paste.openstack.org/show/112155/
  • 2014-09-14 18:13:14 UTC all our pypi mirrors failed to update urllib3 properly, full mirror refresh underway now to correct, eta 20:00 utc
  • 2014-09-13 15:10:34 UTC shutting down all irc bots now to change their passwords (per the wallops a few minutes ago, everyone should do the same)
  • 2014-09-13 14:54:19 UTC rebooted puppetmaster.openstack.org due to out-of-memory condition
  • 2014-08-30 16:08:43 UTC Gerrit is offline for project renaming maintenance, ETA 1630
  • 2014-08-25 17:12:51 UTC restarted gerritbot
  • 2014-08-16 16:30:38 UTC Gerrit is offline for project renames. ETA 1645.
  • 2014-07-26 18:28:21 UTC Zuul has been restarted to move it beyond a change it was failing to report on
  • 2014-07-23 22:08:12 UTC zuul is working through a backlog of jobs due to an earlier problem with nodepool
  • 2014-07-23 20:42:47 UTC nodepool is unable to build test nodes so check and gate tests are delayed
  • 2014-07-15 18:23:58 UTC python2.6 jobs are failing due to bug 1342262 "virtualenv>=1.9.1 not found" A fix is out but there are still nodes built on the old stale images
  • 2014-06-28 14:40:16 UTC Gerrit will be offline from 1500-1515 UTC for project renames
  • 2014-06-15 15:30:13 UTC Launchpad is OK - statusbot lost the old channel statuses. They will need to be manually restored
  • 2014-06-15 02:32:57 UTC launchpad openid is down. login to openstack services will fail until launchpad openid is happy again
  • 2014-06-02 14:17:51 UTC setuptools issue was fixed upstream in 3.7.1 and 4.0.1, please recheck on bug 1325514
  • 2014-06-02 08:33:19 UTC setuptools upstream has broken the world. it's a known issue. we're hoping that a solution materializes soon
  • 2014-05-29 20:41:04 UTC Gerrit is back online
  • 2014-05-29 20:22:30 UTC Gerrit is going offline to correct an issue with a recent project rename. ETA 20:45 UTC.
  • 2014-05-28 00:08:31 UTC zuul is using a manually installed "gear" library with the timeout and logging changes
  • 2014-05-27 22:11:41 UTC Zuul is started and processing changes that were in the queue when it was stopped. Changes uploaded or approved since then will need to be re-approved or rechecked.
  • 2014-05-27 21:34:45 UTC Zuul is offline due to an operational issue; ETA 2200 UTC.
  • 2014-05-26 22:31:12 UTC stopping gerrit briefly to rebuild its search index in an attempt to fix post-rename oddities (will update with notices every 10 minutes until completed)
  • 2014-05-23 21:36:49 UTC Gerrit is offline in order to rename some projects. ETA: 22:00.
  • 2014-05-23 20:34:36 UTC Gerrit will be offline for about 20 minutes in order to rename some projects starting at 21:00 UTC.
  • 2014-05-09 16:44:31 UTC New contributors can't complete enrollment due to https://launchpad.net/bugs/1317957 (Gerrit is having trouble reaching the Foundation Member system)
  • 2014-05-07 13:12:58 UTC Zuul is processing changes now; some results were lost. Use "recheck bug 1317089" if needed.
  • 2014-05-07 13:04:11 UTC Zuul is stuck due to earlier networking issues with Gerrit server, work in progress.
  • 2014-05-02 23:27:29 UTC paste.openstack.org is going down for a short database upgrade
  • 2014-05-02 22:00:08 UTC Zuul is being restarted with some dependency upgrades and configuration changes; ETA 2215
  • 2014-05-01 00:06:18 UTC the gate is still fairly backed up, though nodepool is back on track and chipping away at remaining changes. some py3k/pypy node starvation is slowing recovery
  • 2014-04-30 20:26:57 UTC the gate is backed up due to broken nodepool images, fix in progress (eta 22:00 utc)
  • 2014-04-28 19:33:21 UTC Gerrit upgrade to 2.8 complete. See: https://wiki.openstack.org/wiki/GerritUpgrade Some cleanup tasks still ongoing; join #openstack-infra if you have any questions.
  • 2014-04-28 16:38:31 UTC Gerrit is unavailable until further notice for a major upgrade. See: https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-28 15:31:50 UTC Gerrit downtime for upgrade begins in 30 minutes. See: https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-28 14:31:51 UTC Gerrit downtime for upgrade begins in 90 minutes. See: https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-25 20:59:57 UTC Gerrit will be unavailable for a few hours starting at 1600 UTC on Monday April 28th for an upgrade. See https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-25 17:17:55 UTC Gerrit will be unavailable for a few hours starting at 1600 UTC on Monday April 28th for an upgrade. See https://wiki.openstack.org/wiki/GerritUpgrade
  • 2014-04-16 00:00:14 UTC Restarting gerrit really quick to fix replication issue
  • 2014-04-08 01:33:50 UTC All services should be back up
  • 2014-04-08 00:22:30 UTC All of the project infrastructure hosts are being restarted for security updates.
  • 2014-03-25 13:30:44 UTC the issue with gerrit cleared on its own before any corrective action was taken
  • 2014-03-25 13:22:16 UTC the gerrit event stream is currently hung, blocking all testing. troubleshooting is in progress (next update at 14:00 utc)
  • 2014-03-12 12:24:44 UTC gerrit on review.openstack.org is down for maintenance (revised eta to resume is 13:00 utc)
  • 2014-03-12 12:07:18 UTC gerrit on review.openstack.org is down for maintenance (eta to resume is 12:30 utc)
  • 2014-03-12 11:28:08 UTC test/gate jobs are queuing now in preparation for gerrit maintenance at 12:00 utc (eta to resume is 12:30 utc)
  • 2014-02-26 22:25:55 UTC gerrit service on review.openstack.org will be down momentarily for another brief restart--apologies for the disruption
  • 2014-02-26 22:13:11 UTC gerrit service on review.openstack.org will be down momentarily for a restart to add an additional git server
  • 2014-02-21 17:36:50 UTC Git-related build issues should be resolved. If your job failed with no build output, use "recheck bug 1282876".
  • 2014-02-21 16:34:23 UTC Some builds are failing due to errors in worker images; fix eta 1700 UTC.
  • 2014-02-20 23:41:09 UTC A transient error caused Zuul to report jobs as LOST; if you were affected, leave a comment with "recheck no bug"
  • 2014-02-18 23:33:18 UTC Gerrit login issues should be resolved.
  • 2014-02-13 22:35:01 UTC restarting zuul for a configuration change
  • 2014-02-10 16:21:11 UTC jobs are running for changes again, but there's a bit of a backlog so it will still probably take a few hours for everything to catch up
  • 2014-02-10 15:16:33 UTC the gate is experiencing delays due to nodepool resource issues (fix in progress, eta 16:00 utc)
  • 2014-02-07 20:10:08 UTC Gerrit and Zuul are offline for project renames. ETA 20:30 UTC.
  • 2014-02-07 18:59:03 UTC Zuul is now in queue-only mode preparing for project renames at 20:00 UTC
  • 2014-02-07 17:35:36 UTC Gerrit and Zuul going offline at 20:00 UTC for ~15mins for project renames
  • 2014-02-07 17:34:07 UTC Gerrit and Zuul going offline at 20:00 UTC for ~15mins for project renames
  • 2014-01-29 17:09:18 UTC the gate is merging changes again... issues with tox/virtualenv versions can be rechecked or reverified against bug 1274135
  • 2014-01-29 14:37:42 UTC most tests are failing as a result of new tox and testtools releases (bug 1274135, in progress)
  • 2014-01-29 14:25:35 UTC most tests are failing as a result of new tox and testtools releases--investigation in progress
  • 2014-01-24 21:55:40 UTC Zuul is restarting to pick up a bug fix
  • 2014-01-24 21:39:11 UTC Zuul is ignoring some enqueue events; fix in progress
  • 2014-01-24 16:13:31 UTC restarted gerritbot because it seemed to be on the wrong side of a netsplit
  • 2014-01-23 23:51:14 UTC Zuul is being restarted for an upgrade
  • 2014-01-22 20:51:44 UTC Zuul is about to restart for an upgrade; changes will be re-enqueued
  • 2014-01-17 19:13:32 UTC zuul.openstack.org underwent maintenance today from 16:50 to 19:00 UTC, so any changes approved during that timeframe should be reapproved so as to be added to the gate. new patchsets uploaded for those two hours should be rechecked (no bug) if test results are desired
  • 2014-01-14 12:29:06 UTC Gate currently blocked due to slave node exhaustion
  • 2014-01-07 16:47:29 UTC unit tests seem to be passing consistently after the upgrade. use bug 1266711 for related rechecks
  • 2014-01-07 14:51:19 UTC working on undoing the accidental libvirt upgrade which is causing nova and keystone unit test failures (ETA 15:30 UTC)
  • 2014-01-06 21:20:09 UTC gracefully stopping jenkins01 now. it has many nodes in offline status and only a handful online, while nodepool thinks it has ~90 available to run jobs
  • 2014-01-06 19:37:28 UTC gracefully stopping jenkins02 now. it has many nodes in offline status and only a handful online, while nodepool thinks it has ~75 available to run jobs
  • 2014-01-06 19:36:12 UTC gating is operating at reduced capacity while we work through a systems problem (ETA 21:00 UTC)
  • 2014-01-03 00:13:32 UTC see: https://etherpad.openstack.org/p/pip1.5Upgrade
  • 2014-01-02 17:07:54 UTC gating is severely hampered while we attempt to sort out the impact of today's pip 1.5/virtualenv 1.11 releases... no ETA for solution yet
  • 2014-01-02 16:58:35 UTC gating is severely hampered while we attempt to sort out the impact of the pip 1.5 release... no ETA for solution yet
  • 2013-12-24 06:11:50 UTC fix for grenade euca/bundle failures is in the gate. changes failing on those issues in the past 7 hours should be rechecked or reverified against bug 1263824
  • 2013-12-24 05:31:47 UTC gating is currently wedged by consistent grenade job failures--proposed fix is being confirmed now--eta 06:00 utc
  • 2013-12-13 17:21:56 UTC restarted gerritbot
  • 2013-12-11 21:35:29 UTC test
  • 2013-12-11 21:34:09 UTC test
  • 2013-12-11 21:20:28 UTC test
  • 2013-12-11 18:03:36 UTC Grenade gate infra issues: use "reverify bug 1259911"
  • 2013-12-06 17:05:12 UTC i'm running statusbot in screen to try to catch why it dies after a while.
  • 2013-12-04 18:34:41 UTC gate failures due to django incompatibility, pip bugs, and node performance problems
  • 2013-12-03 16:56:59 UTC docs jobs are failing due to a full filesystem; fix eta 1750 UTC
  • 2013-11-26 14:25:11 UTC Gate should be unwedged now, thanks for your patience
  • 2013-11-26 11:29:13 UTC Gate wedged - Most Py26 jobs fail currently (https://bugs.launchpad.net/openstack-ci/+bug/1255041)
  • 2013-11-20 22:45:24 UTC Please refrain from approving changes that don't fix gate-blocking issues -- http://lists.openstack.org/pipermail/openstack-dev/2013-November/019941.html
  • 2013-11-06 00:03:44 UTC filesystem resize complete, logs uploading successfully again in the past few minutes--feel free to 'recheck no bug' or 'reverify no bug' if your change failed jobs with an "unstable" result
  • 2013-11-05 23:31:13 UTC Out of disk space on log server, blocking test result uploads--fix in progress, eta 2400 utc
  • 2013-10-13 16:25:59 UTC etherpad migration complete
  • 2013-10-13 16:05:03 UTC etherpad is down for software upgrade and migration
  • 2013-10-11 16:36:06 UTC the gate is moving again for the past half hour or so--thanks for your collective patience while we worked through the issue
  • 2013-10-11 14:14:17 UTC The Infrastructure team is working through some devstack node starvation issues which are currently holding up gating and slowing checks. ETA 1600 UTC
  • 2013-10-11 12:48:07 UTC Gate is currently stuck (probably due to networking issues preventing new test nodes from being spun up)
  • 2013-10-05 17:01:06 UTC puppet disabled on nodepool due to manually reverting gearman change
  • 2013-10-05 16:03:13 UTC Gerrit will be down for maintenance from 1600-1630 UTC
  • 2013-10-05 15:34:37 UTC Zuul is shutting down for Gerrit downtime from 1600-1630 UTC
  • 2013-10-02 09:54:09 UTC Jenkins01 is not failing, it's just very slow at the moment... so the gate is not completely stuck.
  • 2013-10-02 09:46:39 UTC One of our Jenkins masters is failing to return results, so the gate is currently stuck.
  • 2013-09-24 15:48:07 UTC changes seem to be making it through the gate once more, and so it should be safe to "recheck bug 1229797" or "reverify bug 1229797" on affected changes as needed
  • 2013-09-24 13:30:07 UTC dependency problems in gating, currently under investigation... more news as it unfolds
  • 2013-08-27 20:35:24 UTC Zuul has been restarted
  • 2013-08-27 20:10:38 UTC zuul is offline because of a pbr-related installation issue
  • 2013-08-24 22:40:08 UTC Zuul and nodepool are running again; rechecks have been issued (but double check your patch in case it was missed)
  • 2013-08-24 22:04:36 UTC Zuul and nodepool are being restarted
  • 2013-08-23 17:53:29 UTC recent UNSTABLE jobs were due to maintenance to expand capacity which is complete; recheck or reverify as needed
  • 2013-08-22 18:12:24 UTC stopping gerrit to correct a stackforge project rename error
  • 2013-08-22 17:55:56 UTC restarting gerrit to pick up a configuration change
  • 2013-08-22 15:06:03 UTC Zuul has been restarted; leave 'recheck no bug' or 'reverify no bug' comments to re-enqueue.
  • 2013-08-22 01:38:31 UTC Zuul is running again
  • 2013-08-22 01:02:06 UTC Zuul is offline for troubleshooting
  • 2013-08-21 21:10:59 UTC Restarting zuul, changes should be automatically re-enqueued
  • 2013-08-21 16:32:30 UTC LOST jobs are due to a known bug; use "recheck no bug"
  • 2013-08-19 20:27:37 UTC gate-grenade-devstack-vm is currently failing preventing merges. Proposed fix: https://review.openstack.org/#/c/42720/
  • 2013-08-16 13:53:35 UTC the gate seems to be properly moving now, but some changes which were in limbo earlier are probably going to come back with negative votes now. rechecking/reverifying those too
  • 2013-08-16 13:34:05 UTC the earlier log server issues seem to have put one of the jenkins servers in a bad state, blocking the gate--working on that, ETA 14:00 UTC
  • 2013-08-16 12:41:10 UTC still rechecking/reverifying false negative results on changes, but the gate is moving again
  • 2013-08-16 12:00:34 UTC log server has a larger filesystem now--rechecking/reverifying jobs, ETA 12:30 UTC
  • 2013-08-16 12:00:22 UTC server has a larger filesystem now--rechecking/reverifying jobs, ETA 12:30 UTC
  • 2013-08-16 11:21:47 UTC the log server has filled up, disrupting job completion--working on it now, ETA 12:30 UTC
  • 2013-08-16 11:07:34 UTC some sort of gating disruption has been identified--looking into it now
  • 2013-07-28 15:30:29 UTC restarted zuul to upgrade
  • 2013-07-28 00:25:57 UTC restarted jenkins to update scp plugin
  • 2013-07-26 14:19:34 UTC Performing maintenance on docs-draft site, unstable docs jobs expected for the next few minutes; use "recheck no bug"
  • 2013-07-20 18:38:03 UTC devstack gate should be back to normal
  • 2013-07-20 17:02:31 UTC devstack-gate jobs broken due to setuptools brokenness; fix in progress.
  • 2013-07-20 01:41:30 UTC replaced ssl certs for jenkins, review, wiki, and etherpad
  • 2013-07-19 23:47:31 UTC Projects affected by the xattr cffi dependency issues should be able to run tests and have them pass. xattr has been fixed and the new version is on our mirror.
  • 2013-07-19 22:23:27 UTC Projects with a dependency on xattr are failing tests due to unresolved xattr dependencies. Fix should be in shortly
  • 2013-07-17 20:33:39 UTC Jenkins is running jobs again, some jobs are marked as UNSTABLE; fix in progress
  • 2013-07-17 18:43:20 UTC Zuul is queueing jobs while Jenkins is restarted for a security update
  • 2013-07-17 18:32:50 UTC Gerrit security updates have been applied
  • 2013-07-17 17:38:19 UTC Gerrit is being restarted to apply a security update
  • 2013-07-16 01:30:52 UTC Zuul is back up and outstanding changes have been re-enqueued in the gate queue.
  • 2013-07-16 00:23:27 UTC Zuul is down for an emergency load-related server upgrade. ETA 01:30 UTC.
  • 2013-07-06 16:29:49 UTC Neutron project rename in progress; see https://wiki.openstack.org/wiki/Network/neutron-renaming
  • 2013-07-06 16:29:32 UTC Gerrit and Zuul are back online, neutron rename still in progress
  • 2013-07-06 16:02:38 UTC Gerrit and Zuul are offline for neutron project rename; ETA 1630 UTC; see https://wiki.openstack.org/wiki/Network/neutron-renaming
  • 2013-06-14 23:28:41 UTC Zuul and Jenkins are back up (but somewhat backlogged). See http://status.openstack.org/zuul/
  • 2013-06-14 20:42:30 UTC Gerrit is back in service. Zuul and Jenkins are offline for further maintenance (ETA 22:00 UTC)
  • 2013-06-14 20:36:49 UTC Gerrit is back in service. Zuul and Jenkins are offline for further maintenance (ETA 22:00)
  • 2013-06-14 20:00:58 UTC Gerrit, Zuul and Jenkins are offline for maintenance (ETA 30 minutes)
  • 2013-06-14 18:29:37 UTC Zuul/Jenkins are gracefully shutting down in preparation for today's 20:00 UTC maintenance
  • 2013-06-11 17:32:14 UTC pbr 0.5.16 has been released and the gate should be back in business
  • 2013-06-11 16:00:10 UTC pbr change broke the gate, a fix is forthcoming
  • 2013-06-06 21:00:45 UTC jenkins log server is fixed; new builds should complete, old logs are being copied over slowly (you may encounter 404 errors following older links to logs.openstack.org until this completes)
  • 2013-06-06 19:38:01 UTC gating is currently broken due to a full log server (ETA 30 minutes)
  • 2013-05-16 20:02:47 UTC Gerrit, Zuul, and Jenkins are back online.
  • 2013-05-16 18:57:28 UTC Gerrit, Zuul, and Jenkins will all be shutting down for reboots at approximately 19:10 UTC.
  • 2013-05-16 18:46:38 UTC wiki.openstack.org and lists.openstack.org are back online
  • 2013-05-16 18:37:52 UTC wiki.openstack.org and lists.openstack.org are being rebooted. downtime should be < 5 min.
  • 2013-05-16 18:36:23 UTC eavesdrop.openstack.org is back online
  • 2013-05-16 18:31:14 UTC eavesdrop.openstack.org is being rebooted. downtime should be less than 5 minutes.
  • 2013-05-15 05:32:26 UTC upgraded gerrit to gerrit-2.4.2-17 to address a security issue: http://gerrit-documentation.googlecode.com/svn/ReleaseNotes/ReleaseNotes-2.5.3.html#_security_fixes
  • 2013-05-14 18:32:07 UTC gating is catching up queued jobs now and should be back to normal shortly (eta 30 minutes)
  • 2013-05-14 17:55:44 UTC gating is broken for a bit while we replace jenkins slaves (eta 30 minutes)
  • 2013-05-14 17:06:56 UTC gating is broken for a bit while we replace jenkins slaves (eta 30 minutes)
  • 2013-05-04 16:31:22 UTC lists.openstack.org and eavesdrop.openstack.org are back in service
  • 2013-05-04 16:19:45 UTC test
  • 2013-05-04 15:58:36 UTC eavesdrop and lists.openstack.org are offline for server upgrades and moves. ETA 1700 UTC.
  • 2013-05-02 20:20:45 UTC Jenkins is in shutdown mode so that we may perform an upgrade; builds will be delayed but should not be lost.
  • 2013-04-26 18:04:19 UTC We just added AAAA records (IPv6 addresses) to review.openstack.org and jenkins.openstack.org.
  • 2013-04-25 18:25:41 UTC meetbot is back on and confirmed to be working properly again... apologies for the disruption
  • 2013-04-25 17:40:34 UTC meetbot is on the wrong side of a netsplit; infra is working on getting it back
  • 2013-04-08 18:09:34 UTC A review.o.o repo needed to be reseeded for security reasons. To ensure that a force push did not miss anything, a nuke-from-orbit approach was taken instead. Gerrit was stopped, old bad repo was removed, new good repo was added, and Gerrit was started again.
  • 2013-04-08 17:50:57 UTC The infra team is restarting Gerrit for git repo maintenance. If Gerrit is not responding please try again in a few minutes.
  • 2013-04-03 01:07:50 UTC https://review.openstack.org/#/c/25939/ should fix the prettytable dependency problem when merged (https://bugs.launchpad.net/nova/+bug/1163631)
  • 2013-04-03 00:48:01 UTC Restarting gerrit to try to correct an error condition in the stackforge/diskimage-builder repo
  • 2013-03-29 23:01:04 UTC Testing alert status
  • 2013-03-29 22:58:24 UTC Testing statusbot
  • 2013-03-28 13:32:02 UTC Everything is okay now.