Difference between revisions of "Oath FFU Juno To Ocata"


Revision as of 11:19, 26 February 2018

Preparing to upgrade

  1. Create CMR.
  2. Finalize CMR steps
  3. Test Chef bootstrap and converge for the API, MQ, DB, and HV nodes
  4. Verify runbooks for SRE/PE
  5. Add new VIPs to LB
  6. Validate control plane pipeline
  7. HV preparation
    1. Upgrade all compute nodes to RHEL 7
  8. DB preparation
    1. Archive deleted rows
    2. Validate backups
  9. Verify network ACLs are correct
  10. Update the Horizon Banner with CMR information
  11. Add VIP settings to the Chef recipe
  12. Send internal announcements (intranet, email, etc.)
  13. Build new jumphosts that use openstack-client rather than novaclient
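
The "validate backups" item under DB preparation can be partly scripted. The following is a minimal Python sketch, not something taken from the original runbook: it only checks that a dump file exists, is not suspiciously small, and was written recently. The function name `backup_looks_valid` and the default thresholds are hypothetical, and passing this check does not prove the dump can be restored; restoring it into a scratch database is the stronger test.

```python
import os
import time


def backup_looks_valid(path, min_bytes=1024, max_age_hours=24.0, now=None):
    """Return (ok, reason) for a database dump file.

    Hypothetical sanity check; the thresholds are illustrative assumptions.
    """
    # The dump must exist as a regular file.
    if not os.path.isfile(path):
        return False, "missing"

    info = os.stat(path)

    # A near-empty file usually means the dump failed part-way.
    if info.st_size < min_bytes:
        return False, "too small"

    # The dump should be recent relative to the upgrade window.
    now = time.time() if now is None else now
    age_hours = (now - info.st_mtime) / 3600.0
    if age_hours > max_age_hours:
        return False, "too old"

    return True, "ok"
```

A call such as `backup_looks_valid("/path/to/dump.sql")` (the path is a placeholder) returns a pair like `(False, "too old")`, which makes it easy to log why a backup was rejected.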

Upgrade

  1. Take cluster snapshot of VM status
  2. Snooze or silence any alerting utilities so that you don't get spammed while your cluster is down
  3. Remove the cluster you are about to upgrade from the dropdown in Horizon, so that users cannot reach it through Horizon during the upgrade.
  4. Stop the API and MQ services.
  5. Stop nova-compute on all hypervisors
  6. Make a full backup of the database before the migration, in case the upgrade fails and you need to revert. (Thankfully we never needed to restore our backup, but it was nice to have for peace of mind.)
  7. Back up the configurations from your old deployment. We ran into some cases where we had missed something, and having the old working configs as a reference helped.
  8. Run DB migration scripts
  9. Re-image your API and MQ nodes if needed. Everyone's operating system requirements are different. We upgraded ours from RHEL6 to RHEL7 during this process, but this was not required; we could just as easily have left our API and MQ nodes on RHEL6. Even if you don't need or want to upgrade your operating system as part of this, re-imaging your control plane with the same OS is still a good idea, because it removes all of the old cruft. Your CI/CD pipelines should be able to deploy OpenStack from scratch.
    • We had 3 API nodes per cluster. During the upgrade, we shut down 2 of the API nodes and left them on Juno, and attempted the upgrade / re-image on the third. That way, if anything went wrong, we could easily bring the 2 Juno API nodes back to restore service while we figured out what had happened.
    • We did not re-image our DB hosts.
  10. Deploy the code for the release that you are upgrading to. We went from Juno to Ocata, so we deployed the Ocata code directly rather than deploying each intermediate release one by one.
  11. Clean up the DB backups if everything is working.
  12. Repeat the API node upgrade process for the API nodes that are still on your old revision.
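
The snapshot of VM status taken in step 1 is most useful when it can be compared against the cluster state after the upgrade. The following is a minimal Python sketch of such a comparison, under assumptions: each snapshot has already been reduced to a mapping of VM ID to status (for example, from the JSON output of `openstack server list --all-projects -f json`), and the function name `diff_vm_snapshots` and the sample IDs are hypothetical. It is not the tooling used in the original upgrade.

```python
def diff_vm_snapshots(before, after):
    """Compare two {vm_id: status} mappings taken before and after the upgrade.

    Returns VMs that disappeared, VMs that appeared, and VMs whose
    status changed (as (old_status, new_status) pairs).
    """
    before_ids = set(before)
    after_ids = set(after)

    missing = sorted(before_ids - after_ids)
    added = sorted(after_ids - before_ids)
    changed = {
        vm_id: (before[vm_id], after[vm_id])
        for vm_id in sorted(before_ids & after_ids)
        if before[vm_id] != after[vm_id]
    }
    return {"missing": missing, "added": added, "changed": changed}


# Made-up example data: one VM vanished and one changed status.
before = {"vm-a": "ACTIVE", "vm-b": "ACTIVE", "vm-c": "SHUTOFF"}
after = {"vm-a": "ACTIVE", "vm-b": "ERROR"}
report = diff_vm_snapshots(before, after)
print(report)
```

An empty `missing` list and an empty `changed` mapping are a reasonable signal that no VMs were lost or disturbed, before the DB backups are cleaned up.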