Oath FFU Juno To Ocata

Revision as of 11:26, 26 February 2018

Our environment

We have 5 types of hosts:

  • API
    • runs the following services:
      • Nova
      • Nova Scheduler
      • Placement
      • Nova-Network (yes, I know.)
      • Keystone
      • Glance API & Registry
  • DB
    • MySQL
  • MQ
    • Rabbit
  • HV/Compute
    • RHEL7
    • Networking: Open vSwitch using tagged VLANs
    • Storage: All VMs stored on local disk, configured as RAID 10. Instances all live under /openstack
      • During an OS upgrade we will leave /openstack alone, and wipe/rewrite the operating system partitions.
  • UI
    • Horizon

Preparing to upgrade

  1. Notify users of upcoming downtime
  2. Test Chef bootstrap & converge for API, MQ, DB, HVs
  3. Verify runbooks for SRE/PE
  4. Add new VIPs to LB
  5. Validate control plane pipeline
  6. HV preparation
    1. Upgrade all compute nodes to RHEL 7
  7. DB preparation
    1. Archive deleted rows
    2. Validate backups
  8. Verify network ACLs are correct
  9. Update the Horizon Banner with CMR information
  10. Add VIP settings in Chef recipe
  11. Post internal announcements on the intranet, email, etc.
  12. Build new jumphosts that use openstack-client, rather than the novaclient
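
The "archive deleted rows" step in the DB preparation above can be scripted. The following is only a minimal sketch, assuming the archiving is done with `nova-manage db archive_deleted_rows --max_rows <n>` run in repeated batches. The helper names, batch size, and batch count are hypothetical, and the page does not say how this step was actually performed.

```python
import subprocess
from typing import Callable, List, Sequence


def build_archive_command(max_rows: int) -> List[str]:
    """Build the nova-manage call that archives one batch of soft-deleted rows."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    return ["nova-manage", "db", "archive_deleted_rows",
            "--max_rows", str(max_rows)]


def archive_batches(batches: int, max_rows: int,
                    run: Callable[[Sequence[str]], int] = subprocess.call
                    ) -> List[int]:
    """Run the archive command `batches` times and return each exit code.

    The meaning of the exit code differs between releases, so this sketch
    only records the codes and leaves the interpretation to the operator.
    `run` is injectable so the loop can be exercised without nova installed.
    """
    return [run(build_archive_command(max_rows)) for _ in range(batches)]


if __name__ == "__main__":
    # Hypothetical values: 10 batches of 1000 rows each.
    print(archive_batches(batches=10, max_rows=1000))
```

Running small batches repeatedly keeps each transaction short, which matters on a database that is still serving a live cluster before the downtime window.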

Upgrade

  1. Take cluster snapshot of VM status
  2. Snooze or silence any alerting utilities so that you don't get spammed while your cluster is down.
  3. Remove the cluster you are about to upgrade from the dropdown in Horizon. You don't want users reaching a cluster through Horizon while it is being upgraded.
    • Note for Horizon: We left our old Horizon nodes running the Juno code base and brought up new machines for the new Horizon deployment. This made it easy to roll back in the event that something went wrong. Since we had multiple clusters, leaving our old Horizon nodes running also allowed users to continue accessing the remaining Juno clusters while upgrades were in progress.
  4. Stop the API and MQ services.
  5. Stop nova-compute on all hypervisors
  6. Make a full backup of the database before migration in case the upgrade fails and you want to revert. (Thankfully we never needed to restore our backup, but it was nice to have for peace of mind.)
  7. Back up your configurations from your old deployment. This is nice to have. We ran into some cases where we missed something and having old working configs as reference was nice.
  8. Run DB migration scripts
  9. Re-image your API and MQ nodes if needed. Everyone's operating system requirements are different. We upgraded ours from RHEL6 to RHEL7 during this process, but this was not required; we could just as easily have left our API and MQ nodes on RHEL6. Even if you don't need or want to upgrade your operating system, re-imaging your control plane with the same OS is still a good idea to make sure all of the old cruft is removed. Your CI/CD pipelines should be able to deploy OpenStack from scratch.
    • We had 3 API nodes per cluster. During the upgrade, we shut down 2 of the API nodes and left them on Juno, and attempted the upgrade / re-image on the third. That way, if anything went wrong, we could easily bring the 2 Juno API nodes back to restore service while we figured out what had happened.
    • We did not re-image our DB hosts.
  10. Deploy the code for the release that you are upgrading to. We went from Juno to Ocata, so we deployed the Ocata code directly rather than deploying each intermediate release one by one.
  11. Clean up the DB backups if everything is working.
  12. Repeat the API node upgrade process for the API nodes that are still on your old revision.
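
The VM status snapshot taken in step 1 is most useful when it is compared against a second snapshot after the upgrade. The following is a minimal sketch of that comparison, assuming each snapshot is a list of rows with `ID` and `Status` keys (for example, the JSON output of `openstack server list --all-projects -f json`). The function names and the sample data are hypothetical, and the page does not describe how the snapshot was actually taken or checked.

```python
from typing import Dict, Iterable, List, Mapping

Snapshot = Dict[str, str]  # instance ID -> status, e.g. "ACTIVE"


def snapshot_from_rows(rows: Iterable[Mapping[str, str]]) -> Snapshot:
    """Reduce server-list rows to an ID -> status mapping.

    Assumes each row has "ID" and "Status" keys.
    """
    return {row["ID"]: row["Status"] for row in rows}


def diff_snapshots(before: Snapshot, after: Snapshot) -> Dict[str, List]:
    """Report instances that vanished, appeared, or changed status."""
    common = set(before) & set(after)
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "changed": sorted(
            (vm, before[vm], after[vm])
            for vm in common
            if before[vm] != after[vm]
        ),
    }


if __name__ == "__main__":
    # Hypothetical data: vm-2 came back in ERROR, vm-3 disappeared.
    pre = snapshot_from_rows([
        {"ID": "vm-1", "Status": "ACTIVE"},
        {"ID": "vm-2", "Status": "ACTIVE"},
        {"ID": "vm-3", "Status": "SHUTOFF"},
    ])
    post = snapshot_from_rows([
        {"ID": "vm-1", "Status": "ACTIVE"},
        {"ID": "vm-2", "Status": "ERROR"},
    ])
    print(diff_snapshots(pre, post))
```

An empty result in all three lists suggests the cluster came back with the same instances in the same states; anything else is a candidate for investigation before the remaining Juno API nodes are upgraded.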