
Heat/Blueprints/as-update-policy

Revision as of 21:17, 9 August 2013 by M4dcoder (talk | contribs) (Implementation)

Summary

The following is the proposed solution for the as-update-policy blueprint. We want to add an UpdatePolicy attribute that can be used with InstanceGroup and AutoScalingGroup to specify how changes to the launch configuration or subnet are rolled out. The UpdatePolicy attribute can be introduced to an existing stack as part of a stack update request. For the InstanceGroup resource type, we want to add the following snippet to the cfn template.

  "UpdatePolicy" : {
     "RollingUpdate" : {
        "MinInstancesInService" : "1",
        "MaxBatchSize" : "12",
        "PauseTime" : "PT60S"
     }
  }
  • MinInstancesInService indicates the minimum number of instances that must remain in service while other instances are being replaced.
  • MaxBatchSize indicates the maximum number of instances to roll out in each batch.
  • PauseTime indicates the wait time between batches, written as an ISO 8601 duration (for example, PT60S is 60 seconds).
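
As a rough illustration of how these three values might be read from a template, the sketch below converts the snippet into integers and seconds. The function names parse_pause_time and parse_rolling_update are hypothetical and not part of Heat, and only the hour/minute/second form of ISO 8601 durations is assumed to be needed.

```python
import re

# Hypothetical helpers for illustration only; these names are not Heat APIs.
# Only time-only ISO 8601 durations (PT..H..M..S) are handled here.
_DURATION = re.compile(r'^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$')


def parse_pause_time(value):
    """Convert a duration such as 'PT60S' or 'PT1M30S' to seconds."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError('unsupported PauseTime: %r' % value)
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def parse_rolling_update(update_policy):
    """Turn the string values of the RollingUpdate entry into numbers."""
    rolling = update_policy['RollingUpdate']
    return {
        'min_in_service': int(rolling['MinInstancesInService']),
        'max_batch_size': int(rolling['MaxBatchSize']),
        'pause_seconds': parse_pause_time(rolling['PauseTime']),
    }


policy = parse_rolling_update({
    'RollingUpdate': {
        'MinInstancesInService': '1',
        'MaxBatchSize': '12',
        'PauseTime': 'PT60S',
    }
})
```

With the snippet above, policy holds a minimum of 1 instance in service, a batch size of 12, and a pause of 60 seconds.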


The example below is a cfn template for an InstanceGroup with an UpdatePolicy. This snippet is a revision of the sample heat template at https://github.com/openstack/heat-templates/blob/master/cfn/F17/InstanceGroup.template. In this example, an update to the LaunchConfiguration referenced by the JobServerGroup in an existing stack will trigger the RollingUpdate policy specified under the UpdatePolicy attribute. The name of the entry under UpdatePolicy is not significant. The InstanceGroup resource expects only one entry within the UpdatePolicy attribute. During the update, the JobServerGroup must have at least one instance in service. The update will be rolled out in batches of at most 5 instances. For each batch, the new instances will be created in parallel before the old instances are terminated. There will be a 30-second pause before each batch is rolled out.

 "Resources" : {
   "JobServerGroup" : {
     "UpdatePolicy" : {
       "RollingUpdate" : {
         "MinInstancesInService" : "1",
         "MaxBatchSize" : "5",
         "PauseTime" : "PT30S"
       }
     },
     "Type" : "OS::Heat::InstanceGroup",
     "Properties" : {
       "LaunchConfigurationName" : { "Ref" : "JobServerConfig" },
       "Size" : {"Ref": "NumInstances"},
       "AvailabilityZones" : { "Fn::GetAZs" : "" }
     }
   },
   "JobServerConfig" : {
     "Type" : "AWS::AutoScaling::LaunchConfiguration",
     "Properties": {
       "ImageId"           : { "Ref" : "ImageId" },
       "InstanceType"      : { "Ref" : "InstanceType" },
       "KeyName"           : { "Ref" : "KeyName" },
       "NovaSchedulerHints": [ {"Key": "part", "Value": "long"},
                               {"Key": "ready", "Value": "short"} ],
       "UserData"          : { "Fn::Base64" : { "Fn::Join" : ["", [
         "#!/bin/bash -v\n"
       ]]}}
     }
   }
 }


The current Heat engine does not detect changes in an underlying referenced resource (i.e. the LaunchConfiguration). Given the above example, if JobServerConfig is updated, the changes to JobServerConfig are not recognized in _update_resource of the StackUpdate class when JobServerGroup is checked for updates. Therefore, a resource update for the InstanceGroup is not triggered. The LaunchConfiguration resource will recognize the update, but currently there is no update handler implemented, and it seems more appropriate to let the InstanceGroup handle its own instance updates. So currently, the only way to trigger a change in the InstanceGroup as a result of a LaunchConfiguration change is to rename the LaunchConfiguration JobServerConfig in the cfn template. Since LaunchConfigurationName is not in the update_allowed_properties of InstanceGroup, this leads to a replacement (destroy followed by create) of the existing InstanceGroup. This is not the desired solution, as we want the update to the LaunchConfiguration to be rolled out in a controlled fashion.

Implementation

The following are the changes proposed for the implementation of this blueprint. The goal is to allow the InstanceGroup and AutoScalingGroup to recognize that there is an update to the LaunchConfiguration that they reference. The InstanceGroup and AutoScalingGroup should continue to decide how to handle their own updates. Currently, the template differences and property differences are passed into the update function, and the resource decides what to do with the differences. We want a change in the LaunchConfiguration to be recognized as a property difference. To do that, we will override FnGetRefId() of LaunchConfiguration to return physical_resource_name(). When any property of the LaunchConfiguration is modified, the engine will replace the LaunchConfiguration; subsequently, its resource ID and physical resource name will also change. The change in the physical resource name of the referenced LaunchConfiguration will produce a property difference in the LaunchConfigurationName of the InstanceGroup. If LaunchConfigurationName is added to update_allowed_properties, then the InstanceGroup and AutoScalingGroup will be able to handle the update appropriately without triggering a destroy and replace of the entire group.
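
The reasoning above can be illustrated with a small standalone sketch. It is not Heat code: resolve_refs and property_diff are hypothetical helpers, the physical names are invented, and the current behaviour is modelled (as an assumption) by a reference that resolves to the same value before and after the update.

```python
def resolve_refs(properties, ref_ids):
    """Replace {'Ref': name} values with the reference ID known for name."""
    resolved = {}
    for key, value in properties.items():
        if isinstance(value, dict) and list(value) == ['Ref']:
            resolved[key] = ref_ids.get(value['Ref'], value['Ref'])
        else:
            resolved[key] = value
    return resolved


def property_diff(old, new):
    """Return the properties whose resolved values changed."""
    return {k: new[k] for k in new if old.get(k) != new[k]}


group_props = {
    'LaunchConfigurationName': {'Ref': 'JobServerConfig'},
    'Size': '3',
}

# Assumed current behaviour: the reference resolves to a value that does not
# change when the LaunchConfiguration is modified, so no difference is seen.
stable = {'JobServerConfig': 'JobServerConfig'}
unchanged = property_diff(resolve_refs(group_props, stable),
                          resolve_refs(group_props, stable))

# Proposed behaviour: the reference resolves to the physical resource name,
# which changes when the LaunchConfiguration is replaced (names are made up).
before = resolve_refs(group_props,
                      {'JobServerConfig': 'mystack-JobServerConfig-aaaa'})
after = resolve_refs(group_props,
                     {'JobServerConfig': 'mystack-JobServerConfig-bbbb'})
changed = property_diff(before, after)
```

In this sketch, unchanged is empty while changed contains only LaunchConfigurationName, which is the property difference the group would act on.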

Modify LaunchConfiguration class

  • Override FnGetRefId to return physical_resource_name()


Add UpdatePolicy class

  • Put this new class in the autoscaling module under the engine module


Modify InstanceGroup and AutoScalingGroup

  • Add UpdatePolicy to update_allowed_keys in InstanceGroup and modify handle_update to handle property differences in UpdatePolicy
    • Changes to the UpdatePolicy are treated as property changes only
    • Changes to the UpdatePolicy alone do not trigger the InstanceGroup to update or replace its instances
  • Add LaunchConfigurationName to update_allowed_properties so that changes to the LaunchConfiguration will not trigger an UpdateReplace
  • Modify _create_template to resolve the LaunchConfigurationName correctly
    • Use conf = self.stack.resource_by_refid(self.properties['LaunchConfigurationName']) where conf will be the LaunchConfiguration resource
    • Use instance_definition = conf.t.copy() to get the instance definition
  • Modify handle_update to handle rolling update
    • The rolling update will only be triggered if an UpdatePolicy is defined and LaunchConfigurationName is recognized as a property difference.
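
The trigger condition and the batch-by-batch replacement described in this blueprint can be sketched as follows. This is a simplified illustration under stated assumptions, not the proposed handle_update: should_roll, plan_batches and rolling_update are hypothetical names, create and delete are caller-supplied callbacks, and instances within a batch are created sequentially here rather than in parallel. Because replacements are created before the old instances are removed, this sketch never drops below the original group size, so MinInstancesInService is not checked explicitly.

```python
import time


def should_roll(update_policy, prop_diff):
    """Roll only when a policy exists and the launch configuration changed."""
    return bool(update_policy) and 'LaunchConfigurationName' in prop_diff


def plan_batches(instances, max_batch_size):
    """Split instances into consecutive batches of at most max_batch_size."""
    if max_batch_size < 1:
        raise ValueError('MaxBatchSize must be at least 1')
    return [instances[i:i + max_batch_size]
            for i in range(0, len(instances), max_batch_size)]


def rolling_update(instances, max_batch_size, pause_seconds,
                   create, delete, sleep=time.sleep):
    """Replace instances batch by batch and return the new instance list."""
    updated = []
    for batch in plan_batches(instances, max_batch_size):
        sleep(pause_seconds)                           # pause before each batch
        replacements = [create(old) for old in batch]  # new instances first
        for old in batch:
            delete(old)                                # then retire old ones
        updated.extend(replacements)
    return updated


# Usage with stand-in callbacks that only record what would happen.
log = []
new_instances = rolling_update(
    ['i1', 'i2', 'i3'], max_batch_size=2, pause_seconds=30,
    create=lambda old: old + '-new',
    delete=log.append,
    sleep=lambda seconds: log.append('pause %d' % seconds))
```

Passing sleep as a parameter keeps the sketch testable without real delays; an actual implementation inside the engine would presumably need a non-blocking wait instead.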