Heat/Using-HA

Follow the getting started guide for your particular distribution.
The first step is to get a basic, functional Heat/OpenStack setup. See one of the getting started guides linked from the Heat main wiki page.

Open required host firewall ports
Guests need to be able to communicate with the heat-api-cfn and heat-api-cloudwatch services, which by default run on ports 8000 and 8003 respectively.

This means opening these ports in the firewall on the host where the instances are launched, so that the instances can connect to these services. If in doubt, use nmap to port-scan the IP address where the services are running to confirm that you have connectivity.

sudo lokkit -p 8000:tcp
sudo lokkit -p 8003:tcp
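If nmap is not available, the connectivity check can be approximated with bash alone. The following is a minimal sketch and not part of Heat: check_port and the HEAT_HOST variable are hypothetical names, and it assumes bash (for the /dev/tcp redirection) and the coreutils timeout command are installed. Set HEAT_HOST to the IP address where the services run and run the script from a guest.

```shell
#!/usr/bin/env bash
# HEAT_HOST is a hypothetical variable: the address where heat-api-cfn and
# heat-api-cloudwatch listen. It defaults to localhost for a dry run.
HEAT_HOST="${HEAT_HOST:-127.0.0.1}"

# check_port HOST PORT: hypothetical helper.
# Succeeds if a TCP connection to HOST:PORT can be opened within 3 seconds.
check_port() {
    local host="$1" port="$2"
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "open: ${host}:${port}"
    else
        echo "unreachable: ${host}:${port}"
        return 1
    fi
}

# 8000 and 8003 are the default ports named above.
for port in 8000 8003; do
    check_port "$HEAT_HOST" "$port" || true
done
```

A port reported as unreachable usually means either that the service is not running or that the firewall is still blocking it.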

Run the required services
For HA to work, you must be running the heat-api-cfn and heat-api-cloudwatch services in addition to heat-engine and, optionally, the heat-api service.

Create the HA example stack
There are two main examples of the Heat HA functionality: one demonstrates service-level HA, and the other demonstrates instance-level (heartbeat) HA.

See:

HA Example Template

IHA Example Template

heat stack-create ha -f ./templates/WordPress_Single_Instance_With_HA.template --parameters="InstanceType=m1.xlarge;DBUsername=${USER};DBPassword=verybadpass;KeyName=${USER}_key"
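The --parameters value in the command above is a semicolon-separated list of Key=Value pairs. With several parameters it can be easier to assemble that string in the shell. The following is a minimal sketch: join_params is a hypothetical helper, not part of the heat CLI, and the script only prints the resulting command so that it can be inspected before it is run.

```shell
#!/usr/bin/env bash
# join_params: hypothetical helper, not part of the heat CLI.
# Joins its arguments with ';', the separator used by --parameters above.
# Values that themselves contain ';' are not escaped.
join_params() {
    local IFS=';'
    printf '%s\n' "$*"
}

# Fall back to "id -un" if USER is not set in the environment.
user="${USER:-$(id -un)}"

params=$(join_params \
    "InstanceType=m1.xlarge" \
    "DBUsername=${user}" \
    "DBPassword=verybadpass" \
    "KeyName=${user}_key")

# Print the command instead of running it.
echo "heat stack-create ha -f ./templates/WordPress_Single_Instance_With_HA.template --parameters=\"${params}\""
```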

Test service restarting
SSH into the guest and test service restarting. If you kill the httpd service, it will be restarted automatically up to a maximum of 3 times, after which the instance will be rebuilt, demonstrating recovery escalation:

ssh ec2-user@10.0.0.2
sudo systemctl stop httpd.service
sudo systemctl status httpd.service

At most 1 minute later, the service should be running again.
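Rather than re-running the status command by hand, the restart can be detected by polling. The following is a minimal sketch: wait_until is a hypothetical helper, and the 90-second timeout in the example is an assumption that leaves some margin over the 1 minute mentioned above. It relies on systemctl is-active --quiet, which exits with status 0 only when the unit is active.

```shell
#!/usr/bin/env bash
# wait_until: hypothetical helper.
# Runs COMMAND every INTERVAL seconds until it succeeds or roughly TIMEOUT
# seconds have passed (the time spent running COMMAND itself is not counted).
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
    local timeout="$1" interval="$2" elapsed=0
    shift 2
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example, to be run inside the guest:
#   sudo systemctl stop httpd.service
#   wait_until 90 5 systemctl is-active --quiet httpd.service \
#       && echo "httpd was restarted" \
#       || echo "httpd is still down"
```

The example lines are left commented out so that sourcing the file only defines the function.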

Test Instance restarting
Using the IHA template, you can demonstrate instance-level HA. The easiest way to try this is to SSH into the guest and shut it down; after a short delay, Heat will rebuild the instance.

Confirm HA is working
In addition to observing the HA actions, you can examine the logs for data that shows the service failure:

sudo grep Http /var/log/heat/engine.log

2012-06-19 14:31:03   DEBUG [heat.engine.manager] new watch:HttpFailureAlarm data:{u'Namespace': u'system/linux', u'ServiceFailure': {u'Units': u'Counter', u'Value': 1}}
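To see how many failure reports the engine has received, the matching log lines can be counted rather than read. The following is a minimal sketch: count_watch_events is a hypothetical helper, not part of Heat, and it assumes the log format shown in the sample line above. The demonstration runs against a temporary copy of that sample line rather than the real log.

```shell
#!/usr/bin/env bash
# count_watch_events ALARM LOGFILE: hypothetical helper, not part of Heat.
# Prints how many "new watch:<ALARM>" lines appear in LOGFILE.
count_watch_events() {
    local alarm="$1" logfile="$2"
    # grep -c prints the count; it exits non-zero on zero matches, hence "|| true".
    grep -c -F "new watch:${alarm} " "$logfile" || true
}

# Demonstration on a temporary file holding the sample line from above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2012-06-19 14:31:03   DEBUG [heat.engine.manager] new watch:HttpFailureAlarm data:{u'Namespace': u'system/linux', u'ServiceFailure': {u'Units': u'Counter', u'Value': 1}}
EOF
count_watch_events HttpFailureAlarm "$sample"   # prints 1
rm -f "$sample"
```

Against a real installation, pass /var/log/heat/engine.log as the second argument and run the script as a user that can read that file.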

Also note that there is a CLI tool, heat-watch, which can be used to view Heat CloudWatch metrics, inject metric data, and forcibly set HA alarm states. This can be useful for testing. See Heat/Using-CloudWatch.