Difference between revisions of "Entropy"

Revision as of 23:40, 7 December 2013

Summary

Entropy is a framework for writing audit and repair scripts for OpenStack. It will allow writing cluster-check scripts and defining reactions to the errors and issues these scripts raise.

Entropy will allow developers to write health checkers without worrying about deployment, setting up Jenkins, integrating with an emailer, etc. It also allows the definition of "reaction" scripts that wait on issues and take well-defined actions (file a ticket, mark a hypervisor bad, etc.). This automates the first level of reaction to failures and keeps SEs from being inundated with emails about (probably) minor issues. A potentially more important use is to aggregate failures, notice trends in them, and develop a database of known failures that makes dealing with new ones easier.
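
The audit/reaction split described above can be illustrated with a small sketch. This is not Entropy's actual API: every name here (Issue, audit_hypervisors, mark_hypervisor_bad, run_cycle, the "hypervisor_down" label) is hypothetical, and the in-memory status dictionary stands in for whatever a real cluster check would query. It only shows the general shape: audits emit issues, reactions are looked up by issue kind, and a counter keeps a running tally that could later feed trend analysis.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Issue:
    """One problem reported by an audit script (hypothetical structure)."""
    kind: str    # category label, e.g. "hypervisor_down" (made up for this sketch)
    target: str  # the resource the issue concerns


def audit_hypervisors(status: Dict[str, bool]) -> List[Issue]:
    """Hypothetical audit: report every hypervisor whose status flag is False."""
    return [Issue("hypervisor_down", host)
            for host, up in sorted(status.items()) if not up]


def mark_hypervisor_bad(issue: Issue) -> str:
    """Hypothetical reaction: a real one might disable the host or file a ticket."""
    return f"marked {issue.target} bad"


def run_cycle(audits: List[Callable[[], List[Issue]]],
              reactions: Dict[str, Callable[[Issue], str]],
              history: Counter) -> List[str]:
    """Run each audit once, tally issues by kind, and dispatch matching reactions."""
    actions = []
    for audit in audits:
        for issue in audit():
            history[issue.kind] += 1          # aggregate failures for later trend analysis
            react = reactions.get(issue.kind)
            if react is not None:             # issues without a reaction are only counted
                actions.append(react(issue))
    return actions


# Example run against a fake status table (assumed data, not real cluster state).
status = {"hv-01": True, "hv-02": False, "hv-03": False}
history = Counter()
actions = run_cycle([lambda: audit_hypervisors(status)],
                    {"hypervisor_down": mark_hypervisor_bad},
                    history)
print(actions)
print(dict(history))
```

Keeping the tally separate from the reactions means an issue kind with no registered reaction is still recorded, which matches the stated goal of noticing trends even when no automatic action is taken.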

Older revision: 23:22, 7 December 2013, by Pranesh Pandurangan. Newer revision: 23:40, 7 December 2013, by Pranesh Pandurangan.