Latest revision as of 17:23, 28 May 2019

Theory of Auto-Scaling

General Description

<fill in: what is the scope of auto-scaling, how does it differ from self-healing, and what does it have in common with self-healing?>

Conceptual Diagram

Components of Auto-Scaling

OpenStack offers a rich set of services to build, manage, orchestrate, and provision a cloud. This gives administrators several options for how best to serve their customers' needs.

- Scaling units: the components that Auto-Scaling can control.
  - Compute Host
  - VM running on a Compute Host
  - Container running on a Compute Host
  - Network Attached Storage
  - Virtual Network Functions
- Monitoring Service: retrieves metrics either through an agent installed on the scaling unit or by polling.
  - Monasca (https://wiki.openstack.org/wiki/Monasca)
  - Ceilometer, from the Telemetry project (https://wiki.openstack.org/wiki/Telemetry)
  - Prometheus
- Alarming Service
  - Monasca, which has a built-in alarm thresholding service and notification service
  - Aodh, from the Telemetry project (https://wiki.openstack.org/wiki/Telemetry)
- Decision Services: OpenStack services that interpret metrics and alarms according to configured logic and issue commands to Orchestration Engines.
  - Congress
  - Heat
  - Vitrage
  - Watcher
- Orchestration Engines
  - Heat
  - Senlin, a clustering engine for OpenStack that can orchestrate auto-scaling
  - Tacker (https://wiki.openstack.org/wiki/Tacker)
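
The flow between these components can be sketched in a few lines of Python. This is only an illustration of the roles described above, not the API of any OpenStack service: the names `Alarm`, `evaluate_alarm`, `decide`, and `Orchestrator`, the `cpu_util` metric, and the 80% threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alarm:
    """Raised by the (hypothetical) alarming service when a threshold is crossed."""
    metric: str
    value: float
    threshold: float


def evaluate_alarm(metric: str, samples: List[float], threshold: float) -> Optional[Alarm]:
    """Alarming service: fire when the mean of the samples exceeds the threshold."""
    if not samples:
        return None
    mean = sum(samples) / len(samples)
    return Alarm(metric, mean, threshold) if mean > threshold else None


def decide(alarm: Optional[Alarm]) -> int:
    """Decision service: turn an alarm into a scaling command (+1) or a no-op (0)."""
    return 1 if alarm is not None else 0


class Orchestrator:
    """Orchestration engine: applies scaling commands to the scaling units."""

    def __init__(self, instances: int) -> None:
        self.instances = instances

    def apply(self, delta: int) -> None:
        self.instances += delta


# Monitoring service output: assumed CPU utilisation samples, in percent.
cpu_samples = [82.0, 91.0, 88.0]

orchestrator = Orchestrator(instances=2)
alarm = evaluate_alarm("cpu_util", cpu_samples, threshold=80.0)
orchestrator.apply(decide(alarm))
print(orchestrator.instances)
```

In a real deployment each function would be a separate service, which is why it matters to know where the threshold and decision logic actually live.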

Considerations and Guidelines

- Monitoring consumes resources; plan accordingly.
- Avoid scaling too quickly or too often.
  - Specify appropriate cooldown periods.
  - Average the scaling metric over a longer time period to avoid reacting to sudden fluctuations.
- Don't expect instantaneous scaling (see above).
  - Define thresholds that predict scaling needs rather than react to a bad state.
- Be aware of where the scaling logic lives (alarm thresholds, decision services).
- Define appropriate scaling limits in terms of minimum and maximum instances.
  - A minimum number of instances prevents all instances from being removed.
  - A maximum number of instances safeguards against provisioning so many resources that other workloads are adversely affected.
- Applications must be horizontally scalable in order to auto-scale the underlying instances.
  - Applications must be stateless, or able to drain existing stateful connections, so that the underlying instances can be removed during a scale-down.
  - Incoming requests must be dynamically load balanced among the instances running the application.
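
Several of these guidelines (cooldown periods, averaging the metric, and minimum/maximum limits) can be combined in one small decision function. The sketch below is a minimal illustration under assumed values, not how Heat, Senlin, or any other service implements scaling; the class name `ScalingPolicy`, its parameters, and the sample readings are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional


class ScalingPolicy:
    """Hypothetical policy: averaged metric, cooldown, and min/max limits."""

    def __init__(self, min_instances: int, max_instances: int,
                 cooldown_s: float, window: int,
                 high: float, low: float) -> None:
        self.min_instances = min_instances
        self.max_instances = max_instances
        self.cooldown_s = cooldown_s
        self.window = window
        self.high = high
        self.low = low
        self.samples: Deque[float] = deque(maxlen=window)
        self.last_scaled_at: Optional[float] = None

    def observe(self, value: float, now: float, current: int) -> int:
        """Record one sample and return the desired instance count."""
        self.samples.append(value)
        # Averaging: do nothing until a full window of samples is available.
        if len(self.samples) < self.window:
            return current
        # Cooldown: ignore the metric for a while after the last scaling action.
        if (self.last_scaled_at is not None
                and now - self.last_scaled_at < self.cooldown_s):
            return current
        average = sum(self.samples) / len(self.samples)
        if average > self.high:
            desired = current + 1
        elif average < self.low:
            desired = current - 1
        else:
            desired = current
        # Limits: never go below the minimum or above the maximum.
        desired = max(self.min_instances, min(self.max_instances, desired))
        if desired != current:
            self.last_scaled_at = now
        return desired


policy = ScalingPolicy(min_instances=1, max_instances=3, cooldown_s=300,
                       window=3, high=80.0, low=20.0)

# Assumed (timestamp in seconds, CPU percent) readings.
readings = [(0, 95.0), (60, 90.0), (120, 85.0), (180, 95.0), (480, 95.0)]

count = 2
history: List[int] = []
for now, value in readings:
    count = policy.observe(value, now, count)
    history.append(count)
print(history)
```

With these readings the policy waits for a full window before scaling out, ignores the reading that arrives during the cooldown, and refuses the final scale-out because the maximum has already been reached.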

Removed in this revision

The earlier revision followed the diagram image (File:OpenStack-Auto-Scaling.svg, "Auto-Scaling Architecture Component Diagram") with a PlantUML version:

@startuml
cloud Cloud\n {
  rectangle host as "Host" {
  }
  rectangle host2 as "Host" {
    agent VM
    agent VM2 as "VM"
    agent Container
    agent Container2 as "Container"
  }
}

agent MS as "Monitoring Service"
agent DS as "Decision Services\n(Clustering,\nOptimization,\nRoot Cause)"
agent Heat as "Orchestration \nEngine"

host -down-> MS
VM -down-> MS
Container -down-> MS : "Metric \nSamples"

MS -down-> DS : "Alarms"
MS -down-> Heat : "Alarms"

DS -right-> Heat : "Scaling Commands"
Heat -up-> host : "Orchestration"
Heat -up-> VM2 : "Orchestration"
Heat -up-> Container2 : "Orchestration"
@enduml

Difference between revisions of "Auto-scaling SIG/Theory of Auto-Scaling"
