Monasca/Transform-proposal

This is the original proposal for Monasca Transform circa Openstack Mitaka. Updated documentation can be found at docs.openstack.org and wiki.openstack.org/wiki/Monasca/Transform.

monasca-transform: A transformation and aggregation engine for Monasca
https://blueprints.launchpad.net/monasca/+spec/monasca-transform


 * The monasca-transform project is a new component in Monasca that will aggregate and transform metrics.
 * monasca-transform is a data driven aggregation engine which collects, groups and aggregates existing individual Monasca metrics according to business requirements and publishes new transformed (derived) metrics to the Monasca Kafka queue.
 * Since the new transformed metrics are published as any other metric in Monasca, alarms can be set and triggered on the transformed metric, just like any other metric.

Problem description
Currently there is no mechanism to aggregate/transform/derive and publish new metrics in Monasca. There are several use cases which can be addressed by transforming, aggregating and publishing new metrics in Monasca: Metrics that are currently available on a per host or per VM basis, for example mem.total_mb or vm.mem.used_mb, might have to be added up for all hosts or VMs so that cloud operator can know the the total memory available across all hosts and total memory being allocated to all VMs for the entire cloud. There might be a need to derive metrics from one or more metrics, for example cpu.total_logical_cores and cpu.idle_perc are two metrics for a host which can be used to derive a new metric which will give total utilized CPUs. Also these per host utilized CPUs can be further summed up to find the number of CPUs being utilized across the entire cloud. It might be necessary to calculate the rate at which a metric value is changing over time. For example, to find the rate at which swift disk capacity is increasing over a time range, one would have to find the value of the disk usage at the start of the time interval (say start of the hour) then find the disk usage at the end of the time interval (say end of the hour) in order to determine the rate at which the disk usage increased or decreased during that hour. There might be a need to apply complex business logic on a set of metrics and derive a new metric. For example consider the periodic heart beat metrics which come from a VM indicating what state the VM was in. One might have to look at how the state of the VM changed over the target time interval and then use some business logic come up with a new metric which would indicate the amount of time that the VM was in a usable state.
 * Aggregate individual metrics
 * Combine multiple metrics to derive a new metric
 * Find the rate at which a metric changed over a time range
 * Deriving metrics using complex business logic

Proposed Change
The proposed monasca-transform component framework will fit into the microservices architecture of Monasca. At its core, it will collect a set of metrics from Apache Kafka[1] over a configurable time period, then using grouping operations, run a series of transformation (or aggregation) operations which can consist of simple mathematical operations like sum/max/min/avg or complex operations such as iterating through group using some business rules. The resulting transformed metric or set of transformed metrics will be published back to Apache Kafka.

There are some design considerations that we found to be essential from our past experience when building a similar aggregation framework to process vast amounts of raw log records from public cloud services:

Processing of metrics has to be scalable since we will have to collect, aggregate and publish vast amounts of metrics. The processing of data should be able to withstand any hardware failures such as CPU, memory or disk errors. The monasca-transform process should be Highly Available. In case the monasca-transform daemon process goes down on a server because of any software or hardware problem the framework should start processing metrics on a another server. The transformation framework has to be data driven and it should be possible to add new transformations for any new metrics by configuration changes. The transformation requirements of various metrics can be similar, so once a transformation routine has been implemented it should be possible to re-use the same transformation components and routines for a different set of metrics. Adding new transformation components should be easy and straightforward. Over a period of time business requirements and needs can change. To meet those requirements new transformation components will have to be developed and added or existing ones updated. It should be possible to add and update new components with minimum effort. Though the transformation routines can be complex, it should be possible to write unit tests for each individual transformation component to be confident of the accuracy of the component.
 * Scalable, Fault Tolerant and Highly Available
 * Data driven transformations
 * Reusable components and reusable transformation routines
 * Easy to add components
 * Flexible to change and adapt
 * Unit Testing

Alternatives
To avoid re-inventing the wheel it is desirable to use a framework or tool that provides the ability to process data across a cluster of commodity machines and that features work distribution, fault tolerance and the capability to seamlessly withstand node failures.

Hadoop with MapReduce/Yarn satisfies these requirements but requiring some kind of stable storage, HDFS and relying on temporary files to save intermediate processing states, they can be slow. They also require a number of components to be configured, installed, monitored and maintained across a cluster which is often a challenging and daunting task. Also it's highly likely that other associated tools like Hive (for sql-like processing of data) and Pig (which would faciliate the expression map reduce jobs as high level constructs) would be entailed. Using additional tools adds to the resource utilization burden as well as increasing development time.

Spark allows jobs to be expressed in Python consequently siting well with the openstack toolset. It has demonstrated much higher processing rates than Hadoop based tools yet allows existing domain knowledge of mapreduce type tools to be re-purposed. Spark is gaining momentum and adoption generally and this upward trajectory can be capitalised upon. It also provides support for streaming data, again meaning that we can concentrate on the task at hand rather than plumbing. For these reasons it has been selected to provide the processing framework.

Architecture
monasca-transform will be a data driven data processing framework implemented in python and based on Apache Spark[2]. Apache Spark[2] is a highly scalable, fast, in-memory, fault tolerant and parallel data processing framework. All monasca-transform components will be implemented in python and will use Spark's Python API to interact with Spark.

To process data, monasca-transform components will make use of Spark's transformation operations like groupByKey, filter, map, reduceByKey on Spark's Resilient Distributed Datasets (also commonly referred to as RDDs). RDD is Spark's basic abstraction for distributed memory.

monasca-transform will use the Spark Steaming direct API to retrieve Monasca metrics from the Kafka queue. monasca-transform will use Spark Data Frames for processing whenever possible. Spark Data Frames use a new query planner and analyzer called Catalyst and is more efficient and optimized than comparable RDD operations.

Data Frames provide a more SQL-like data manipulation capability which also makes it easy to use from a development point of view. It is possible to convert from a Spark RDD to a Data Frame and vice versa. Also, Spark provides a common equivalent API for manipulating both an RDD as well as a Data Frame.

In addition to Apache Spark, monasca-transform will also rely on MySQL database tables to store processing and configuration information:


 * processing information: Kafka offsets from Spark batch, used to resume processing after a failure or a restart
 * configuration information: Runtime parameters and transformation specifications, used to drive transformation and aggregation of metrics

It will be possible to run multiple monasca-transform processes on multiple nodes. Multiple monasca-transform processes will use zookeeper's leader election capability to elect a master. In case a node on which monasca-transform process is running as a master goes down, another monasca-transform process running on an another node will be elected a leader and start processing data.

Here are the components of monasca-transform:



We are proposing to deploy Spark cluster in standalone cluster mode alongside other Monasca components. In the minimal configuration, say for example in a devstack deployment, both master and worker can be deployed on a single node. Spark can be also deployed on multiple nodes, with multiple masters and workers running on each of the nodes.

Logical processing data flow
The logical data flow within monasca-transform can be broadly divided into two distinct flows, namely conversion of input metrics into a record store format and transforming of the record store data using series of generic transformation components to derive transformed metric data.

Conversion to record store format


Conversion of input metrics into the record store format consists of following steps:

Identification: Identify and filter out unnecessary metrics. This is done by comparing input metric names with a list of metrics to be transformed in the configuration datastore. If an input metric is not recognized, the metric is ignored in further processing.

Validation and Field Extraction: Input data is validated by checking for presence of expected fields in the metric JSON (expected fields are retrieved from the configuration datastore). Assuming the format is valid, the relevant fields are extracted.

Record Generator: Extracted data is then used to generate one or more new internal processing metrics and the associated metadata, this is referred to as Record Store Data. Multiple internal metrics can be generated in the case where the input raw metric has to be aggregated in more than one way.

An advantage of converting the input metrics into internal record store data is that the rest of the processing does not depend on the input metric format. In the future if there is a need to transform and aggregate data coming in new format, new code will have to written to convert the new data format into record store, but the rest of the processing pipeline will be unchanged.

Generic transformation (aggregation) to derive and publish new metrics


This is the second half of the processing data flow. Record store data is routed to the appropriate processing component based on internal metric type and specifications in the configuration datastore. Additional processing is then performed. Processing includes grouping the data in the record store with various user-defined parameters, setting of the final transformed metric name, and setting appropriate dimensions in the new transformed metric. All these operations are driven from configuration parameters. The resulting new transformed metrics are then published back to Apache Kafka.

Data model design


pre_transform_specifications and transform_specifications tables are used in conversion of input metrics into record store format and conversion of record store data into final transformed metrics to be published to Apache Kafka.

kafka_offsets table stores the offset information retrieved by Spark streaming direct Kafka API, when it starts processing a batch. Only if the processing data in a batch succeeds, the latest offset information is updated in kafka_offset table. The last saved offset information is used by the Spark driver on initialization to start processing from where it left off.


 * pre_transform_specifications table
 * event_type: name of the metric to process
 * pre_transform_spec: Specification in JSON format to aid in processing of data to record store format. The JSON specification consists of following fields:
 * event_type: name of the metric to
 * event_processing_params: key-value pairs dictionary of parameters to aid in conversion of input metric to record store format. Examples include set_default_region
 * intermediate_id_list: list of internal metrics, represented by their identifiers, that the input metric should be converted into. During record store generation script multiple internal metrics will be generated from the same input metric. Each internal metric can be transformed with a different processing pipeline.
 * required_raw_fields_list: list of JSON path strings, that will be used during validation to check if they exist and are not empty.
 * service_id: service identifier, identifies which service this input metric belongs to.


 * transform_specs table
 * metric_id: Identifier representing internal processing metric
 * transform_spec: Specification in JSON format which defines the processing pipeline and runtime parameters to be used by processing pipeline components. The JSON specification consists of following fields:
 * aggregation_params_map: key-value pairs which represent the processing pipeline and runtime parameters.
 * aggregation_pipeline: JSON representation of processing pipeline specification and any run time parameters used by components in processing pipeline e.g. aggregation_group_by_list
 * kafka_offsets table
 * topic: Kafka topic for which this offsets are being collected and stored
 * until_offset: end of the offset range
 * from_offset: start of the offset range
 * app_name: name of the application, on behalf of which these offsets are being stored.
 * partition: partition identifier

Design of generic transformation components
Generic transformation components will be reusable components that can be assembled into a complex transformation pipeline. Each generic component will be implemented as a python class and can invoke multiple Spark transformations. Generic components of a same type will implement functions which take standard set of arguments and return a standard Spark RDD. Source, Usage, Setter and Insert are few generic component types.

All usage components will implement a python function called 'usage' which takes 'transform_context' and 'record_store_dataframe' as arguments and returns an 'instance_usage_dataframe'. Similarly all setter components will implement a function called 'setter' which takes 'transform_context' and 'instance_usage_dataframe' as arguments and returns a modified 'instance_usage_dataframe. Component interfaces are standardized by input and output type to allow plug-ability.

Generic transformation components are extensible and new components can be added when necessary. For example if there is a need for some transformation routine to lookup a value by making a REST API call, a new setter component can be developed which implements the standard "setter" function interface. Since the new setter component adheres to the Setter component interface, it is possible to add this new component after or before any previous setter component. Complex transformation pipelines can be built by using series of these pluggable components via the transformation specification.

Generic transform components are easy to write and maintain. Besides being re-usable, these generic transform components can be combined to create complex transformation routines or pipelines. Unit tests in python can be written to test each component individually and also to test a complex transformation routines.

Example transformation of input metric
Following table lists transformation steps for a input metric

REST API impact
There will be no change to current Monasca API.

Security impact
monasca-transform uses Spark Streaming to connect to the "metrics" topic in Kafka to retrieve metrics for processing. monasca-transform will write transformed metrics back to Kafka on behalf of the tenant which can be set in a configuration file. The tenant should have appropriate monasca-admin role so that the data gets persisted in the Monasca datastore. The Spark Web user interface and connections between master and worker nodes can be secured using ACL and encryption[4]. Connection to Kafka and encryption is not yet supported, but work is currently underway to provide such a support in Kafka[4].

Other end user impact
New transformed metrics will be available in Monasca via Monasca API.

Performance/Scalability Impacts

 * Since new transformed metrics will be written back to the "metrics" topic in Kafka, it will have some impact in terms of increasing the metrics to be persisted. This should not have any significant impact as transformed/aggregated metrics would necessarily be significantly fewer in number compared to the overall number of incoming metrics being persisted.
 * Given that transformed metrics will be significantly fewer in number they should not have any appreciable impact on data retention policy.
 * Spark master and worker will have to be configured and installed on nodes where monasca-transform will be deployed (most likely alongside other Monasca components though it's not necessary) and will have to be allocated resources like CPU and Memory for the Spark master, worker and executor processes. Also, monasca-transform python components will have to be configured and deployed. Deploying Spark and monasca-transform will have some impact in terms of increasing load on nodes where it runs. This impact will have to be evaluated.
 * In terms of scalability, Spark worker nodes can be scaled horizontally across additional nodes in case the amount of metrics that have to be processed increases.

Other deployer impact
monasca-transform will have to be configured via configuration file(s). Configuration options include pointers to Zookeeper, batching interval used by Spark Streaming, tenant id on behalf of which metrics will be submitted to Kafka and Spark Master information so that the monasca-transform can submit jobs to Spark cluster. There will also be configuration options to access the MySQL database to pull in information from the driver tables.

In addition, the deployer which installs monasca-transform will have to install start/stop scripts for the Spark worker, Spark master and monasca-transform processes. The deployer will also have to insert pre-transformation specifications and transformation specifications in the MySQL driver tables.

monasca-transform will be an optional component in Monasca and users can elect to either install it or not. A devstack plugin is being authored which will provide the means to install monasca-transform and Spark and also a tempest test. This will enable verification that monasca-transform is working. This can be used in CI/CD processing.

Developer impact
Developers who want to write their own transformation routines for aggregating new metrics will have to write new pre-transformation specifications in the transformation specifications JSON and add them to driver tables. To add new transformation components, the developers have to write python modules which implement the appropriate interface such as usage, setter, insert, etc. and use Spark transformations to manipulate and/or write data back to Kafka. New components will also have to be added to the setup.cfg so that new components can be loaded and utilized by monasca-transform.

Assignee(s)
Primary assignee:

TBD

Other contributors:

TBD

Ongoing maintainer:

TBD

Work Items

 * Spark driver program to connect to Kafka and retrieve metrics, retrieve offsets and store offsets in MySQL db
 * Define pre-transform specifications and transform specifications tables and code to populate those two tables
 * Define pre-transform specifications and transform specifications JSON
 * Identify, Validate and Transform input metrics into record store records
 * Generic Transformation builder which will invoke transformation components for a pipeline
 * Implement generic transformation usage components and unit tests for fetch_quantity, fetch_quantity_util and fetch_quantity_rate_of_change
 * Implement generic transformation setter components and unit tests for rollup_quantity, set_aggregated_metric_name
 * Implement insert Kafka component to write aggregated metrics to Kafka
 * Implement more transformation specs to transform more metrics

Future lifecycle
Phase 1: process streaming data: Process incoming metric data that is coming in via the Kafka queue for specified time interval

Phase 2: process streaming data+persisted data: Process incoming data via Kafka and data already persisted and available in Monasca data store This would require implementing Spark connector/plugin which can talk to Monasca (along the lines of how Spark is integrated with cassandra, hbase)

Dependencies
Apache Spark 1.6.0

Zookeeper

Testing
Python unit tests will be written to test each component of the overall transformation pipeline individually, as well as to test a part of or whole transformation pipeline. A devstack plugin will be developed to install monasca-transform and Spark. Also a tempest test will have to written to test the working of monasca-transform component which can used in CI/CD process. This test will configure the transform specifications tables and then run transformation for some incoming metrics and check for the presence of new transformed metrics in Monasca.

Documentation Impact
monasca-transform is a new component and will have to be documented in detail on the Monasca Wiki. The architecture, typical deployment, configuration options and dependencies will have to be documented extensively.