Monasca/Logging/Query API Design

Background
Currently, monasca-log-api provides a method for writing logs, but not reading. Reads are performed directly on Elasticsearch. Multi-tenant isolation is provided by a Kibana plugin, which directly accesses the Elasticsearch instance.

Given that Kibana is by far the most popular tool for analyzing logs stored in Elasticsearch, this is a very sensible solution. Additionally, even if a read API were to exist, Kibana would not use it, because it is tightly coupled to Elasticsearch and (AFAIK) does not support pluggable data sources. There is a fork of Kibana which attempts this, Kibi, but it appears to be focused on SQL databases.

Use Case
The use case in mind, however, is to allow logs to be visualized by tools other than Kibana, particularly Grafana, while also opening the door to bespoke tools which access the API via a Python client (or the command line). The requirements for this are quite minimal: they are currently limited to what can be achieved using the Elasticsearch data source in Grafana, which just allows a query string to be specified (as it would be written in Kibana) for either raw document retrieval or aggregation (counting).

Currently, this can be achieved by directly querying Elasticsearch, but this will not be suitable for multi-tenant systems as the logs from every tenant would be available to every other tenant. A workaround might be to give each tenant their own Elasticsearch instance.

Making an interface for accessing logs available to the command line tool would certainly be of use to operators.

Scope
I do not think there is much value, at least initially, in trying to replicate the full Elasticsearch API for Kibana to consume, as there is already a workable multi-tenant solution for providing access to the logs via Kibana if required. As mentioned before, Kibana would not be able to use the API regardless. There was initially an idea to proxy the Elasticsearch API, but this was not developed further.

I think there is value however in providing a cut down API to perform simple searches for common use cases, to be consumed by Grafana datasource or monasca-client. I think any work done here should complement the existing work in Monasca, not aim to replace it.

Initial Potential Design Path
Mimic the Metrics API to allow simple searching of logs based on a conjunction of dimension values (e.g. type=syslog,program=sshd) and a time range. This has the advantage of consistency, and the Grafana plugin (for metrics) could easily be extended to find logs (or events grok-ed out by Logstash). The API would work in a similar manner to the Metrics API, and only access the Elasticsearch index of the authorized tenant.

Some potential queries (using an imaginary CLI for example purposes):

monasca log-list --dimensions severity=error --limit 5 -10
monasca log-list --dimensions hostname=novacompute-042,severity=error --limit 10 -5
monasca log-list --dimensions type=syslog,program=sshd -30
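To make the semantics of these imaginary commands a little more concrete, the following sketch parses a `--dimensions` argument into name/value pairs and turns the trailing negative number into a start time. It assumes that the trailing number is an offset in minutes before now, as with the existing metrics CLI; `parse_dimensions` and `relative_start_time` are hypothetical helper names, not part of any existing client.

```python
from datetime import datetime, timedelta, timezone


def parse_dimensions(arg):
    """Parse 'name=value,name=value' into a dict (hypothetical helper)."""
    dimensions = {}
    for pair in arg.split(","):
        name, sep, value = pair.partition("=")
        if not sep or not name.strip():
            raise ValueError("expected name=value, got %r" % pair)
        dimensions[name.strip()] = value.strip()
    return dimensions


def relative_start_time(offset_minutes, now=None):
    """Turn an offset such as -10 into an absolute UTC start time.

    Assumption: a negative number means 'this many minutes before now'.
    """
    now = now or datetime.now(timezone.utc)
    return now + timedelta(minutes=offset_minutes)


if __name__ == "__main__":
    print(parse_dimensions("hostname=novacompute-042,severity=error"))
    print(relative_start_time(-5).isoformat())
```

All dimensions given in one command are combined with AND, which is what keeps the interface as simple as the metrics equivalent.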

The available dimensions would of course depend on how the user has configured their log storage pipeline. For example, more granular searches are possible for logs which have been processed by suitable Logstash filters. A common use case is to have a rule for parsing SSH logins, which could add dimensions such as "event=ssh.reject,username=foo". In this case, these events could easily be searched for and used as Grafana annotations:

monasca log-list --dimensions event=ssh.reject -30

A compelling aspect of this approach is that it is almost identical to the Metrics API, so users will get a strong sense of consistency. The CLI, and the UI for creating queries in Grafana, would be very familiar. Users without prior knowledge of Elasticsearch/Kibana, but perhaps with some familiarity with Monasca Metrics, would also benefit from this.

Log Statistics
In keeping with the Metrics API, an additional API for logs could expose statistics (e.g. "log-statistics"), driven from the log storage rather than the metrics storage. The use case for this would be showing log activity in Grafana.

monasca log-count --dimensions type=syslog,severity=error -10
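One way such a count could be served from the log storage is an Elasticsearch date histogram aggregation, which counts the matching documents per time bucket. The sketch below only builds the request body and is an illustration rather than a proposed implementation: the function name, the `log.dimensions.` prefix and the `@timestamp` field are assumptions, and `interval` is the parameter name used by older Elasticsearch releases (newer releases use `fixed_interval` instead).

```python
import json


def build_count_query(dimensions, start_time, end_time, interval="1m"):
    """Build a body that counts matching logs per time bucket.

    dimensions maps a dimension name to a list of accepted values.
    """
    filters = [
        {"terms": {"log.dimensions." + name: list(values)}}
        for name, values in sorted(dimensions.items())
    ]
    filters.append(
        {"range": {"@timestamp": {"gte": start_time, "lte": end_time}}}
    )
    return {
        "size": 0,  # only the bucket counts are wanted, not the documents
        "query": {"bool": {"filter": filters}},
        "aggs": {
            "log_count": {
                "date_histogram": {"field": "@timestamp", "interval": interval}
            }
        },
    }


if __name__ == "__main__":
    body = build_count_query(
        {"type": ["syslog"], "severity": ["error"]},
        "2017-01-01T00:00:00Z",
        "2017-01-01T00:10:00Z",
    )
    print(json.dumps(body, indent=2))
```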

Extended Design Path
There are of course limitations to the breadth of searches which can be performed with the above API, as it is limited to a conjunction of dimension values (wildcards could be considered). For more complex searches, there is a good chance the user will want to use Kibana (with the multi-tenancy plugin). It might nevertheless be desirable to allow a wider range of queries to be performed via the API, for visualizing in Grafana or viewing via the CLI. In this case, a second API could exist which simply passes a query string through to Elasticsearch, similar to how a search would be entered into Kibana, e.g.

monasca log-search "(connect AND program:sshd) OR event:ssh.accept" -10
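As a rough illustration of what "passing through" could mean, the sketch below wraps the user's string in an Elasticsearch `query_string` query and keeps the time range as a filter. The function name and the `@timestamp` field are assumptions, and the only validation shown is rejecting an empty string; how much more is needed is exactly the open question raised in this section. The resulting body would still only be sent to the index of the authorized tenant.

```python
def build_search_query(query_string, start_time, end_time, limit=10):
    """Build a search body from a Kibana-style query string (sketch)."""
    if not query_string or not query_string.strip():
        raise ValueError("query string must not be empty")
    return {
        "size": limit,
        "sort": [{"@timestamp": "desc"}],
        "query": {
            "bool": {
                # The user-supplied text is scored as a normal query ...
                "must": [{"query_string": {"query": query_string}}],
                # ... while the time range is applied as an exact filter.
                "filter": [
                    {"range": {"@timestamp": {"gte": start_time,
                                              "lte": end_time}}}
                ],
            }
        },
    }


if __name__ == "__main__":
    print(build_search_query(
        "(connect AND program:sshd) OR event:ssh.accept",
        "2017-01-01T00:00:00Z",
        "2017-01-01T00:10:00Z",
    ))
```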

There are questions here about the scope of this API, whether code from other OpenStack projects such as Searchlight could be re-used, and how much validation (if any) would need to be performed on the query strings.

Outstanding Issues

 * Should monasca-grafana-datasource be extended or forked (similar to how monasca-log-api is separate from monasca-api)?
 * Should python-monascaclient be extended or forked?

The decision taken for both of these points is to integrate the functionality into the existing projects rather than fork, for the sake of a better user experience and reduced code duplication and maintenance overhead.


 * Should we provide dimension names and dimension values listing endpoints, as with metrics (would be useful for Grafana templating)?

Given how useful the equivalent interfaces have been for the Metrics API, these would be desirable. They will be considered as a separate piece of development work from the initial "log list" interface.


 * How to interact with the proposed log attributes bp:publish-logs-to-topic-selectively?
 * Is an additional dependency on elasticsearch acceptable (alternatively just use raw HTTP requests, as the API is quite straightforward)?
 * How configurable should the Elasticsearch queries be (e.g. w.r.t. message/dimensions hierarchy)?
 * Is it time to create a Monasca mapping template for Elasticsearch (needed to disable tokenization on dimensions for reliable term queries)?

Overview
The intention of this API is to provide the ability to visualize log data in Grafana or other tools. It is intended to be a precise mechanism for obtaining logs or events, based on dimensions either provided by the logging subsystem (e.g. rsyslog), or parsed from log messages using Logstash or similar tools. The API will aim to be as consistent as possible with the Metrics service, and will largely follow the Metric Measurements API. Implementing this capability fully will span the following components:


 * monasca-log-api - Extend the API with methods to obtain log listing based on a time range and matching a set of dimensions, similar to how metric measurements can be obtained (without ‘metric name’). Requests will be performed on the Elasticsearch index of the requested project/tenant, but only if authorized to do so, yielding necessary isolation.
 * monasca-log-persister - It may be necessary to modify the log persister configuration in order to store log data in Elasticsearch in a consistent structure, and impose a mapping template for certain fields for search purposes.
 * monasca-grafana-datasource - Extend the datasource with the ability to also obtain log entries from the log API. This would re-use lots of the existing functionality for obtaining metrics, but return entries as JSON, similar to the Elasticsearch datasource, so they can be displayed in tabular form. We ideally also want to support log entries as annotations.
 * python-monascaclient - Optional piece of work to extend the Python client and command line interface to access the log API. This might be especially useful to operators who prefer command line tools.

API Design and Rationale
The API will be available in this document in the monasca-log-api Git repository:

Monasca Log API

The proposed changes currently exist as a pull-request:

Specification for log listing API

The query API will be added to v3.0, given that it extends the API with new endpoints and does not modify existing ones.

The choice of endpoint is based on the thinking that there may be a number of ways to access logs in potential future designs, such as:


 * GET /v3.0/logs - Log listing based on simple filtering by dimensions
 * GET /v3.0/logs/metrics - Log metrics (counting) based on same criteria as above
 * GET /v3.0/logs/search - Log search based on message contents using arbitrary ES query string
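From a client's point of view, a call to the first of these endpoints might be assembled as in the sketch below. It assumes that the query parameters mirror the Metrics API (`dimensions` serialized as `name:value` pairs joined by commas, with `|` between alternative values, plus `start_time`, `end_time` and `limit`); the base URL and the function name are purely illustrative.

```python
from urllib.parse import urlencode


def build_logs_url(base_url, dimensions, start_time, end_time=None, limit=None):
    """Build a GET /v3.0/logs URL (hypothetical client helper).

    dimensions maps a dimension name to a list of accepted values.
    """
    params = {
        "dimensions": ",".join(
            "%s:%s" % (name, "|".join(values))
            for name, values in sorted(dimensions.items())
        ),
        "start_time": start_time,
    }
    # Optional parameters are only sent when the caller supplied them.
    if end_time is not None:
        params["end_time"] = end_time
    if limit is not None:
        params["limit"] = limit
    return "%s/v3.0/logs?%s" % (base_url.rstrip("/"), urlencode(params))


if __name__ == "__main__":
    print(build_logs_url(
        "http://log-api.example:5607",
        {"hostname": ["novacompute-042"], "severity": ["error", "critical"]},
        "2017-01-01T00:00:00Z",
        limit=10,
    ))
```

The request would carry the Keystone token in a header as for any other Monasca call, so the tenant never appears in the URL.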

Elasticsearch Query
Logs are currently stored in different Elasticsearch indices for each project/tenant. Therefore log listing requests will be forwarded to the necessary index, assuming that the request has the necessary authorization (validated using the supplied Keystone token).

The requests will map roughly as follows into Elasticsearch query concepts:


 * tenant_id - Index pattern to query, e.g. /monasca--*/_search
 * dimensions - For each dimension passed in the request, a terms query will be used against the necessary field, in order to match against the exact values passed. This allows us to pass a list of values, and return documents which match any of the values, as per the API specification. In order to match documents precisely, not subject to any scoring heuristic, all terms will be inside a filter context.
 * start_time/end_time - Will be applied with a greater-than-or-equal (gte) and less-than-or-equal (lte) range term-level query on the configured timestamp field.

In addition, sorting will be applied as specified, or by default in most-recent-first order (timestamp desc).

GET /-*/_search
{
  "from": ,
  "size": ,
  "sort": [ { "@timestamp": "desc" } ],
  "query": {
    "bool": {
      "filter": [
        { "terms": { "log.dimensions.": ["", "", ...] } },
        { "terms": { "log.dimensions.": ["", "", ...] } },
        ...
        { "range": { "@timestamp": { "gte": "", "lte": "" } } }
      ]
    }
  }
}
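The mapping described above can be sketched in Python as a function which builds the request body from the API parameters. This is a minimal illustration, not the implementation: the function and parameter names are assumptions, as are the `log.dimensions.` prefix and the `@timestamp` field. The filter clauses are placed inside a `bool` query, which is where Elasticsearch accepts a list of filters.

```python
import json


def build_log_query(dimensions, start_time, end_time,
                    offset=0, limit=10, descending=True):
    """Build an Elasticsearch search body for a log listing request.

    dimensions maps a dimension name to the list of values to accept;
    a document must match one value for every dimension given.
    """
    # One terms clause per dimension: values are ORed, clauses are ANDed.
    filters = [
        {"terms": {"log.dimensions." + name: list(values)}}
        for name, values in sorted(dimensions.items())
    ]
    filters.append(
        {"range": {"@timestamp": {"gte": start_time, "lte": end_time}}}
    )
    return {
        "from": offset,
        "size": limit,
        "sort": [{"@timestamp": "desc" if descending else "asc"}],
        # Filter context only: exact matching, no relevance scoring.
        "query": {"bool": {"filter": filters}},
    }


if __name__ == "__main__":
    body = build_log_query(
        {"hostname": ["dev-01", "staging-01"], "severity": ["error"]},
        "2017-01-01T00:00:00Z",
        "2017-01-01T01:00:00Z",
        limit=5,
    )
    print(json.dumps(body, indent=2))
```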

Using timestamp to perform paging is not ideal as we could miss log entries with identical timestamps. It may be more robust to use the implicit _id field instead.

Given that the way logs are stored in Elasticsearch is highly subjective, it may be necessary to make the index pattern and the field path for each term configurable, in order to accommodate different strategies. For example, the following could be configurable:


 * Index Pattern - e.g. "{project_id}-*" (if using mapping template, might want to be "monasca-{project_id}-*")
 * Timestamp Field Path - e.g. "@timestamp"
 * Message Field Path - e.g. "message" or "log.message"
 * Dimension Fields Path - e.g. "{name}.raw" or "log.dimensions.{name}"
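A small sketch of how such templates might be resolved at request time is shown below. The class and method names are hypothetical, and the check on the project ID rests on the assumption that project IDs are plain alphanumeric strings; rejecting anything else keeps wildcards and separators out of the index pattern.

```python
class FieldPaths:
    """Holds the configurable templates and resolves them (sketch)."""

    def __init__(self, index_pattern="{project_id}-*",
                 timestamp_field="@timestamp",
                 message_field="message",
                 dimension_field="log.dimensions.{name}"):
        self.index_pattern = index_pattern
        self.timestamp_field = timestamp_field
        self.message_field = message_field
        self.dimension_field = dimension_field

    def index_for(self, project_id):
        # Assumption: project IDs contain only letters and digits.
        if not project_id.isalnum():
            raise ValueError("unexpected project id: %r" % project_id)
        return self.index_pattern.format(project_id=project_id)

    def dimension_path(self, name):
        return self.dimension_field.format(name=name)


if __name__ == "__main__":
    paths = FieldPaths(index_pattern="monasca-{project_id}-*",
                       dimension_field="{name}.raw")
    print(paths.index_for("0123456789abcdef"))
    print(paths.dimension_path("hostname"))
```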

It is important to note that for the terms query to work correctly on dimension fields, they must be stored in Elasticsearch as “not_analyzed”, in order to prevent values being tokenized. This is especially important for values such as host names containing dashes (e.g. dev-01, staging-01, ...), as Elasticsearch will by default decompose these into separate tokens (e.g. dev, staging, 01).
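The effect can be illustrated without Elasticsearch at all. In the sketch below, `rough_tokens` is a deliberately crude stand-in for an analyzer (lowercase, then split on anything that is not a letter or digit); it is not the real Elasticsearch analysis chain, but it shows why an exact term lookup fails once a value has been broken up.

```python
import re


def rough_tokens(value):
    """Crude stand-in for an analyzer: lowercase and split on punctuation."""
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]


def term_matches(stored_terms, query_value):
    """A term query compares the query value with stored terms exactly."""
    return query_value in stored_terms


if __name__ == "__main__":
    analyzed = rough_tokens("dev-01")      # stored as separate tokens
    not_analyzed = ["dev-01"]              # stored as a single term

    print(analyzed)
    print(term_matches(analyzed, "dev-01"))      # exact value is not found
    print(term_matches(not_analyzed, "dev-01"))  # exact value is found
    # The shared token also makes different hosts indistinguishable:
    print(term_matches(rough_tokens("staging-01"), "01"))
```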

Configuration File Additions
To accommodate the new endpoint, a new dispatcher will exist:

[dispatcher]
logs_v3 = monasca_log_api.reference.v3.logs:LogsList

In keeping with the patterns employed by the monasca-api service, a logs repository interface will exist and by default be provided by an Elasticsearch driver:

[repositories]
# The driver to use for the logs repository
logs_driver = monasca_log_api.common.repositories.elasticsearch.logs_repository:LogsRepository
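To illustrate the repository pattern, the sketch below defines an abstract interface and a trivial in-memory driver which could stand in for Elasticsearch in unit tests. The class names, the `list_logs` signature and the shape of a log entry are all assumptions made for this example; the actual interface would be settled during implementation.

```python
import abc


class AbstractLogsRepository(abc.ABC):
    """Interface a logs driver would implement (hypothetical)."""

    @abc.abstractmethod
    def list_logs(self, tenant_id, dimensions, start_time, end_time,
                  offset=0, limit=10):
        """Return matching log entries, most recent first."""


class InMemoryLogsRepository(AbstractLogsRepository):
    """Toy driver; timestamps are ISO 8601 strings in a single format."""

    def __init__(self, logs_by_tenant):
        self._logs = logs_by_tenant

    def list_logs(self, tenant_id, dimensions, start_time, end_time,
                  offset=0, limit=10):
        # Only the requesting tenant's entries are ever considered.
        entries = self._logs.get(tenant_id, [])
        matched = [
            e for e in entries
            if start_time <= e["timestamp"] <= end_time
            and all(e["dimensions"].get(name) in values
                    for name, values in dimensions.items())
        ]
        matched.sort(key=lambda e: e["timestamp"], reverse=True)
        return matched[offset:offset + limit]


if __name__ == "__main__":
    repo = InMemoryLogsRepository({
        "tenant-a": [
            {"timestamp": "2017-01-01T00:00:01Z", "message": "first",
             "dimensions": {"severity": "error"}},
            {"timestamp": "2017-01-01T00:00:02Z", "message": "second",
             "dimensions": {"severity": "info"}},
        ],
    })
    print(repo.list_logs("tenant-a", {"severity": ["error"]},
                         "2017-01-01T00:00:00Z", "2017-01-01T00:01:00Z"))
```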

The Elasticsearch repository will require configuration options for connecting to the database, but also allow specification of the various paths:

[elasticsearch]

uri = :, : ,...
use_ssl = False
verify_certs = False
ca_certs = ''
client_cert = ''
client_key = ''

index_pattern = '{project_id}-*'
timestamp_field = '@timestamp'
message_field = 'message'
not_analyzed_dimension_field = '{dimension_name}.raw'
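As an illustration of how a driver might consume these options, the sketch below reads the section with the standard library `configparser`. This is only to keep the example self-contained; the service itself would presumably use its existing configuration mechanism. The host names in the sample are invented, and `load_elasticsearch_options` is a hypothetical helper.

```python
import configparser

SAMPLE = """
[elasticsearch]
uri = es-host-1:9200, es-host-2:9200
use_ssl = False
verify_certs = False
index_pattern = '{project_id}-*'
timestamp_field = '@timestamp'
message_field = 'message'
not_analyzed_dimension_field = '{dimension_name}.raw'
"""


def _unquote(value):
    """Drop surrounding quotes, which configparser keeps verbatim."""
    return value.strip().strip("'\"")


def load_elasticsearch_options(text):
    # Interpolation is disabled so '%' in values is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["elasticsearch"]
    return {
        "hosts": [h.strip() for h in section["uri"].split(",") if h.strip()],
        "use_ssl": section.getboolean("use_ssl", fallback=False),
        "verify_certs": section.getboolean("verify_certs", fallback=False),
        "index_pattern": _unquote(section["index_pattern"]),
        "timestamp_field": _unquote(section["timestamp_field"]),
        "message_field": _unquote(section["message_field"]),
        "dimension_field": _unquote(section["not_analyzed_dimension_field"]),
    }


if __name__ == "__main__":
    options = load_elasticsearch_options(SAMPLE)
    print(options["hosts"])
    print(options["dimension_field"].format(dimension_name="hostname"))
```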

monasca-log-persister
It may be necessary to modify the log persister configuration such that Logstash specifies a mapping template to Elasticsearch, in order to configure the way Elasticsearch interprets the fields it is passed. This would also impose a naming convention on the indices, in order to match the template (unless there is another way of doing this?).

We may want to impose a convention on where the dimensions are stored (e.g. in "dimensions.xxx" or "log.dimensions.xxx") so that we are able to provide a dimension name listing API in the future.

e.g. for Logstash currently our deployment uses:

output {
  elasticsearch {
    index => "monasca-%{[meta][tenantId]}-%{+YYYY.MM.dd}"
    document_type => "log"
    template_name => "monasca"
    manage_template => true
    template => "/etc/logstash/monasca_elasticsearch_template"
    template_overwrite => true
    hosts => ["monasca-elasticsearch"]
  }
}

The template file used is as follows; it configures the dimension fields so that term queries work robustly. Alternatively, we could follow the Logstash mapping, and provide a separate ".raw" field which contains the "not_analyzed" version of the field.

/etc/logstash/monasca_elasticsearch_template
{
  "aliases": {},
  "mappings": {
    "log": {
      "_all": { "enabled": true, "omit_norms": true },
      "dynamic_templates": [
        {
          "message_field": {
            "mapping": {
              "fielddata": { "format": "disabled" },
              "index": "analyzed",
              "omit_norms": true,
              "type": "string"
            },
            "match": "message",
            "match_mapping_type": "string"
          }
        },
        {
          "other_fields": {
            "mapping": { "index": "not_analyzed", "type": "string" },
            "match": "*",
            "match_mapping_type": "string"
          }
        }
      ],
      "properties": {
        "@timestamp": { "type": "date" },
        "@version": { "index": "not_analyzed", "type": "string" },
        "creation_time": { "type": "date" }
      }
    }
  },
  "order": 0,
  "settings": { "index": { "refresh_interval": "5s" } },
  "template": "monasca-*"
}

monasca-grafana-datasource
Section is unfinished.

Configuration Screen
New configuration option to specify the log-api endpoint.

Extend Query Control panel
Allow selection of “Metrics” or “Logs”.

Annotations
Similar modifications to those of the Query panel.

Data Source
Methods to access the log API, based on those for metric measurements.

python-monascaclient
Section is unfinished.