Monasca/Logging/Query API Design


Revision as of 10:36, 31 January 2017

Background

Currently, monasca-log-api provides a method for writing logs, but not reading. Reads are performed directly on Elasticsearch. Multi-tenant isolation is provided by a Kibana plugin, which directly accesses the Elasticsearch instance.

Given that Kibana is by far the most popular tool for analyzing logs stored in Elasticsearch, this is a very sensible solution. Additionally, even if a read API were to exist, Kibana would not use it, because it is tightly coupled to Elasticsearch and (AFAIK) does not support pluggable data sources. There is a fork of Kibana, Kibi, which attempts this, but it appears to be focused on SQL databases.

Use Case

The use case in mind, however, is to allow logs to be visualized by tools other than Kibana, particularly Grafana, but also to open the door to bespoke tools being built which access the API via a Python client (or command line). The requirements for this are quite minimal - currently limited to what can be achieved using the Elasticsearch data-source in Grafana, which just allows specification of a query string (like one that would be written in Kibana) for either raw document retrieval or aggregation (counting).

Currently, this can be achieved by directly querying Elasticsearch, but this will not be suitable for multi-tenant systems as the logs from every tenant would be available to every other tenant. A workaround might be to give each tenant their own Elasticsearch instance.

Making an interface for accessing logs available to the command line tool would certainly be of use to operators.

Scope

I do not think there is much value, at least initially, in trying to replicate the full Elasticsearch API for Kibana to consume, as there is already a workable multi-tenant solution for providing access to the logs via Kibana if required. As mentioned before, Kibana would not be able to use the API regardless. There was initially an idea to proxy the Elasticsearch API, but this was not developed further.

I think there is value, however, in providing a cut-down API to perform simple searches for common use cases, to be consumed by a Grafana datasource or the monasca-client. I think any work done here should complement the existing work in Monasca, not aim to replace it.

Initial Potential Design Path

Mimic the metric API to allow for simple searching of logs based on a conjunction of dimension values (e.g. type=syslog,program=sshd) and a time range. This has the advantage of consistency, and the Grafana plugin (for metrics) could be easily extended for finding logs (or events grok-ed out by Logstash). The API would work in a similar way to the Metric API, and only access the Elasticsearch index of the authorized tenant.
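As a sketch, the conjunctive --dimensions argument could be parsed into a dictionary as follows (illustrative Python only; the function name and validation rules are assumptions, not existing monasca-client code):

```python
def parse_dimensions(spec):
    """Parse a conjunctive dimensions argument such as
    'type=syslog,program=sshd' into a dict, mirroring the metric
    API's dimension syntax (hypothetical sketch, not actual CLI code)."""
    dimensions = {}
    for pair in spec.split(","):
        name, sep, value = pair.partition("=")
        if not sep or not name:
            raise ValueError("expected name=value, got %r" % pair)
        dimensions[name.strip()] = value.strip()
    return dimensions
```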

Some potential queries (using an imaginary CLI for example purposes):

monasca log-list --dimensions hostname=novacompute-042,severity=error --limit 10 -5
monasca log-list --dimensions type=syslog,program=sshd -30
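Under the hood, such a query could translate into an Elasticsearch request body along these lines. This is a sketch only: the dimensions.* and @timestamp field names are assumptions about the log index mapping, and the trailing number is read as minutes before now.

```python
from datetime import datetime, timedelta, timezone

def build_log_query(dimensions, minutes_ago, limit=10):
    """Build an Elasticsearch request body for a conjunction of
    dimension values over a relative time range. Field names are
    assumptions about how the log pipeline indexes documents."""
    start = datetime.now(timezone.utc) - timedelta(minutes=minutes_ago)
    # One term filter per dimension: all must match (a conjunction).
    filters = [{"term": {"dimensions." + k: v}}
               for k, v in sorted(dimensions.items())]
    filters.append({"range": {"@timestamp": {"gte": start.isoformat()}}})
    return {
        "size": limit,
        "sort": [{"@timestamp": "desc"}],
        "query": {"bool": {"filter": filters}},
    }
```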

The available dimensions would of course depend on how the user has configured their log storage pipeline. For example, from logs which have been processed by suitable Logstash filters, more granular results could be obtained: a common use case is to have a rule for parsing SSH logins, which could add dimensions such as "event=ssh.reject,username=foo". In this case, these events could easily be searched for and used as Grafana annotations:

monasca log-list --dimensions event=ssh.reject -30

A compelling aspect of this approach is that it is almost identical to the Metrics API, so users will get a strong sense of consistency. The CLI and the UI for creating queries in Grafana would be very familiar. Users without prior knowledge of Elasticsearch/Kibana, but perhaps with some familiarity with Monasca Metrics, would also benefit from this.

Log Statistics

In keeping with the Metrics API, an additional API for logs could expose statistics (e.g. "log-statistics"), driven from the log storage rather than the metrics storage. The use case for this would be showing log activity in Grafana.

monasca log-count --dimensions type=syslog,severity=error -10
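A sketch of how such a count might map onto an Elasticsearch date_histogram aggregation over the tenant's log index (the field names and the per-minute bucket interval are assumptions):

```python
from datetime import datetime, timedelta, timezone

def build_log_count_query(dimensions, minutes_ago, interval="1m"):
    """Counting variant of the dimension query: no raw documents are
    returned (size 0), only time-bucketed counts suitable for plotting
    log activity in Grafana. Field names are assumed, not confirmed."""
    start = datetime.now(timezone.utc) - timedelta(minutes=minutes_ago)
    filters = [{"term": {"dimensions." + k: v}}
               for k, v in sorted(dimensions.items())]
    filters.append({"range": {"@timestamp": {"gte": start.isoformat()}}})
    return {
        "size": 0,  # aggregation only, no document hits
        "query": {"bool": {"filter": filters}},
        "aggs": {"count_over_time": {
            "date_histogram": {"field": "@timestamp",
                               "fixed_interval": interval}}},
    }
```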

Extended Design Path

There are of course limitations to the breadth of searches which can be performed with the above API, being limited to a conjunction of dimension values (wildcards could be considered). For more complex searches, there is a good chance the user will want to use Kibana (with the multi-tenancy plugin). It might certainly be desirable to allow a wider range of queries to be performed via the API for visualizing in Grafana or viewing via the CLI. In this case, a second API could exist which simply passes through a query string to Elasticsearch, similar to how a search would be entered into Kibana, e.g.

monasca log-search "(connect AND program:sshd) OR event:ssh.accept" -10
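A sketch of the pass-through in Python. The per-tenant index pattern logs-&lt;tenant_id&gt;-* is purely an assumption about how tenant isolation might be arranged; the point is that the query string is forwarded verbatim while the index selection stays under the API's control:

```python
from datetime import datetime, timedelta, timezone

def build_log_search(query_string, tenant_id, minutes_ago):
    """Wrap a Kibana-style query string in an Elasticsearch request
    scoped to the authorized tenant's index (index naming is a
    hypothetical convention, not an existing Monasca one)."""
    start = datetime.now(timezone.utc) - timedelta(minutes=minutes_ago)
    index = "logs-%s-*" % tenant_id  # tenant isolation via index choice
    body = {"query": {"bool": {
        # The user's query string is passed through unmodified.
        "must": [{"query_string": {"query": query_string}}],
        "filter": [{"range": {"@timestamp": {"gte": start.isoformat()}}}],
    }}}
    return index, body
```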

There are questions here about the scope of this API, whether code from other OpenStack projects such as Searchlight could be re-used, and how much validation (if any) would need to be performed on the query strings.