Designate/Blueprints/Filtering API

Discussion Needed
The spec is still under heavy revision, and some features need further discussion before implementation. The main open topic is substring filtering. Currently, the API supports wildcards using SQL LIKE pattern matching (e.g. abc% will match abc.com., %abc will match www.abc, and %abc% will match www.abc.com). However, queries with a wildcard on the left side strain the Designate system when a customer has thousands or tens of thousands of resources to filter, because they require a full or near-full scan of those resources. The following questions need to be answered before moving further:


 * Does substring filtering need to be restricted so that wildcards can only be at the far-right end of searches (e.g. abc%)?
 * If so, how would we implement this? Two possible methods:
   * The approach previously proposed in the blueprint: do not rely on "%" wildcards. Instead, add a URL parameter, "match_type", that determines whether the search is an exact search or a substring search (the default will most likely be exact search). So, /zones?name="abc"&match_type=substr will match the zone "abc.com.", but /zones?name="abc"&match_type=exact will not.
   * Allow % wildcards, but return an error code if a % wildcard appears anywhere in the filter criterion other than at the far-right end.
 * If this is not the right way to restrict substrings, how should it be done? Or does it need to be done at all?
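
As an illustration of the second method above (allow % but reject it anywhere except the far-right end), a minimal sketch follows. The names InvalidFilter and validate_wildcard are hypothetical and not part of Designate; how the error maps to an HTTP status code is also left as an assumption.

```python
class InvalidFilter(ValueError):
    """Hypothetical error type; an API layer could map it to an error code."""


def validate_wildcard(criterion: str) -> str:
    """Return the criterion unchanged if any % is its final character.

    Raises InvalidFilter when a % wildcard appears anywhere else.
    """
    position = criterion.find("%")  # index of the first %, or -1 if none
    if position != -1 and position != len(criterion) - 1:
        raise InvalidFilter(
            f"wildcard is only allowed at the far-right end: {criterion!r}"
        )
    return criterion


def is_allowed(criterion: str) -> bool:
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_wildcard(criterion)
    except InvalidFilter:
        return False
    return True
```

Under this sketch, abc and abc% are accepted, while %abc, %abc% and a%c are rejected. A lone % is also accepted and would match everything, which may or may not be desirable.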

Overview
Filtering provides the ability to qualify the result set returned by a query to the Designate API. It will ultimately be available on all collections: zones, record sets, rdata of record sets, and pools.

Filtering will be controlled using query parameters which match the name of the attribute being filtered. It is *not* required that all attributes are available as filter targets, but the majority will be.

Filters are either an exact match or a substring match, as specified by the optional URL parameter match_type. Wildcard and regular expression matching may be introduced in a later revision of the v2 API, so long as the implementation is backward compatible. For now, substring matching should be restricted to right-side substrings only, meaning the criterion must match the start of the value (e.g. /zones?name=abc&match_type=substr would match zones named abc.com. and abcd.com., but not aabc.com.).
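
The intended semantics of the two match types can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not Designate code; the function name matches is hypothetical, and defaulting to exact matching is an assumption based on the discussion section.

```python
def matches(value: str, criterion: str, match_type: str = "exact") -> bool:
    """Apply one filter criterion to one attribute value."""
    if match_type == "exact":
        return value == criterion
    if match_type == "substr":
        # Right-side substring only: the criterion must be a prefix.
        return value.startswith(criterion)
    raise ValueError(f"unsupported match_type: {match_type!r}")


names = ["abc.com.", "abcd.com.", "aabc.com."]
substr_hits = [n for n in names if matches(n, "abc", "substr")]
exact_hits = [n for n in names if matches(n, "abc", "exact")]
```

Here substr_hits contains abc.com. and abcd.com. but not aabc.com., and exact_hits is empty because no zone is named exactly abc.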

If the filtering request is successful, the resources that pass the filter criteria are returned, as well as links for retrieving more details.

Pagination of results will take advantage of the proposed Designate pagination (https://blueprints.launchpad.net/designate/+spec/pagination).

Filtering Clarification
NOTE: Filtering and searching are two completely different features and will be addressed separately. Search involves the ability to compile a list of results from storage, possibly drawing from many different places (for example, finding all of a tenant's A records with a certain IP address). Filtering only involves further restricting the standard queries that are offered by the API (for example, /zones, /zones/{id}/recordsets, etc.).

You can find the spec for search here.

Completion History
This spec has been partially implemented. The following lists show which features have been completed and which are pending. In addition, to further clarify the difference between filtering and searching, a list of features that belong exclusively to searching is included.

Completed

 * Basic filtering for:
   * Blacklists: pattern
   * Records: data
   * Recordsets: name, type, TTL
   * TLDs: name
   * Zones: name
 * Substring search using SQL pattern matching

Pending

 * Restrictions on substring search (currently substring searching is unrestricted)
 * Add more attributes for filtering
 * Changes to record/recordset filtering after the recordset API change

Search features (will NOT be implemented here)

 * Searching for other parameters across all tenants in general
   * Done by adding an all_tenants option
 * Searching for recordsets by IP address across all tenants

API Changes
The bulk of the Filtering API will consist of extensions to the existing GET APIs on existing resources (zones, record sets, rdata of record sets, and pools) that allow filters to be specified using query parameters.

High level description of implementation logic
The following is a high-level description of the flow of logic. More detailed request/response formats are given in the examples and use cases in the API Details section below.

1. User issues a filter request on a resource with appropriate filters. Optionally, an extra parameter, match-type (exact/substr) may be specified.

For example:

GET /v2/zones?name=example.com&match-type=exact HTTP/1.1
Host: dns.provider.com
Accept: application/json
X-Auth-Token: KeyStoneAuth_*****

2. In Central:
 * Issue the appropriate SQL query against the storage database, using the request query parameter as the WHERE clause value.
 * If the filter is a substring filter, modify the query so that a wildcard is placed on the right side of the value being filtered, then use the SQL LIKE clause to filter.
 * Return the results to designate-api.

3. In the API: return the result set to the user.
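
The query-building step in Central can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in for the real storage backend; the zones table layout and the helper names escape_like and find_zones are assumptions made for the example. Escaping % and _ in the user-supplied value is an additional precaution not stated in the steps above: it ensures that the only active wildcard is the one appended on the right.

```python
import sqlite3


def escape_like(value: str, escape: str = "\\") -> str:
    """Escape LIKE metacharacters so user input is matched literally."""
    for char in (escape, "%", "_"):
        value = value.replace(char, escape + char)
    return value


def find_zones(conn, name, match_type="exact"):
    """Return zone names matching the filter, sorted by name."""
    if match_type == "exact":
        sql = "SELECT name FROM zones WHERE name = ? ORDER BY name"
        param = name
    elif match_type == "substr":
        # The wildcard is appended on the right side only.
        sql = ("SELECT name FROM zones "
               "WHERE name LIKE ? ESCAPE '\\' ORDER BY name")
        param = escape_like(name) + "%"
    else:
        raise ValueError(f"unsupported match_type: {match_type!r}")
    return [row[0] for row in conn.execute(sql, (param,))]


# Hypothetical sample data for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zones (name TEXT)")
conn.executemany(
    "INSERT INTO zones VALUES (?)",
    [("abc.com.",), ("abcd.com.",), ("aabc.com.",), ("example.com.",)],
)

prefix_hits = find_zones(conn, "abc", "substr")
exact_hits = find_zones(conn, "example.com.", "exact")
literal_percent_hits = find_zones(conn, "%abc", "substr")
```

With this sample data, prefix_hits holds abc.com. and abcd.com., exact_hits holds only example.com., and literal_percent_hits is empty because the leading % is treated as a literal character rather than a wildcard.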

API Details
Below are some filtering examples, both with exact and substring matching.

A privileged user filtering for a specified recordset by name within a specified zone for a given tenant
This is the ability to filter for a record by name in a given zone that might contain many records. By definition, a specific zone belongs to one tenant, so even a privileged user (support/admin) can filter only within one specific account at a time, not across all accounts at once.

Request:

GET zones/{zone-id}/recordsets?name=example&match-type=substr
Host: dns.provider.com
Accept: application/json
X-Auth-Token: KeyStoneAuth_*****
X-Tenant-ID: 44441

Response:

{
  "recordsets": [
    {
      "id": "9e27811d-0320-4179-abb7-0e00e371e25b",
      "zone_id": "a86dba58-0043-4cc6-a1bb-69d5e86f3ca3",
      "name": "_xmpp-server._tcp.example.org.",
      "type": "A",
      "ttl": 3600,
      "status": "ACTIVE",
      "version": 1,
      "created_at": "...",
      "updated_at": null,
      "links": {
        "self": "https://dns.provider.com/v2/zones/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3/recordsets/9e27811d-0320-4179-abb7-0e00e371e25b"
      }
    },
    {
      "id": "dedf6879-fd9a-41d6-a7c2-eeac316fa7b3",
      "zone_id": "a86dba58-0043-4cc6-a1bb-69d5e86f3ca3",
      "name": "_xmpp-server2._tcp.example.org.",
      "type": "SRV",
      "ttl": 3600,
      "status": "ACTIVE",
      "version": 1,
      "created_at": "...",
      "updated_at": null,
      "links": {
        "self": "https://dns.provider.com/v2/zones/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3/recordsets/dedf6879-fd9a-41d6-a7c2-eeac316fa7b3"
      }
    }
  ],
  "links": {
    "self": "https://dns.provider.com/v2/zones/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3/recordsets",
    "next": "https://dns.provider.com/v2/zones/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3/recordsets?marker=dedf6879-fd9a-41d6-a7c2-eeac316fa7b3"
  }
}

A Customer filtering for a specified recordset by name within a specified zone
A customer is able to filter on a zone that belongs to that customer. By definition, a specific zone belongs to one tenant. In addition, ordinary tenants are not privileged to perform operations on behalf of any other tenant ID.

Request:

GET zones/{zone-id}/recordsets?name=www.example.org.&match-type=exact
Host: dns.provider.com
Accept: application/json
X-Auth-Token: KeyStoneAuth_*****
X-Tenant-ID: 33377

Response:

{
  "recordsets": [
    {
      "id": "9e27811d-0320-4179-abb7-0e00e371e25b",
      "zone_id": "a86dba58-0043-4cc6-a1bb-69d5e86f3ca3",
      "name": "www.example.org.",
      "type": "SRV",
      "ttl": 3600,
      "status": "ACTIVE",
      "version": 1,
      "created_at": "...",
      "updated_at": null,
      "links": {
        "self": "https://dns.provider.com/v2/zones/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3/recordsets/9e27811d-0320-4179-abb7-0e00e371e25b"
      }
    }
  ],
  "links": {
    "self": "https://dns.provider.com/v2/zones/a86dba58-0043-4cc6-a1bb-69d5e86f3ca3/recordsets"
  }
}

Database Changes
N/A

Proposed Test Scenarios
It is possible to test every filter criterion; however, since the query logic is similar regardless of the attribute being filtered, it may only be necessary to test one. There are still multiple degrees of freedom, including:


 * Filtering using match_type = exact or substr
 * Filtering on an empty list or a populated list

Ideally, all combinations of these factors should have a test case.
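
One way to enumerate these combinations is with itertools.product, as in the sketch below. The filter_names function is a hypothetical in-memory stand-in for the real API call, and the sample zone names and expected results are assumptions made for the example.

```python
import itertools


def filter_names(names, criterion, match_type):
    """In-memory stand-in for the filtering behaviour under test."""
    if match_type == "exact":
        return [n for n in names if n == criterion]
    if match_type == "substr":
        return [n for n in names if n.startswith(criterion)]
    raise ValueError(f"unsupported match_type: {match_type!r}")


MATCH_TYPES = ("exact", "substr")
COLLECTIONS = {
    "empty": [],
    "populated": ["abc.com.", "abcd.com.", "aabc.com."],
}
CRITERIA = {"exact": "abc.com.", "substr": "abc"}
EXPECTED = {
    ("exact", "empty"): [],
    ("exact", "populated"): ["abc.com."],
    ("substr", "empty"): [],
    ("substr", "populated"): ["abc.com.", "abcd.com."],
}


def run_matrix():
    """Run every (match_type, collection) pair; map each case to pass/fail."""
    results = {}
    for match_type, label in itertools.product(MATCH_TYPES, COLLECTIONS):
        actual = filter_names(
            COLLECTIONS[label], CRITERIA[match_type], match_type
        )
        results[(match_type, label)] = actual == EXPECTED[(match_type, label)]
    return results
```

Adding a further degree of freedom (for example, the attribute being filtered) would only require another iterable passed to itertools.product and a correspondingly extended EXPECTED table.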

In addition, performance testing would be valuable, particularly to measure the efficiency of substring matching. The prime motivation for restricting substring matching to right-side matching is concern about the effect of left-side matching on Designate's databases. It would therefore be useful to measure the actual performance difference to determine how necessary such a restriction is.
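
A rough timing harness for such a comparison might look like the sketch below. It uses an in-memory SQLite database purely as a stand-in, so the timings it produces say nothing about Designate's real storage backend; whether an index is used for a LIKE query depends on the database engine and its collation settings. The table layout, row count, and function names are all assumptions made for the example.

```python
import sqlite3
import time


def build_db(row_count):
    """Create an in-memory zones table with an index on name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE zones (name TEXT)")
    conn.execute("CREATE INDEX zones_name ON zones (name)")
    conn.executemany(
        "INSERT INTO zones VALUES (?)",
        ((f"zone{i:06d}.example.com.",) for i in range(row_count)),
    )
    return conn


def timed_count(conn, pattern):
    """Return (matching row count, elapsed seconds) for a LIKE pattern."""
    start = time.perf_counter()
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM zones WHERE name LIKE ?", (pattern,)
    ).fetchone()
    return count, time.perf_counter() - start


conn = build_db(20_000)
# Right-side wildcard (prefix match).
right_count, right_seconds = timed_count(conn, "zone00001%")
# Left-side wildcard (suffix match).
left_count, left_seconds = timed_count(conn, "%00001.example.com.")
```

To obtain meaningful numbers, the same pair of queries would need to be run against the production database engine with realistic data volumes, repeated several times, and compared using the query planner's output as well as wall-clock time.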