
GeoTagging

Revision as of 22:02, 25 September 2013 by Malini-k-bhandaru (talk | contribs) (Storage Changes)

Geo and Asset Tagging

While the cloud enables workloads and data to reside anywhere, users may be constrained by regulation to run their workloads and store their data only in certain geographies. This requirement extends beyond trusting the cloud's hardware resources to be free of malware and rootkits. Extensions to Trusted Compute Pools (TCP) enable asset tags and geo-tags to be associated with hardware at provisioning time. Intel Trusted Execution Technology (TXT) or other measured launch environments (MLEs) facilitate measuring this provisioning-time information into the Trusted Platform Module (TPM). Attestation services can then be used to ascertain that this metadata has not been tampered with.

Asset and Geo Tags can be used to:

  1. Monitor and Enforce Customer Policies
    1. Control workload placement
    2. Control data storage
  2. Meet Service level agreements (SLAs)
    1. Resource reservation at provision time. Useful in private, public and hybrid clouds. For example, in a private cloud, Finance and HR may not want Engineering to overrun their resources.
  3. Provide Control and Visibility to Cloud End-users
    1. Display in dashboards asset/geo associations of VM and Data
    2. Generate audit logs of Hardware/VMs/data with asset/geo details.

Use Cases

Government Security

A variant of the general asset tag is the geo-tag, which records where a machine physically resides. Governments, for instance, may restrict where their workloads may run and where their data may be saved.

Commerce

For taxation purposes, a retailer may want to ensure that its online web portal is placed only on machines in certain states. It may also have similar constraints on the data it stores. Yet another use case is banking and disclosure: Swiss banks, for example, protect their clients through client privacy and disclosure policies, which may constrain where client data is held.

Research Freedom

Companies may restrict which categories of research are carried out in different geos. Stem cell research and drug discovery research, for example, fall into this category. Each government may have different policies in these areas.


Geo Tagging in OpenStack


NIST and Intel are collaborating on asset tagging and, in particular, geo-tagging. Intel plans to release in mid-2014 an attestation service that measures asset tag information, confirming that it has not been tampered with since the machine was registered at provisioning time.

This blueprint details how asset and geo-tagging can be incorporated and taken advantage of in OpenStack clouds.

OpenStack Changes

Asset/Geo Tagging builds on the Trusted Compute Pools feature, covered in blueprint: trusted-computing-pools. Also see details: TrustedComputingPools

Compute Node Provisioning

In addition to being provisioned for trust, compute nodes may be assigned asset tags and geo-tags at the same time. These can be simple strings, such as "3rd Floor, Expo Center, Hong Kong", or complex XML data providing sub-items such as GPS coordinates, postal address, and more.

Dashboard

  1. Flavor Extra Specs, Volume Extra Specs

The extra specs field readily supports specifying geo and other asset tag constraints.

  2. Displaying VM and Volume geo/asset tag affiliations

The Horizon UI instance and volume lists could be extended to display trust and geo tags in addition to the current information. For instance, it would be logical to add a small trusted seal if a compute node is trusted and, by extension, to any VM running on that compute node. A country flag would be a good geo indicator.

  3. Object listings

Object listings could also contain geo indicators.

Nova Scheduler Filter

Asset/Geo Tag filters should be specified. They will be very similar to today's Aggregate and Availability Zone filters, with the distinction that the data they retrieve from the Attestation service may need to be parsed. For instance, geo-tag data may be retrieved as a JSON string or as XML. In the case of XML, the data may comprise a GPS element and a postal address element. Such data needs to be parsed if the filter must match on country, or on state and country. We recommend that the filter code take a policy argument that determines what manner of parsing is required; the extracted data is then used to determine placement.
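The policy-driven parsing recommended above could look roughly like the following. This is a minimal sketch and not Nova filter code: the policy names, the tag field names (`state`, `country`), and the XML layout with an `address` element are all assumptions made for illustration, since the attestation service's tag schema is not defined here.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical policies: each names the tag fields that must match.
POLICIES = {
    "country": ("country",),
    "state_and_country": ("state", "country"),
}


def parse_geo_tag(raw):
    """Parse a geo-tag delivered as JSON or XML into a flat dict.

    The field names and the XML layout are assumptions for this sketch.
    """
    text = raw.strip()
    if text.startswith("<"):
        root = ET.fromstring(text)
        address = root.find("address")
        if address is None:
            return {}
        # Flatten the child elements of <address> into name -> text.
        return {child.tag: (child.text or "").strip() for child in address}
    return json.loads(text)


def host_passes(raw_tag, required, policy="country"):
    """Return True if the host's tag matches `required` on the policy's fields."""
    tag = parse_geo_tag(raw_tag)
    return all(
        field in required and tag.get(field) == required[field]
        for field in POLICIES[policy]
    )
```

A real filter would call something like `host_passes` once per candidate host, with `required` taken from the request's extra specs and `policy` taken from configuration.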

The same filter techniques are usable by the scheduler for volume placement and live migration of VMs. Object placement is a little more involved in the case of Swift, with its hash code computation for object replica placement and its re-balancing when resources go offline.

Storage

Asset/Geo Tags are readily usable for block storage. Object storage in the context of Swift is a little more involved; it shall be covered in a separate blueprint and addressed in phase 2. This is chiefly because the functionality that computes hash codes to determine where to place the Swift replicas needs to be modified. Further, the ring-balancing logic that handles hardware and/or network failures needs to be modified. Last but not least, the Swift API for object put/get will need to be modified to specify geo/asset tag constraints.

Audit Tasks

For trusted nodes, audit tasks could also determine whether any geo/asset tags are specified and capture these in logs and/or reports.
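As one possible shape for such a log entry, the sketch below builds a single JSON line per host and includes the tags only when the node is trusted. The record fields (`host`, `trusted`, `tags`, `timestamp`) are hypothetical and chosen for illustration; no existing audit format is implied.

```python
import json
from datetime import datetime, timezone


def audit_record(host, trusted, tags, now=None):
    """Build one JSON audit line for a host; tags are logged only if trusted."""
    now = now or datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(),
        "host": host,
        "trusted": trusted,
    }
    # Tags on an untrusted node cannot be relied upon, so they are omitted.
    if trusted and tags:
        record["tags"] = tags
    return json.dumps(record, sort_keys=True)
```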

Attestation Service

The TCP 1.5 Attestation Service, which understands asset and geo tags, needs to be integrated into the cloud installation. The Attestation service will provide an API for retrieving asset and geo tags from attested machines. These can be cached at the attestation service, or even at the Nova scheduler, to speed up scheduling decisions, as long as the cached value is no older than some specifiable time window.
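The caching idea can be sketched as a small time-bounded cache. This is an illustrative sketch only: `TagCache`, its `fetch` callable (standing in for a call to the attestation API), and the default window of 300 seconds are assumptions, not part of any existing service.

```python
import time


class TagCache:
    """Cache attested tags per host, refetching entries older than max_age."""

    def __init__(self, fetch, max_age_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch            # hypothetical callable: host -> tags
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the window can be tested
        self._entries = {}             # host -> (fetched_at, tags)

    def get(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        # Serve from cache only while the entry is inside the time window.
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]
        tags = self._fetch(host)
        self._entries[host] = (now, tags)
        return tags
```

A shorter window keeps scheduling decisions closer to the attested state at the cost of more attestation calls; the right value depends on how often tags are expected to change.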


Overall Flow

The cloud user specifies, by way of extra specs, any asset and geo-tags required. These in turn are used to filter the machines down to those eligible to host the desired virtual machines, which are then deployed on them. Data get and put requests would take additional tag arguments if the user wants to restrict where data is stored.
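The placement half of this flow can be sketched end to end as follows, assuming the host tags have already been retrieved and parsed. The `tag:` extra-spec prefix and both function names are hypothetical; the actual extra-spec key naming is not settled in this blueprint.

```python
# Hypothetical extra-spec prefix marking asset/geo tag constraints.
TAG_PREFIX = "tag:"


def required_tags(extra_specs):
    """Extract tag constraints, e.g. {'tag:country': 'US'} -> {'country': 'US'}."""
    return {
        key[len(TAG_PREFIX):]: value
        for key, value in extra_specs.items()
        if key.startswith(TAG_PREFIX)
    }


def eligible_hosts(extra_specs, host_tags):
    """Return hosts whose attested tags satisfy every requested constraint."""
    wanted = required_tags(extra_specs)
    return [
        host
        for host, tags in host_tags.items()
        if all(tags.get(name) == value for name, value in wanted.items())
    ]
```

Extra specs without the prefix are ignored here, so a request with no tag constraints leaves every host eligible.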

Phased Release

Block devices and VM placement can be supported in the first release. Object storage would follow in a second release because it touches on issues such as balancing rings in Swift.