
GeoTagging

Revision as of 21:18, 30 September 2013 by Malini-k-bhandaru (talk | contribs)

Geo Tagging

While the cloud enables workloads and data to reside anywhere, users may be constrained by regulation to run their workloads and store their data only in certain geographies. This requirement extends beyond trusting the cloud's hardware resources to be free of malware and rootkits. Extensions to Trusted Compute Pools (TCP) enable geo-tags to be associated with hardware at provision time. Intel Trusted Execution Technology (TXT) and other measured launch environments (MLEs) facilitate measuring such provision-time information into the Trusted Platform Module (TPM). Attestation services can then be used to ascertain that the provision-time metadata has not been tampered with.

Asset and Geo Tags can be used to:

  1. Monitor and Enforce Customer Policies

This could be for security, fault tolerance, and/or meeting Service Level Agreements (SLAs). For example, even in a private corporate cloud, Finance and HR may not want Engineering to overrun their resources.

    1. Control workload placement
    2. Control data storage
  2. Provide Control and Visibility to Cloud End-users
    1. Display in dashboards asset/geo associations of VM and Data
    2. Generate audit logs of Hardware/VMs/data with asset/geo details.

Use Cases

Government Security Requirements

Governments may require that their workloads run and their data be saved only in certain geos. For instance, they may not want either to leave their sovereign territory, with exceptions being made for embassies and international waters/air.

Commerce

For taxation purposes, retailers may want to restrict and/or enforce where their workloads and data reside in the cloud, either to avoid or reduce taxes (some US states have higher tax rates than others) or to gain special tax benefits (such as hosting sites in export-only zones). Retail goes beyond the brick-and-mortar store when the goods are digital, such as video, audio, images, software, books, and more. Banking is another regulated industry, and customer data in some banks enjoys greater privileges due to international agreements.

Research Freedom

Companies may restrict what categories of research are carried out in different geos. Stem cell research and drug discovery research are examples. Each government may have different policies in these areas.


Geo Tagging in OpenStack

See NIST's recommendation on geo tags: Geo Tag Presentation Draft NIST Geo-Tag. Intel is driving the effort to realize geo-tags in TCP, where the measured launch environment will measure the tags, and the associated attestation service shall determine that the measured values match registered whitelist values and confirm that the certificates are authentic and neither expired nor revoked.

Nova Aggregates and Availability Zones

The partitioning, resource reservation, and fault tolerance benefits that Nova aggregates and availability zones bring have a lot in common with geo tags. The main difference is that trusted tags are provision-time values attached to the hardware resource. Re-purposing a machine with aggregates and availability zones is easy: it is done via the command line and does not require a machine reboot. Modifying trusted tags requires more deliberate action, including a machine reboot, which ensures that no VM or data is relocated unnoticed and that there is a reboot audit trail. Because the geo-tag is associated with a hardware root of trust, it is more valuable for meeting regulatory requirements.

Further, the Attestation service could be independent of the cloud provider to increase credibility and better meet regulatory requirements. In addition, geo-tags can be verified with about 90% accuracy by software techniques that use the Internet Protocol (IP) address of the device being attested.

This blueprint details how geo-tags can be incorporated and taken advantage of in OpenStack clouds.

OpenStack Changes

Geo Tagging builds on the Trusted Compute Pools feature, covered in the blueprint trusted-computing-pools. For details, also see TrustedComputingPools.

Compute Node Provisioning

During provisioning of compute nodes for trust, geo-tags may also be assigned. These can be simple strings, such as "3rd Floor, Expo Center, Hong Kong", or structured data, such as XML or JSON strings providing sub-items such as GPS coordinates, postal address, and more.

Dashboard

  1. Flavor Extra Specs, Volume Extra Specs: The extra specs field readily supports specifying geo and other asset tag constraints.
  2. Displaying VM and Volume geo/asset tag affiliations: The Horizon UI for instance and volume lists could be extended to display trusted and geo tags in addition to current information. For instance, it would be logical to add a little trusted seal if a compute node is trusted, and by extension to a VM running on that compute node. A country flag would be a good geo indicator.
  3. Object listings: These could also contain geo indicators.

Nova Scheduler Filter

Asset/Geo Tag filters should be specified. They will be very similar to today's aggregate and availability zone filters, with the distinction that the data they retrieve from the Attestation service may need to be parsed. For instance, geo-tag data may be retrieved as a JSON string or as XML. In the case of XML, the data may comprise a Global Positioning System (GPS) coordinates element and a postal address element. The retrieved data may need to be parsed if the filter requires a match on country, or on state and country. We recommend that filter code take a policy argument to determine what manner of parsing is required, with the extracted data then used to determine placement. Filters could also be a logical OR of geos.
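
As a rough illustration of the policy argument and the logical OR of geos, the following minimal Python sketch filters hosts whose geo tags have already been parsed into dictionaries. It does not use the Nova filter API. The policy names, the field names (city, state, country), and the functions tag_matches, host_passes, and filter_hosts are all assumptions made for this example.

```python
# Hypothetical policy names, each mapped to the address fields it compares.
POLICY_FIELDS = {
    "country": ("country",),
    "state_country": ("state", "country"),
    "city_state_country": ("city", "state", "country"),
}


def _norm(value):
    """Normalize a field value for case-insensitive comparison."""
    return (value or "").strip().lower()


def tag_matches(host_tag, wanted, policy):
    """Return True if host_tag agrees with wanted on every field of the policy.

    A field that is missing or empty on the host never matches.
    """
    for field in POLICY_FIELDS[policy]:
        host_value = _norm(host_tag.get(field))
        if not host_value or host_value != _norm(wanted.get(field)):
            return False
    return True


def host_passes(host_tag, allowed_geos, policy="country"):
    """Logical OR of geos: the host passes if it matches any allowed geo."""
    return any(tag_matches(host_tag, geo, policy) for geo in allowed_geos)


def filter_hosts(hosts, allowed_geos, policy="country"):
    """Return the names of hosts whose parsed geo tag satisfies the constraint."""
    return [
        name
        for name, tag in hosts.items()
        if host_passes(tag, allowed_geos, policy)
    ]


if __name__ == "__main__":
    hosts = {
        "node-a": {"city": "Hong Kong", "state": "", "country": "HK"},
        "node-b": {"city": "Portland", "state": "OR", "country": "US"},
    }
    print(filter_hosts(hosts, [{"country": "US"}, {"country": "HK"}]))
```

Keeping the policy as a lookup table means a stricter match level can be added without touching the comparison logic.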

The same filter techniques are usable by the scheduler for volume placement and live migration of VMs.

Storage

Geo tags are readily usable for block storage. Object storage in the context of Swift is a little more involved; it shall be covered in a separate blueprint and addressed in phase 2. This is chiefly because the functionality that computes hash codes to determine where to place the Swift replicas needs to be modified. Further, the ring balance logic used in the event of hardware and/or network failures needs to be modified. Last but not least, the Swift API for object put/get will need to be modified to specify geo/asset tag constraints.

Audit Tasks

Audit logs of VM and volume related CRUD activity could capture geo tags. These would serve compliance well. Further, periodic audit reports of all cloud resources could also capture the geo/asset tags. Cloud asset particulars could also be saved in databases, along with configuration information about patches and upgrades. A sanity check would be that the reported geo tags match what is in the database.
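
The sanity check mentioned above could look roughly like the following Python sketch. It assumes two plain dictionaries keyed by host name: one holding the tags reported through attestation, the other holding the tags recorded in the asset database. The function name find_geo_tag_mismatches and the data shapes are hypothetical.

```python
def find_geo_tag_mismatches(reported, recorded):
    """Compare reported geo tags against the asset database records.

    Both arguments map host name -> geo tag. The result maps each host whose
    tags disagree (or which is missing on one side) to a
    (reported_tag, recorded_tag) pair, with None for a missing entry.
    """
    mismatches = {}
    for host in sorted(set(reported) | set(recorded)):
        reported_tag = reported.get(host)
        recorded_tag = recorded.get(host)
        if reported_tag != recorded_tag:
            mismatches[host] = (reported_tag, recorded_tag)
    return mismatches


if __name__ == "__main__":
    reported = {"node-a": "HK", "node-b": "US"}
    recorded = {"node-a": "HK", "node-b": "DE", "node-c": "US"}
    for host, (seen, expected) in find_geo_tag_mismatches(reported, recorded).items():
        print(f"{host}: reported={seen!r} recorded={expected!r}")
```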

Attestation Service

Existing Attestation services need to be upgraded to understand geo tags and to support an API to retrieve them for registered hardware resources. The geo tags retrieved for a hardware resource could be cached at the attestation service or even at the Nova scheduler to speed scheduling decisions, as long as the cached value is no older than some specifiable time window.
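
The caching idea can be sketched as a small time-bounded cache. This is an illustration only: the class GeoTagCache, its fetch callable (standing in for a call to the attestation service), and the max_age_seconds parameter are assumptions, not part of any existing attestation or Nova API.

```python
import time


class GeoTagCache:
    """Cache geo tags per host, refetching once an entry exceeds a time window."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        # fetch: hypothetical callable taking a host name and returning its geo tag.
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}  # host -> (time fetched, tag)

    def get(self, host):
        """Return the cached tag if it is fresh enough, otherwise refetch it."""
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]
        tag = self._fetch(host)
        self._entries[host] = (now, tag)
        return tag


if __name__ == "__main__":
    def fake_attestation_lookup(host):
        # Stand-in for a real attestation service call.
        print(f"querying attestation service for {host}")
        return {"country": "US"}

    cache = GeoTagCache(fake_attestation_lookup, max_age_seconds=300)
    cache.get("node-a")  # triggers a lookup
    cache.get("node-a")  # served from the cache
```

The clock is injectable so the expiry behaviour can be exercised in tests without sleeping.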

The simplest geo tag is a string, while more complex variants are XML and JSON strings. A match policy (country match, state and country match, or city, state, and country match) and a formatter to parse a given representation are required to facilitate matching.
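
A formatter of this kind might be sketched as follows. The geo tag schema is not defined here, so the layouts below are assumptions chosen for illustration: a JSON object with an "address" member, an XML document with an address element containing city, state, and country children, and a plain string whose last comma-separated parts are city, state, and country. The function name parse_geo_tag is likewise hypothetical.

```python
import json
import xml.etree.ElementTree as ET

FIELDS = ("city", "state", "country")


def parse_geo_tag(raw):
    """Normalize a raw geo tag into a dict with city, state, and country keys.

    Assumed (hypothetical) layouts:
      JSON:  {"address": {"city": ..., "state": ..., "country": ...}}
      XML:   <geotag><address><city/><state/><country/></address></geotag>
      plain: "city, state, country" (shorter strings fill from the right)
    """
    text = raw.strip()
    if text.startswith("{"):
        address = json.loads(text).get("address", {})
        values = [address.get(field, "") for field in FIELDS]
    elif text.startswith("<"):
        root = ET.fromstring(text)
        values = [root.findtext("address/" + field, default="") for field in FIELDS]
    else:
        parts = [part.strip() for part in text.split(",")][-3:]
        values = [""] * (len(FIELDS) - len(parts)) + parts
    return {field: str(value).strip() for field, value in zip(FIELDS, values)}


if __name__ == "__main__":
    print(parse_geo_tag("Portland, OR, US"))
    print(parse_geo_tag('{"address": {"city": "Portland", "state": "OR", "country": "US"}}'))
```

Normalizing every representation to the same dictionary lets a single match policy operate on all three forms.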

Overall Flow

The cloud user specifies any desired geo tag by way of flavor extra specs for instances and volumes. These in turn are used to filter out the compute nodes and volume devices that are ineligible. In the context of object storage, data get and put requests would need additional tag arguments in order to restrict where data may be stored and from where it can be retrieved.

Phased Release

Block devices and VM placement can be supported in the first release. Object storage would follow in a second release because it touches upon both API changes and issues such as balancing rings in Swift.