
Difference between revisions of "GeoTagging"

(Dashboard)
 
(14 intermediate revisions by the same user not shown)
Line 1: Line 1:
  
=Geo and Asset Tagging=
+
=Geo Tagging=
  
 
While the cloud enables workloads and data to reside anywhere,  users may be constrained to run their workloads and save their data in certain geographies due to regulatory reasons. This extends beyond trusting the cloud's hardware resources to be free of  malware and rootkits.
 
While the cloud enables workloads and data to reside anywhere,  users may be constrained to run their workloads and save their data in certain geographies due to regulatory reasons. This extends beyond trusting the cloud's hardware resources to be free of  malware and rootkits.
Extensions to Trusted Compute Pools (TCP) enable associating with hardware, at provision time, asset and geo-tags. Intel Trusted Execution Environment (TXT) or other measure launch environments (MLEs) facilitate measuring such provision time information into the Trusted Platform Module (TPM). Attestation services can be used to ascertain that these provision time meta data have not been tampered with.
+
Extensions to Trusted Compute Pools (TCP) enable associating with hardware at provision time geo-tags. Intel Trusted Execution Environment (TXT) and other measured launch environments (MLEs) facilitate measuring such provision time information into the Trusted Platform Module (TPM). Attestation services can be used to ascertain that provision time meta data have not been tampered.
  
 
Asset and Geo Tags can be used to:
 
Asset and Geo Tags can be used to:
 
# Monitor and Enforce Customer Policies
 
# Monitor and Enforce Customer Policies
 +
This could be for security, fault tolerance, and/or meeting Service Level Agreements (SLAs). For example, even in  a private corporate cloud, Finance and HR may not want Engineering to overrun their resources.
 
## Control workload placement
 
## Control workload placement
 
## Control data storage
 
## Control data storage
# Meet Service level agreements (SLAs)
 
## Resource reservation at provision time. Useful in private, public and hybrid clouds. For example, in a private cloud, Finance and HR may not want Engineering to overrun their resources.
 
 
# Provide Control and Visibility to Cloud End-users
 
# Provide Control and Visibility to Cloud End-users
 
## Display in dashboards asset/geo associations of VM and Data
 
## Display in dashboards asset/geo associations of VM and Data
Line 16: Line 15:
  
 
=== Use Cases  ===
 
=== Use Cases  ===
==== Government  Security====
+
==== Government  Security Requirements ====
A variant of the general asset-tag is the geo-tag, where does a machine physically reside.  Governments for instance may restrict where their workloads may run, where their data may be saved.
+
Governments may require that their workloads run and their data be saved only in certain geos. For instance, they may not want either to leave their sovereign territory, with exceptions being made for embassies and international waters/air.  
  
 
==== Commerce ====
 
==== Commerce ====
For taxation purposes a retailer may want to ensure that their online web portal is placed only on machines in certain states. It may also have similar constraints on the data it stores. Yet another use case is banking and disclosureSwiss banks protect their clients thanks to their client privacy and disclosure policies.
+
Retailers for taxation purposes -- either to avoid or reduce them (some US states have higher tax rates than others) or even gain special tax benefits (such as hosting sites in export only zones) may want to restrict and/or enforce where their workloads and data are stored in the cloud. Retail goes beyond the brick and mortar store .. when consumables are digital such as  video, audio, images, software, books and moreBanking is another regulated industry and customer data in some banks enjoy greater privileges due to international agreements.  
  
 
==== Research Freedom ====
 
==== Research Freedom ====
Line 28: Line 27:
 
[[File:Geo-tagging-in-openstack.JPG|Geo Tagging in OpenStack]]
 
[[File:Geo-tagging-in-openstack.JPG|Geo Tagging in OpenStack]]
  
 +
See NIST's recommendation on geo tags: [http://csrc.nist.gov/groups/SMA/forum/documents/april2013presentations/forum_april_11_2013_bartock.pdf Geo Tag Presentation ][http://csrc.nist.gov/publications/drafts/ir7904/draft_nistir_7904.pdf Draft NIST Geo-Tag]. Intel is driving the effort to realize geo-tags in TCP, where the measured launch environment will measure the tags, and the associated attestation service shall determine that the measured values match registered whitelist values, confirm that the certificates are authentic, and neither expired or revoked.
  
NIST and Intel are collaborating on Asset Tagging and in particular Geo-Tagging. Mid-2014 Intel plans to release an attestation service that measures asset tag information, confirming that it has not been tampered with since the machine was registered at the time of provisioning.
+
=== Nova Aggregates and Availability Zones ===
 +
The partitioning, resource reservation, and fault tolerance benefits that Nova aggregates and availability zones bring have a lot in common with
 +
geo tags. However, the main difference is that trusted tags are provision time values, and attached to  the hardware resource. Re-purposing a machine is more easy via the command line with aggregates and availability zones, does not require machine reboot,  but to modify trusted tags more deliberate action is required, machine reboot, which ensures no VM or data unbeknownstis relocated, and there is a re-boot audit trail.  The geo-tag by virtue of being associated with a hardware root of trust is more valuable with respect to meeting regulatory requirements.
  
This blueprint details how asset and geo-tagging can be incorporated and taken advantage of in OpenStack clouds.
+
Further, the Attestation service could be independent of the cloud provider to increase credibility and better meet regulatory requirements. In addition, geo-tags can be verified with about 90% accuracy using software techniques using the Internet Protocol (IP) address of the device being attested.
 +
 
 +
This blueprint details how geo-tags can be incorporated and taken advantage of in OpenStack clouds.
  
 
== OpenStack Changes ==
 
== OpenStack Changes ==
Asset/Geo Tagging builds on the Trusted Compute Pools feature, covered in blueprint: [https://blueprints.launchpad.net/nova/+spec/trusted-computing-pools%20 blueprint: trusted-computing-pools ]. Also see   [[TrustedComputingPools|details: TrustedComputingPools]]
+
Geo Tagging builds on the Trusted Compute Pools feature, covered in [https://blueprints.launchpad.net/nova/+spec/trusted-computing-pools%20 blueprint: trusted-computing-pools ]. Also see: [[TrustedComputingPools|details: TrustedComputingPools]]
  
 
===Compute Node Provisioning ===
 
===Compute Node Provisioning ===
In addition to compute nodes being provisioned for trust, asset-tags and geo-tags may be assigned at the same time. These can be simple strings, "3 rd Floor, Expo Center, Hong Kong", or complex XML data providing sub-items such as GPS co-ordinates, postal address, and more.
+
During compute nodes provisioning  for trust, geo-tags may also be assigned. These can be simple strings, such as, "3 rd Floor, Expo Center, Hong Kong", or complex, such as  XML data providing sub-items such as GPS co-ordinates, postal address, and more, or json strings.
  
 
=== Dashboard ===
 
=== Dashboard ===
# '''Flavor Extra Specs, Volume Extra Specs'''
+
# '''Flavor Extra Specs, Volume Extra Specs''' The extra specs field readily supports specifying geo and other asset tag constraints.
The extra specs field readily supports specifying geo and other asset tag constraints.
+
# '''Displaying VM and Volume geo/asset tag affiliations''' The Horizon UI for instance and volume lists could be extended to display in addition to current information, trusted and geo tags. For instance, it would be logical to add a little trusted seal if a compute node is trusted, and by extension a VM running on the same compute node. A country flag would be a good geo indicator.
# '''Displaying VM and Volume geo/asset tag affiliations'''
+
# '''Object listings''' Could also contain geo indicators.
The Horizon UI for instance and volume lists could be extended to display in addition to current information, trusted and geo tags.
 
For instance, it would be logical to add a little trusted seal if a compute node is trusted, and by extension a VM running on the same compute node. A country flag would be a good geo indicator.
 
# '''Object listings'''
 
Could also contain geo indicators.
 
 
 
== Nova Scheduler Filter ==
 
Asset Tag and Geo Tag filters should be specified. These are used to filter out compute nodes from the set of devices able to host a virtual machine based the geo measured and that requested.
 
  
During Live migration, the filter is applied to determine to which machine a given VW may be relocated.
+
=== Nova Scheduler Filter ===
 +
Asset /Geo Tag filters should be specified. They will be very similar to todays  Aggregate and Availability filters with the distinction that
 +
the data they retrieve from the Attestation service may need to be parsed. For instance, geo-tag data may be retrieved as a json string or as XML. In the case of XML, the data may be comprised of a Global Positioning System (GPS) cordinates element, a postal address element.  The data so retrieved may need to be parsed if the filter requires match on country, or state and country. We recommend that filter code take a policy argument to determine what manner of parsing is required, and the extracted data then used to determine placement. Filters could also be a logical OR of geos.
  
== Storage Changes ==
+
The same filter techniques are usable by the scheduler for volume placement and live migration of VMs.
If an asset or geo-tag is specified as part of the put or get request, it is honored.  This will need to be reflected as changes in the Swift hash functions which determine where
+
=== Storage ===
the replicas are stored.
+
Geo tags are readily usable for block storage. Object storage in the context of Swift is a little more involved and shall be covered in a separate blueprint and be addressed in phase-2. This is chiefly because the functionality that computes hash codes  to determine where to place the  Swift replicas needs to be modified. Further Ring balance logic in the event of hardware and/or network failures needs to be modified.
 +
Last but not least, the Swift API for object put/get will need to be modified to specify geo/asset tag constraints.
  
== Audit Tasks ==
+
=== Audit Tasks ===
Audit tasks could for trusted nodes also determine if any geo/asset-tags are specified and capture these in logs and/or reports.
+
Audit logs of VM and volume related CRUD activity could capture geo tags. These would serve well compliance.
 +
Further periodic audit reports of all cloud resources could also capture the geo/ tags.  Cloud asset particulars could also be saved in databases, along with configuration information about patches and upgrades. A sanity check would be that the reported geo tags match what is in the database.
  
 
== Attestation Service ==
 
== Attestation Service ==
The TCP 1.5 Attestation Service, which can understand asset and geo tags, needs to be integrated into the cloud installation. The Attestation service will provide an
+
Existing Attestion services need to be upgraded to understand geo tags, support an API to retrieve them for registered hardware resources. The geo tags retrieved for hardware resource could be cached at the attestation service or even at the nova scheduler to speed scheduling decisions
API which enables retrieving asset and geo tags from attested machines. These can be cached at the attestation service or even at the nova scheduler to speed scheduling decisions
+
as long as the cached value is no older than some specifiable time window.
as long as the value cached is no older than some specifiable time window.
 
  
 +
The simplest geo tag is a string, while more complex variants are XML and json strings. A match policy (country match, state and country match, or city, state, and country match) and a formatter to parse a given  representation is required to facilitate match.
  
 
== Overall Flow ==
 
== Overall Flow ==
  
The cloud user specifies by way of filter extra-specs any asset and geo-tags  require. This in turn is used to filter out the machines that are eligible to host the desired virtual machines and then deploy the sameData get and put requests would take additional tag arguments if the user wants to restrict where data is stored.
+
The cloud user specifies by way of flavor extra-specs for instances and volumes any desired  geo tag. These in turn are used to filter out the compute nodes/volume devices  that are ineligible.   
 +
In the context of object storage, data get and put requests would need additional tag arguments, in order to restrict where data is to be stored, and can be retrieved.
  
 
=== Phased Release ===
 
=== Phased Release ===
Block Devices  and VM placement can be supported in the first release.  Object storage would happen in a second release because it touches up issues such as balancing rings in Swift.
+
Block Devices  and VM placement can be supported in the first release.  Object storage would happen in a second release because it touches upon both API changes and issues such as balancing rings in Swift.

Latest revision as of 21:18, 30 September 2013

Geo Tagging

While the cloud enables workloads and data to reside anywhere, users may be constrained by regulation to run their workloads and save their data only in certain geographies. This requirement extends beyond trusting the cloud's hardware resources to be free of malware and rootkits. Extensions to Trusted Compute Pools (TCP) enable geo-tags to be associated with hardware at provision time. Intel Trusted Execution Technology (TXT) and other measured launch environments (MLEs) facilitate measuring such provision-time information into the Trusted Platform Module (TPM). Attestation services can then be used to ascertain that the provision-time metadata has not been tampered with.

Asset and Geo Tags can be used to:

  1. Monitor and Enforce Customer Policies

This could be for security, fault tolerance, and/or meeting Service Level Agreements (SLAs). For example, even in a private corporate cloud, Finance and HR may not want Engineering to overrun their resources.

    1. Control workload placement
    2. Control data storage
  2. Provide Control and Visibility to Cloud End-users
    1. Display the asset/geo associations of VMs and data in dashboards
    2. Generate audit logs of hardware, VMs, and data with asset/geo details

Use Cases

Government Security Requirements

Governments may require that their workloads run and their data be saved only in certain geos. For instance, they may not want either to leave their sovereign territory, with exceptions being made for embassies and international waters/air.

Commerce

For taxation purposes, retailers may want to restrict and/or enforce where their workloads and data reside in the cloud, either to avoid or reduce taxes (some US states have higher tax rates than others) or to gain special tax benefits (such as hosting sites in export-only zones). Retail goes beyond the brick-and-mortar store when the goods are digital, such as video, audio, images, software, books, and more. Banking is another regulated industry, and customer data in some banks enjoys greater privileges due to international agreements.

Research Freedom

Companies may restrict which categories of research are carried out in different geos. Stem cell research and drug discovery research, for example, fall into this category. Each government may have different policies in these areas.


Geo Tagging in OpenStack

See NIST's recommendations on geo tags: Geo Tag Presentation (http://csrc.nist.gov/groups/SMA/forum/documents/april2013presentations/forum_april_11_2013_bartock.pdf) and Draft NIST Geo-Tag (http://csrc.nist.gov/publications/drafts/ir7904/draft_nistir_7904.pdf). Intel is driving the effort to realize geo-tags in TCP, where the measured launch environment will measure the tags, and the associated attestation service will determine that the measured values match registered whitelist values and confirm that the certificates are authentic and neither expired nor revoked.

Nova Aggregates and Availability Zones

The partitioning, resource reservation, and fault tolerance benefits of Nova aggregates and availability zones have a lot in common with geo tags. The main difference is that trusted tags are provision-time values attached to the hardware resource. Re-purposing a machine with aggregates and availability zones is easy: it is done from the command line and does not require a machine reboot. Modifying trusted tags requires more deliberate action, including a machine reboot, which ensures that no VM or data is relocated unnoticed and leaves a reboot audit trail. Because the geo-tag is associated with a hardware root of trust, it is more valuable for meeting regulatory requirements.

Further, the attestation service could be independent of the cloud provider to increase credibility and better meet regulatory requirements. In addition, geo-tags can be verified with about 90% accuracy by software techniques that use the Internet Protocol (IP) address of the device being attested.

This blueprint details how geo-tags can be incorporated and taken advantage of in OpenStack clouds.

OpenStack Changes

Geo Tagging builds on the Trusted Compute Pools feature, covered in the trusted-computing-pools blueprint (https://blueprints.launchpad.net/nova/+spec/trusted-computing-pools). Also see the TrustedComputingPools wiki page for details.

Compute Node Provisioning

When compute nodes are provisioned for trust, geo-tags may also be assigned. These can be simple strings, such as "3rd Floor, Expo Center, Hong Kong", or more complex data, such as JSON strings or XML providing sub-items such as GPS coordinates, postal address, and more.
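The three representations mentioned above could be normalized into one structure before any matching is done. The following is a minimal sketch only: the function name `parse_geo_tag`, the flat JSON object, and the flat `<geotag>` XML layout are assumptions made for illustration and are not defined by this blueprint.

```python
import json
import xml.etree.ElementTree as ET


def parse_geo_tag(raw):
    """Normalize a raw geo-tag into a dict with lower-cased keys.

    Hypothetical helper: the JSON and XML layouts handled here are
    assumptions, not formats specified by the blueprint.
    """
    text = raw.strip()
    if text.startswith("{"):
        # JSON variant: assume a flat object of scalar fields.
        data = json.loads(text)
        return {key.lower(): str(value) for key, value in data.items()}
    if text.startswith("<"):
        # XML variant: assume a root element with one child per field.
        root = ET.fromstring(text)
        return {child.tag.lower(): (child.text or "").strip() for child in root}
    # Simple string variant: keep the text whole under a single key.
    return {"label": text}
```

A caller would then work with the resulting dict regardless of how the tag was provisioned, for example `parse_geo_tag('{"country": "US"}')["country"]`.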

Dashboard

  1. Flavor Extra Specs, Volume Extra Specs: The extra specs field readily supports specifying geo and other asset tag constraints.
  2. Displaying VM and Volume geo/asset tag affiliations: The Horizon UI for instance and volume lists could be extended to display trusted and geo tags in addition to the current information. For instance, it would be logical to add a small trusted seal if a compute node is trusted, and by extension to a VM running on that compute node. A country flag would be a good geo indicator.
  3. Object listings: These could also contain geo indicators.

Nova Scheduler Filter

Asset/Geo Tag filters should be specified. They will be very similar to today's aggregate and availability zone filters, with the distinction that the data they retrieve from the attestation service may need to be parsed. For instance, geo-tag data may be retrieved as a JSON string or as XML. In the case of XML, the data may comprise a Global Positioning System (GPS) coordinates element and a postal address element. The retrieved data may need to be parsed if the filter requires a match on country, or on state and country. We recommend that the filter code take a policy argument that determines what manner of parsing is required, with the extracted data then used to determine placement. Filters could also be a logical OR of geos.

The same filter techniques are usable by the scheduler for volume placement and live migration of VMs.
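The policy-driven matching and the logical OR of geos described in this section can be sketched in plain Python. This is an illustration under stated assumptions, not Nova scheduler code: the names `POLICY_FIELDS`, `geo_matches`, and `filter_hosts`, the policy labels, and the dict-based geo-tag layout are all hypothetical, and the tags are assumed to have been parsed into dicts already.

```python
# Hypothetical mapping from match policy to the fields that must agree.
POLICY_FIELDS = {
    "country": ("country",),
    "state_country": ("state", "country"),
    "city_state_country": ("city", "state", "country"),
}


def geo_matches(host_geo, wanted_geo, policy="country"):
    """Return True if both geos agree on every field the policy names."""
    return all(
        field in host_geo
        and field in wanted_geo
        and host_geo[field].lower() == wanted_geo[field].lower()
        for field in POLICY_FIELDS[policy]
    )


def filter_hosts(hosts, wanted_geos, policy="country"):
    """Keep hosts whose geo-tag matches at least one requested geo (OR)."""
    return [
        name
        for name, host_geo in hosts.items()
        if any(geo_matches(host_geo, wanted, policy) for wanted in wanted_geos)
    ]
```

Passing the policy as an argument keeps the comparison logic independent of how strict a match a given deployment requires; a request for "US or DE" is simply a list of two requested geos.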

Storage

Geo tags are readily usable for block storage. Object storage in the context of Swift is a little more involved; it will be covered in a separate blueprint and addressed in phase 2. This is chiefly because the functionality that computes hash codes to determine where to place the Swift replicas needs to be modified. Further, the ring-balancing logic that applies in the event of hardware and/or network failures needs to be modified. Last but not least, the Swift API for object put/get will need to be modified to specify geo/asset tag constraints.

Audit Tasks

Audit logs of VM- and volume-related CRUD activity could capture geo tags, which would serve compliance well. Further, periodic audit reports of all cloud resources could also capture the geo tags. Cloud asset particulars could also be saved in databases, along with configuration information about patches and upgrades. A sanity check would be that the reported geo tags match what is in the database.
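The sanity check mentioned above amounts to comparing two mappings of resource to geo tag. A minimal sketch follows, assuming both the reported tags and the database records are available as plain dicts keyed by resource name; the function name `audit_geo_tags` is hypothetical.

```python
def audit_geo_tags(reported, recorded):
    """Return (resource, reported_tag, recorded_tag) for every mismatch.

    A resource missing from either side is reported with None on that side.
    """
    mismatches = []
    for resource in sorted(set(reported) | set(recorded)):
        seen = reported.get(resource)
        expected = recorded.get(resource)
        if seen != expected:
            mismatches.append((resource, seen, expected))
    return mismatches
```

An empty result means the reported tags and the database agree; any returned tuple is a candidate line for the periodic audit report.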

Attestation Service

Existing attestation services need to be upgraded to understand geo tags and to support an API for retrieving them for registered hardware resources. The geo tags retrieved for a hardware resource could be cached at the attestation service, or even at the Nova scheduler, to speed scheduling decisions, as long as the cached value is no older than some specifiable time window.
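The time-window caching described above could look roughly like the following. This is a sketch only: the class `GeoTagCache` and the `fetch` callable standing in for a call to the attestation service are hypothetical, and no real attestation API is assumed.

```python
import time


class GeoTagCache:
    """Cache geo tags per host, refetching once an entry is too old."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch              # hypothetical attestation lookup
        self._max_age = max_age_seconds  # the specifiable time window
        self._clock = clock              # injectable for testing
        self._entries = {}               # host -> (timestamp, tag)

    def get(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and now - entry[0] <= self._max_age:
            # Cached value is still inside the allowed window.
            return entry[1]
        # Missing or stale: ask the attestation service again.
        tag = self._fetch(host)
        self._entries[host] = (now, tag)
        return tag
```

A shorter window trades scheduling speed for fresher attestation results; the right value would depend on how quickly a deployment needs to notice a changed tag.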

The simplest geo tag is a string, while more complex variants are XML and JSON strings. A match policy (country match, state and country match, or city, state, and country match) and a formatter to parse a given representation are required to facilitate matching.

Overall Flow

The cloud user specifies any desired geo tag for instances and volumes by way of flavor extra-specs. These in turn are used to filter out the compute nodes and volume devices that are ineligible. In the context of object storage, data get and put requests would need additional tag arguments in order to restrict where data is stored and from where it can be retrieved.

Phased Release

Block Devices and VM placement can be supported in the first release. Object storage would happen in a second release because it touches upon both API changes and issues such as balancing rings in Swift.