Launchpad Entry: Essex
Created: 2011-09-20
Contributors: Fred Yang
Summary
In a cloud computing environment, there can be thousands of compute nodes spread across different geographical or remote locations. Cloud subscribers may require their applications or virtual machines to run only on compute nodes that have been verified as running known-good hypervisors, to ensure the trustworthiness of the running environment. This feature enables cloud hosting providers to build trusted computing pools based on hardware-based security features, such as Intel Trusted Execution Technology (TXT). Combined with an external, standalone, web-based remote attestation server, which is developed as a separate open source project (i.e. "remote attestation"), providers can ensure that a compute node is running software with verified measurements and thereby establish the foundation for a secure cloud stack. Through Trusted Computing Pools, cloud subscribers can request that services run on verified compute nodes.
The remote attestation server verifies nodes through the following steps:
1. Compute nodes boot with Intel TXT enabled
2. The compute node's BIOS, hypervisor and OS are measured
3. The measurements are sent to the attestation server when the attestation server challenges the node
4. The attestation server verifies those measurements against a database of known-good values to determine the node's trustworthiness
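Step 4 above can be illustrated with a minimal sketch. The data layout here is an assumption made for illustration only: the known-good database is modelled as a dictionary of component names to sets of accepted digests, the digest strings are placeholders, and `verify_node` is a hypothetical name, not part of the attestation server's API.

```python
# Hypothetical known-good database: component -> accepted measurement digests.
# The digest strings are placeholders, not real TXT/TPM measurements.
KNOWN_GOOD = {
    "bios": {"a1b2"},
    "hypervisor": {"c3d4", "c3d5"},
    "os": {"e5f6"},
}


def verify_node(measurements, known_good=KNOWN_GOOD):
    """Return 'trusted' only if every required component matches a known-good value.

    measurements: dict mapping component name -> reported digest.
    A missing or unknown measurement makes the node 'untrusted'.
    """
    for component, good_values in known_good.items():
        if measurements.get(component) not in good_values:
            return "untrusted"
    return "trusted"
```

A node that omits any of the three measurements is treated the same as a node that reports an unknown value, so the check fails closed.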
Release Note
Cloud providers who deploy Trusted Computing Pools can offer premium services to users who require their services to run only on compute nodes that have been verified as running known-good hypervisors. Users will have the option to specify that services run on compute nodes with a verified environment. This set of enhancements will not impact users consuming the HTTP OSAPI.
Rationale
A cloud computing pool can involve thousands of compute nodes at different geographical locations, which makes it hard for cloud providers to establish any one node's trustworthiness. With an enhancement that queries a remote attestation service, combined with Intel TXT, the OpenStack scheduler can place VMs on compute nodes running verified software
User Stories
Users have the option to specify that their services run on compute nodes within the trusted computing pools
Assumptions
Trusted Computing Pools take advantage of the flavor-based filter mechanism and its corresponding APIs to support filtering on compute nodes' trustworthiness, such as BaseScheduler(), which invokes JsonFilter() when selecting compute nodes as scheduling candidates
Design
The design adds a new flavor-based filter that lets users specify trust_lvl=trusted or trust_lvl=untrusted as a filter option when selecting compute nodes for a service. The new components are:
- A new host filter driver that filters compute nodes by the trust_lvl capability in each ZoneManager's 'Capability' list
- A periodic service, per ZoneManager, that retrieves all compute nodes' trust state from the attestation server and populates the data into the ZoneManager's 'Capability' list
  - A node's trustworthiness can't be trusted if it is reported directly by the node itself; rather, it should be verified and provided by a trusted third party, such as the attestation server
- An API service routine that connects to the remote attestation server over HTTPS to retrieve nodes' trust state
Implementation
Three key components are built:
- nova.scheduler.filters.json_filter_integrity - a new filter driver, derived from the JsonFilter class, that supports JSON filtering and adds trustworthiness filtering when selecting compute nodes based on zone_manager.service_states{}
- nova.scheduler.manager_integrity - the code hooked into nova.scheduler.manager as a periodic task that runs every FLAGS.periodic_interval to retrieve compute nodes' trust state from the attestation server and populate it into zone_manager.service_states{} for json_filter_integrity.Jsonfilter
- nova.scheduler.attestation.service - the logic that wraps the RESTful APIs used to ask the standalone attestation server to verify a compute node's trustworthiness
1. Manager_integrity:
IntegrityService() hooks into manager.SchedulerManager.periodic_tasks() and runs every FLAGS.periodic_interval seconds to build nodes' trust_lvl information into zone_manager.service_states{}. Key logic is as follows:
- zone_manager.snapshots{} tracks the latest updated_at and report_count data reported by compute nodes
- On each IntegrityService() execution, db.service_get_all_by_topic(context, 'compute') is invoked to locate all the compute nodes. Each node's current updated_at value is then checked against snapshots{} to identify the nodes that have missed a report tick since the last check
- A request is posted to the attestation service for any node that has missed a tick
- The trust state retrieved from the attestation server is then populated into zone_manager.service_states{}. The trust state is also tracked separately in trust_cache{}
After the scheduler restarts, it takes (2 * FLAGS.periodic_interval) seconds for IntegrityService() to populate trust states into service_states{} before the scheduler can find nodes to dispatch tasks to
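The periodic logic described in this section could look roughly like the sketch below. It is not the actual IntegrityService() code. The function name `refresh_trust_states`, the plain-dict service records, the injected `attest` callable, and the rule for what counts as a "missed tick" (updated_at did not advance, or report_count went backwards) are all assumptions made for illustration. Storing the result under a `trust_state` key follows the `$trust_state.trust_lvl` query used by the filter.

```python
def refresh_trust_states(services, snapshots, service_states, trust_cache, attest):
    """One run of a hypothetical periodic trust refresh.

    services:       list of dicts with 'host', 'updated_at', 'report_count'
    snapshots:      dict host -> (updated_at, report_count) from the previous run
    service_states: dict host -> capabilities dict read by the host filter
    trust_cache:    dict host -> last trust level returned by the attestation server
    attest:         callable(host) -> 'trusted' or 'untrusted' (stands in for the
                    HTTPS call to the attestation server)
    """
    for svc in services:
        host = svc["host"]
        seen = (svc["updated_at"], svc["report_count"])
        previous = snapshots.get(host)

        # Assumed definition of a missed tick: the node is new, its timestamp
        # did not advance, or its report counter was reset.
        missed = (
            previous is None
            or seen[0] <= previous[0]
            or seen[1] < previous[1]
        )
        snapshots[host] = seen

        # Only contact the attestation server when the node needs re-verification.
        if missed or host not in trust_cache:
            trust_cache[host] = attest(host)

        # Publish the trust level where the filter can query it.
        capabilities = service_states.setdefault(host, {})
        capabilities["trust_state"] = {"trust_lvl": trust_cache[host]}
```

Passing `attest` in as a parameter keeps the refresh logic independent of the network client, which also makes it straightforward to unit test with a stub.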
2. Json_filter_integrity
Provides JsonFilterIntegrity(), which takes the request_spec from OSAPI select() and filters compute nodes through service_states{}, for example:
- trust_lvl = trusted, expressed as ['=', '$trust_state.trust_lvl', 'trusted'], or
- trust_lvl = untrusted
- For euca run_instance commands, the pre-defined attributes are converted to JSON format for processing
- filter_hosts() adds "trust_lvl = untrusted" to the request_spec if no trust_lvl is specified
- To ensure that the trustworthiness information in service_states{} reflects a compute node's current trust state, filter_hosts() also verifies each node against its internal trust_cache{} after the nodes have been filtered through service_states{}. It further ensures that the nodes have not been rebooted since the last FLAGS.report_interval.
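The filtering behaviour described above can be sketched as follows. This is a simplified stand-in, not the JsonFilterIntegrity class itself: it handles only the '=' operator, the helper names `resolve` and `matches` are hypothetical, and the reboot check against FLAGS.report_interval is left out.

```python
# Query applied when the request does not specify a trust level.
DEFAULT_QUERY = ["=", "$trust_state.trust_lvl", "untrusted"]


def resolve(token, capabilities):
    """Resolve a '$a.b' reference against nested dicts; return literals unchanged."""
    if not (isinstance(token, str) and token.startswith("$")):
        return token
    value = capabilities
    for key in token[1:].split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def matches(query, capabilities):
    """Evaluate a single ['=', left, right] query (the only operator sketched here)."""
    op, left, right = query
    if op != "=":
        raise ValueError("only '=' is handled in this sketch")
    return resolve(left, capabilities) == resolve(right, capabilities)


def filter_hosts(service_states, trust_cache, query=None):
    """Return hosts that satisfy the query and whose state agrees with trust_cache."""
    query = query or DEFAULT_QUERY
    selected = []
    for host, capabilities in sorted(service_states.items()):
        if not matches(query, capabilities):
            continue
        # Cross-check the advertised level against the separately tracked cache.
        advertised = resolve("$trust_state.trust_lvl", capabilities)
        if trust_cache.get(host) == advertised:
            selected.append(host)
    return selected
```

In this sketch a host whose advertised trust level disagrees with the cache is dropped rather than re-attested, which is the conservative choice for a scheduling decision.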
3. Attestation.service:
Provides a RESTful API connection to the HTTPS attestation server, with the server's identity verified through its certificate. Access to the attestation APIs requires user authentication
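As a rough illustration of such a client, the sketch below uses only the Python standard library. Several details are assumptions, since the attestation server's API is not described here: the request path `ATTEST_PATH` is a placeholder, HTTP Basic is assumed as the authentication scheme, and the JSON request and response shapes are hypothetical.

```python
import base64
import http.client
import json
import ssl

ATTEST_PATH = "/example/attest"  # hypothetical path; the real API path is not given here


def build_auth_headers(user, password):
    """Build request headers, assuming HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Authorization": "Basic " + token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }


def query_trust_state(server, port, ca_file, user, password, hosts):
    """POST a list of host names and return the decoded JSON reply.

    The server certificate is verified against ca_file, which corresponds to
    the --attestation_server_ca_file flag.
    """
    context = ssl.create_default_context(cafile=ca_file)
    conn = http.client.HTTPSConnection(server, port, context=context, timeout=30)
    try:
        conn.request(
            "POST",
            ATTEST_PATH,
            body=json.dumps({"hosts": hosts}),  # hypothetical payload shape
            headers=build_auth_headers(user, password),
        )
        response = conn.getresponse()
        if response.status != 200:
            raise RuntimeError("attestation request failed: %d" % response.status)
        return json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()
```

Because the SSL context is created with an explicit CA file, a server presenting a certificate that does not chain to that CA is rejected before any credentials are sent.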
4. Configuration Flags
The following nova.conf settings enable trusted computing pools:
--scheduler_manager=nova.scheduler.manager_integrity.SchedulerManagerIntegrity
--compute_scheduler_driver=nova.scheduler.least_cost.LeastCostScheduler
--default_host_filter=JsonFilterIntegrity
--attestation_server=Attestation.OpenStack.org
--attestation_port=8443
--attestation_server_ca_file=/root/OpenStack/attestation_Ca_file
--attestation_auth_user=AuthUserId
--attestation_auth_passwd=AuthPasswd
Where:
--scheduler_manager is set to nova.scheduler.manager_integrity.SchedulerManagerIntegrity so that IntegrityService() is invoked periodically to record compute nodes' trustworthiness in zone_manager.service_states{}
--compute_scheduler_driver=nova.scheduler.least_cost.LeastCostScheduler is an example in which the Least-Cost scheduler inherits from BaseScheduler(), which invokes the JsonFilter specified through --default_host_filter
--default_host_filter=JsonFilterIntegrity selects the trust_lvl filter driver
--attestation_server=Attestation.OpenStack.org specifies the attestation server's name
--attestation_port=8443 specifies the HTTPS port exported by the attestation server
--attestation_server_ca_file=/root/OpenStack/attestation_Ca_file specifies the certificate file used to verify the attestation server's identity
--attestation_auth_user=AuthUserId is used by the attestation server to verify that the incoming connection is from a valid user
--attestation_auth_passwd=AuthPasswd is the password for that user
UI Changes
There should be no visible changes for end users; the work is behind the API servers
Code Changes
Code changes should be isolated from the existing API, compute and scheduler modules. Instead, new code modules are added and take effect only when the Trusted Computing Pools feature is configured
Migration
Coming soon once implementation nears beta
Test/Demo Plan
Unit tests will be provided as part of the enhancements. Integration and large-scale testing can be added once the infrastructure exists
Unresolved Issues
None
BoF agenda and Discussion
The following relevant sessions were discussed at the Diablo design summit
https://blueprints.launchpad.net/nova/+spec/trusted-computing-pools
http://www.runmapglobal.com/blog/infrastructure-service-iaas/