Trio2o

Trio2o provides an API gateway for multiple OpenStack clouds, spanning one site, multiple sites, or a hybrid cloud, so that they act as a single OpenStack cloud.

Use Cases

Whether multiple OpenStack instances should be exposed through a single endpoint is up to the cloud operator's preference. Some operators may prefer a single endpoint, others may not. There are many scenarios in which a cloud contains multiple OpenStack instances.

Large scale cloud

Compared with Amazon, the scalability of OpenStack is still not good enough. One Amazon AZ can support more than 50,000 servers (http://www.slideshare.net/AmazonWebServices/spot301-aws-innovation-at-scale-aws-reinvent-2014).

Cells is a good enhancement for Nova scalability, but it has several shortcomings:

  1. Using RPC for inter-data-center communication makes inter-DC troubleshooting and maintenance difficult and raises critical operational issues. There is no CLI, RESTful API, or other tool to manage a child cell directly, so if the link between the API cell and a child cell is broken, the child cell in the remote edge cloud becomes unmanageable, either locally or remotely.
  2. Securing inter-site RPC communication is challenging. Please refer to the slides[1] for challenge 3, "Securing OpenStack over the Internet": over 500 pin holes had to be opened in the firewall to make this work, including ports for VNC and SSH for CLIs. Using RPC in Cells for edge clouds faces the same security challenges.
  3. Only Nova supports Cells, but Nova is not the only service that needs to support edge clouds; Neutron and Cinder should be taken into account too. How would Neutron support service function chaining in edge clouds? Using RPC? How would it address the challenges mentioned above? And Cinder?
  4. Using RPC for production integration of hundreds of edge clouds is quite challenging, since these edge clouds may be bought from multiple vendors, as hardware, software, or both.

From the experience of production large scale public clouds, a large scale cloud can be built by expanding capacity step by step (intra-AZ and inter-AZ). The challenge in capacity expansion is how to do the sizing:

  • Number of Nova API servers…
  • Number of Cinder API servers…
  • Number of Neutron API servers…
  • Number of schedulers…
  • Number of conductors…
  • Specification of physical switches…
  • Size of storage for images…
  • Size of management plane bandwidth…
  • Size of data plane bandwidth…
  • Reservation of rack space…
  • Reservation of networking slots…
  • …

You have to estimate, calculate, monitor, simulate, test, and do online grey expansion for controller nodes and network nodes whenever you add new machines to the cloud. The difficulty is that you cannot test and verify at every size.

The feasible way to expand a large scale cloud is to add already tested building blocks. That means we would prefer to build a large scale cloud by adding tested OpenStack instances (including controller and compute nodes) one by one, rather than unconditionally enlarging the capacity of a single OpenStack instance. This keeps the cloud construction under control.

Even when a large scale cloud is built by adding tested OpenStack instances one by one, end users and PaaS layers still want to use the OpenStack API with already developed CLIs, SDKs, portals, PaaS, Heat, Magnum, Murano, etc. This way of building a large scale public cloud therefore brings new requirements to an OpenStack based cloud, quite similar to those of massive distributed edge clouds:

  • Single endpoint for the large scale cloud
  • Distributed quota management
  • Global resource view of the tenant
  • Tenant level Volume/VM migration/backup
  • Multi-DC image import/clone/export
  • ...


Massive distributed edge cloud

Building massive distributed edge clouds in edge data centers, with computing and storage close to end users, is now emerging for enterprise applications, NFV services, and personal services.

Enterprise Application

Some enterprises have also found issues with applications running in a remote centralized cloud, for example video editing, 3D modeling applications, and IoT services, which are sensitive to bandwidth and latency.

The high bandwidth and low latency provided by the edge cloud are critical for enterprise level applications like video editing, 3D modeling, AR/VR, IoT services, etc.

In an enterprise, most employees work in different branches and access the nearby edge cloud, and collaboration among employees from different branches leads to requirements for cross edge cloud functionality, like data distribution and migration.

NFV and Edge Cloud Service

NFV (network function virtualization) will provide more flexible and better customized networking capability, for example dynamic, customized network bandwidth management, and will also help to move computing and storage close to end users. With the shortest path from the end users to the storage and computing, the uplink speed can be higher and bandwidth consumption can be terminated as early as possible. This brings a better user experience and changes the way content is generated and stored: in real time, with all data in the cloud.

For example, a user or enterprise can dynamically ask for high bandwidth and storage to temporarily stream HD video or AR/VR data into the cloud, then ask for more computing resources to do the post processing after streaming finishes, and finally re-distribute the video to other sites. And when a user wants to move or re-distribute an application and its data from one edge cloud to another, they should be able to dynamically ask for cross edge cloud bandwidth managed by NFV.

Personal service

The current Internet is good at down-link services: all content is stored in remote centralized data centers, and access is to some extent accelerated with CDNs.

As more and more users generate content that is uploaded or streamed to the cloud and to web sites, this content and data still have to be uploaded or streamed to centralized data centers; the path is long and the bandwidth is limited and slow. For example, it is very slow to upload or stream HD/2K/4K video for every user concurrently. Pictures and videos have to be uploaded slowly and with quality loss, so using the cloud as the primary storage for user data is not yet an option; currently it is mainly used for backup and for data that is not time or latency sensitive. Video captured and stored with quality loss can even make it difficult to provide crime evidence or serve other purposes. The last mile of network access (fixed or mobile) is wide enough; the main hindrance is that bandwidth in the MAN (Metropolitan Area Network), backbone, and WAN is limited and expensive. Real time video/data uploading or streaming from the end user or terminal to the local edge cloud is therefore a very attractive cloud service.

Requirements

The emerging massive distributed edge clouds should not be mere cloud islands; they bring new requirements:

  • Single endpoint for a large number of small edge clouds in specific deployment scenarios
  • Tenant level Volume/VM/object storage backup/migration/distribution
  • Distributed image management
  • Distributed quota management
  • ...

OpenStack API enabled hybrid cloud

Refer to https://wiki.openstack.org/wiki/Jacket
There is a strong requirement for a single endpoint in the hybrid cloud scenario.

Detailed use cases can be found in this presentation: https://docs.google.com/presentation/d/16laTyn4ra-446v4p0kwMnpgHqwzMsz1r6QeiSI2Kq2M/

More technical use cases can be found in the communication material used for the Tricircle big-tent project application, which also covers the single endpoint requirement in some deployments: https://docs.google.com/presentation/d/1Zkoi4vMOGN713Vv_YO0GP6YLyjLpQ7fRbHlirpq6ZK4/edit?usp=sharing

Overview

Trio2o provides an API gateway for multiple OpenStack clouds, spanning one site, multiple sites, or a hybrid cloud, so that they act as a single OpenStack cloud.

Trio2o and the managed OpenStack instances use a shared KeyStone (with centralized or distributed deployment) or federated KeyStones for identity management. Trio2o presents one big region to the end user in KeyStone. Each managed OpenStack instance, called a pod, is a sub-region of Trio2o in KeyStone and is usually not visible to the end user directly.

Trio2o acts as an OpenStack API gateway: it handles OpenStack API calls, schedules a proper OpenStack instance if needed while handling the calls, and forwards the API calls to the appropriate OpenStack instance.

The end user sees availability zones (AZs) and uses an AZ to provision VMs and volumes through Trio2o. One AZ can include many OpenStack instances, and Trio2o schedules and binds an OpenStack instance for the tenant inside that AZ. A tenant's resources can be bound automatically to multiple specific bottom OpenStack instances in one or multiple AZs.
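
Because Trio2o exposes the standard OpenStack APIs, already developed SDK code can target it unchanged. Below is a minimal sketch, assuming a Trio2o Nova API-GW registered as the compute endpoint in KeyStone and an availability zone named az1; the URLs, credentials, and resource IDs are placeholders, not values from this project.

  from keystoneauth1 import loading, session
  from novaclient import client as nova_client

  # Authenticate against the KeyStone shared by Trio2o and the bottom
  # OpenStack instances (all values below are placeholders).
  loader = loading.get_plugin_loader('password')
  auth = loader.load_from_options(
      auth_url='http://keystone.example.com:5000/v3',
      username='demo',
      password='secret',
      project_name='demo',
      user_domain_name='Default',
      project_domain_name='Default')
  sess = session.Session(auth=auth)

  # The compute endpoint resolved from the service catalog is assumed to be
  # the Trio2o Nova API-GW rather than a single bottom OpenStack instance.
  nova = nova_client.Client('2.1', session=sess)

  # Boot a VM in availability zone "az1"; Trio2o schedules (or reuses) a
  # bound bottom OpenStack instance inside that AZ and forwards the request.
  server = nova.servers.create(
      name='demo-vm',
      image='IMAGE_UUID',      # placeholder image UUID
      flavor='FLAVOR_ID',      # placeholder flavor id
      availability_zone='az1')
  print(server.id)

The same pattern applies to Cinder: the Cinder API-GW and Nova API-GW co-locate a VM's volumes in the same bottom OpenStack instance.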

Trio2o is derived from the old Tricircle and is dedicated to the API gateway functionality.

Trio2o could be extended with more powerful capabilities, such as virtually splitting a Trio2o instance into multiple micro instances, which would give users finer-grained control over tenancy and service. Trio2o also enables an OpenStack based hybrid cloud.

Architecture

Trio2o is designed as a standalone API gateway service, decoupled from existing OpenStack services like Nova and Cinder. The design blueprint is being developed, with ongoing improvement, in https://docs.google.com/document/d/1cmIUsClw964hJxuwj3ild87rcHL8JLC-c7T-DUQzd4k/


[Figure: Trio2o architecture design]


A shared KeyStone (centralized or distributed deployment) or federated KeyStones can be used for identity management for Trio2o and the managed OpenStack instances. Trio2o presents one big region to the end user in KeyStone, and each OpenStack instance, called a pod, is a sub-region of Trio2o in KeyStone, usually not visible to the end user directly.
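
As an illustration of this region layout (a sketch only, not the project's documented setup procedure), the pods could be registered as sub-regions in KeyStone with python-keystoneclient as below. The region names RegionOne, Pod1, Pod2 and all credentials are placeholders; pod management itself goes through the Trio2o Admin API described below.

  from keystoneauth1 import loading, session
  from keystoneclient.v3 import client as ks_client

  # Admin credentials and the KeyStone URL are placeholders.
  loader = loading.get_plugin_loader('password')
  auth = loader.load_from_options(
      auth_url='http://keystone.example.com:5000/v3',
      username='admin',
      password='secret',
      project_name='admin',
      user_domain_name='Default',
      project_domain_name='Default')
  keystone = ks_client.Client(session=session.Session(auth=auth))

  # 'RegionOne' stands for the one big region Trio2o presents to end users;
  # each bottom OpenStack instance (pod) is registered as a sub-region of it.
  keystone.regions.create(id='RegionOne', description='Trio2o top region')
  keystone.regions.create(id='Pod1', parent_region='RegionOne',
                          description='bottom OpenStack instance 1')
  keystone.regions.create(id='Pod2', parent_region='RegionOne',
                          description='bottom OpenStack instance 2')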

  • Nova API-GW
  1. A standalone web service that receives all Nova API requests and routes each request to the appropriate bottom OpenStack instance according to the availability zone (during creation) or the VM's UUID (during operation and query). If there is more than one OpenStack instance in an availability zone, it schedules one, forwards the request to that OpenStack instance, and builds the binding relationship between the tenant ID and the OpenStack instance (a simplified routing sketch follows after this list).
  2. Works as a stateless service and can run with its processes distributed across multiple hosts.
  • Cinder API-GW
  1. A standalone web service that receives all Cinder API requests and routes each request to the appropriate bottom OpenStack instance according to the availability zone (during creation) or the resource ID, such as a volume/backup/snapshot UUID (during operation and query). If there is more than one OpenStack instance in an availability zone, it schedules one, forwards the request to that OpenStack instance, and builds the binding relationship between the tenant ID and the OpenStack instance.
  2. The Cinder API-GW and Nova API-GW make sure that the volumes for the same VM are co-located in the same OpenStack instance.
  3. Works as a stateless service and can run with its processes distributed across multiple hosts.
  • Admin API
  1. Manages sites (bottom OpenStack instances) and availability zone mappings.
  2. Retrieves object UUID routing.
  3. Exposes APIs for maintenance.
  • XJob
  1. Receives and processes cross-OpenStack functionality and other asynchronous jobs from the Nova API-GW, Cinder API-GW, or Admin API.
  2. Any of the Admin API, Nova API-GW, or Cinder API-GW can send an asynchronous job to XJob through the message bus, using the RPC API provided by XJob.
  • Database
  1. Trio2o has its own database to store pods, pod bindings, jobs, and resource routing tables.
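
To make the routing and binding flow above more concrete, here is a highly simplified, hypothetical sketch. It is not the Trio2o source code: the class, the in-memory tables (which in Trio2o live in its database), the scheduling policy, and the helpers forward_to_pod and send_async_job_to_xjob are illustrative placeholders only.

  # Hypothetical illustration of the Nova API-GW routing/binding behaviour
  # described above; names and structures are not taken from the Trio2o code.

  class PodBindings:
      """Tenant-to-pod bindings and resource routing (kept in the Trio2o DB)."""

      def __init__(self, az_pods):
          self.az_pods = az_pods          # AZ name -> list of pod endpoints
          self.tenant_binding = {}        # (tenant_id, az) -> pod endpoint
          self.resource_routing = {}      # resource UUID -> pod endpoint

      def pod_for_creation(self, tenant_id, az):
          # Reuse the pod already bound to this tenant in this AZ; otherwise
          # schedule one pod of the AZ and record the binding.
          key = (tenant_id, az)
          if key not in self.tenant_binding:
              self.tenant_binding[key] = self._schedule(self.az_pods[az])
          return self.tenant_binding[key]

      def pod_for_resource(self, resource_uuid):
          # Operations and queries are routed by the resource's UUID.
          return self.resource_routing[resource_uuid]

      @staticmethod
      def _schedule(pods):
          return pods[0]                  # placeholder scheduling policy

  def forward_to_pod(pod, method, path, body):
      """Placeholder: proxy the REST call to the bottom OpenStack instance."""
      raise NotImplementedError

  def send_async_job_to_xjob(job_type, **kwargs):
      """Placeholder: cast an asynchronous job to XJob over the message bus."""
      raise NotImplementedError

  def handle_boot_request(bindings, tenant_id, az, boot_body):
      pod = bindings.pod_for_creation(tenant_id, az)
      server = forward_to_pod(pod, 'POST', '/servers', boot_body)
      # Remember which pod owns this VM so later operations and queries on
      # its UUID can be routed without re-scheduling.
      bindings.resource_routing[server['id']] = pod
      # Slow or cross-OpenStack follow-up work is handed to XJob
      # asynchronously instead of blocking the API response.
      send_async_job_to_xjob('post_boot_tasks', server_id=server['id'])
      return server

The Cinder API-GW follows the same pattern and consults the same bindings, so that a tenant's volumes land in the same bottom OpenStack instance as the VMs they attach to.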

For Glance deployment, there are several choices:

  • Shared Glance, if all OpenStack instances are located inside one high bandwidth, low latency site.
  • Shared Glance with a distributed back-end, if OpenStack instances are located in several sites.
  • Distributed Glance deployment: the Glance service is deployed in multiple sites with a distributed back-end.
  • Separate Glance deployment: each site is installed with its own Glance instance and back-end, when no cross-site image sharing is needed.

Value

The motivation to develop the Trio2o open source project:

  • The OpenStack API ecosystem is preserved: CLIs, SDKs, Heat, Murano, Magnum, etc. can all be reused seamlessly.
  • Support for modularized capacity expansion in a large scale cloud.
  • Tenant level quota control across OpenStack instances.
  • Global resource usage view across OpenStack instances.
  • User level KeyPair management across OpenStack instances.
  • Tenant's data movement across OpenStack instances thanks to the tenant level L2/L3 networking.
  • ...

Installation and Play

Refer to the installation guide in https://github.com/openstack/trio2o for single node or two node setup using DevStack.

Resources

  • Design documentation: Trio2o Design Blueprint, https://docs.google.com/document/d/1cmIUsClw964hJxuwj3ild87rcHL8JLC-c7T-DUQzd4k/
  • Wiki: https://wiki.openstack.org/wiki/trio2o
  • Source: https://github.com/openstack/trio2o
  • Bugs: http://bugs.launchpad.net/trio2o
  • Blueprints: https://launchpad.net/trio2o
  • Review Board: https://review.openstack.org/#/q/project:openstack/trio2o
  • Weekly meeting IRC channel: #openstack-meeting on irc.freenode.net, every Wednesday from UTC 13:00 to UTC 14:00
  • Weekly meeting IRC log: https://wiki.openstack.org/wiki/Meetings/Trio2o
  • Trio2o project IRC channel: #openstack-trio2o on irc.freenode.net
  • Trio2o project IRC channel log: http://eavesdrop.openstack.org/irclogs/%23openstack-trio2o/
  • Mail list: openstack-dev@lists.openstack.org, with [openstack-dev][trio2o] in the mail subject
  • New contributor's guide: http://docs.openstack.org/infra/manual/developers.html

Trio2o is designed to use the same tools for submission and review as other OpenStack projects. As such we follow the OpenStack development workflow (http://docs.openstack.org/infra/manual/developers.html#development-workflow). New contributors should follow the getting started steps (http://docs.openstack.org/infra/manual/developers.html#getting-started) before proceeding, as a Launchpad ID and signed contributor license are required to add new entries.

How to read the source code

To read the source code, it's much easier if you follow this blueprint:

Implement Stateless Architecture: https://blueprints.launchpad.net/tricircle/+spec/implement-stateless

This blueprint built Tricircle from scratch and also served as the code base for the Trio2o project.

History

Q: What is the difference between Trio2o, Tricircle and OpenStack Cascading?

OpenStack Cascading was mainly a solution used in a PoC done in late 2014 and early 2015, which aimed to test the idea that multiple OpenStack instances COULD be deployed across multiple geo-diverse sites and managed by an OpenStack API layer built on OpenStack services. After the PoC was carried out successfully, the team planned to contribute the core idea to the community.

Tricircle was born out of that idea. Unlike the V1 OpenStack Cascading solution implemented in the PoC, which had plenty of twists and plumbing for feature enhancements, Tricircle in its earliest stage tried to build a clean architecture that is extendable, pluggable, and reusable in nature; it included both the OpenStack API gateway and the networking automation functionality.

In September 2016, based on feedback from the TC on the Tricircle big-tent application, Trio2o, which provides the API gateway functionality, was split out of Tricircle. This makes Tricircle dedicated to networking automation across Neutron.

Tricircle: dedicated to cross-Neutron networking automation in multi-region OpenStack deployments; runs with or without Trio2o.

Trio2o: dedicated to providing an API gateway for those who need a single Nova/Cinder API endpoint in a multi-region OpenStack deployment; runs with or without Tricircle.

The wiki for Tricircle before splitting is linked here: https://wiki.openstack.org/wiki/tricircle_before_splitting

Q: Where is the source code from?

The Trio2o source code was forked from Tricircle and then cleaned up:

https://etherpad.openstack.org/p/Trio2oCleaning

To do list

The to do list is in this etherpad: https://etherpad.openstack.org/p/Trio2oToDo

Sync patches from Tricircle for the Nova API-GW/Cinder API-GW part: http://lists.openstack.org/pipermail/openstack-dev/2016-December/108552.html

Meeting minutes and logs

All meeting logs and minutes can be found at:
2016: http://eavesdrop.openstack.org/meetings/trio2o/2016/

Team Members

Contact team members in the IRC channel: #openstack-trio2o

Current active participants

Joe Huang, Huawei

Khayam Gondal, Dell

Shinobu Kinjo, RedHat

Ge Li, China UnionPay

Vega Cai, Huawei

Pengfei Shi, OMNI Lab

Bean Zhang, OMNI Lab

Yipei Niu, Huazhong University of Science and Technology

Ronghui Cao, Hunan University

Xiongqiu Long, Hunan University

Zhuo Tang, Hunan University

Liuzyu, Hunan University

Jiawei He, Hunan University

KunKun Liu, Hunan University

Yangkai Shi, Hunan University

Yuquan Yue, Hunan University

Howard Huang, Huawei