Tricircle

Tricircle is a project offering a solution that provides an OpenStack API gateway, including networking automation, when deploying multiple OpenStack instances in one site, across multiple sites, or in a hybrid cloud.

Use Cases

* Massive distributed edge clouds

The current Internet is good at handling downlink traffic. All content is stored in centralized data centers, and to some extent access is accelerated by CDNs.

As more and more user-generated content is uploaded or streamed to the cloud and to web sites, that content and data still has to travel to a few big data centers; the path is long and the bandwidth limited and slow. For example, it is too slow for every user to concurrently upload or stream HD/2K/4K video; both pictures and videos have to be uploaded slowly and with quality loss, so the cloud is not yet a first-choice storage for user data and today mainly serves backup and non-time-sensitive data. Video captured and stored with quality loss may even fail to serve as crime evidence or for other purposes. The last mile of network access (fixed or mobile) is wide enough; the main hindrance is that bandwidth in the MAN (Metropolitan Area Network), backbone and WAN is limited and expensive.

In the telecom area, building massive distributed edge clouds, that is, edge data centers with computing and storage close to the end user, is now emerging; NFV with more flexible and customized networking capabilities will provide better networking functionality and also help move the computing and storage closer to the user. With the shortest path from the end user to storage and computing, the uplink speed can be higher and bandwidth consumption is terminated as early as possible, which brings a better user experience and changes the way content is generated and stored: in real time, with all data in the cloud. For example, a user or enterprise can dynamically request high bandwidth and storage to temporarily stream HD video/AR/VR data into the cloud, then request more computing resources for post-processing, and finally redistribute the video to other sites.

VNFs, applications and storage in the edge cloud provide a better experience for the end user, and moving or distributing them from one edge data center to another is also needed. For example, video I take while travelling in Hawaii is stored and processed locally there, but after processing I want it moved to Shenzhen when I return to China. In Shenzhen, I may want to share the video via a streaming service not only locally but also with friends in Shanghai and Beijing, so the data and the streaming service need to be available in Shenzhen, Shanghai and Beijing as well.

For NFV, a VNF designed to be distributed can be placed across multiple edge data centers for higher reliability and availability. For example, a vEPC can be distributed over multiple data centers by making its database fully distributed, and multiple VNFs can even be chained across edge data centers for better customized networking capabilities.

The emerging massive distributed edge clouds will not simply be a set of independent clouds; some new requirements arise:

     * L2/L3 networking across data centers
     * Volume/VM/object storage backup/migration/distribution
     * Distributed image management
     * Distributed quota management
     * ...

* Large scale cloud

Compared with Amazon, the scalability of OpenStack is still not good enough. One Amazon AZ can support more than 50,000 servers (http://www.slideshare.net/AmazonWebServices/spot301-aws-innovation-at-scale-aws-reinvent-2014).

Cells is a good enhancement, but its shortcomings are: 1) only Nova supports cells; 2) using RPC for inter-data-center communication makes inter-DC troubleshooting difficult; 3) upgrades have to deal with DB and RPC changes; 4) multi-vendor integration across different cells is difficult.

From the experience of running a large-scale public cloud in production, a large-scale cloud can only be built by expanding capacity step by step (intra-AZ and inter-AZ). The challenge in capacity expansion is how to do the sizing (a rough illustrative sketch follows the list below):

  • Number of Nova API servers
  • Number of Cinder API servers
  • Number of Neutron API servers
  • Number of schedulers
  • Number of conductors
  • Specification of the physical servers
  • Specification of the physical switches
  • Size of the image storage
  • Size of the management-plane bandwidth
  • Size of the data-plane bandwidth
  • Reservation of rack space
  • Reservation of networking slots
  • ...
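
As a purely illustrative sketch of this kind of sizing arithmetic: every ratio below is a hypothetical placeholder, not a measured or recommended value; real numbers must come from your own load tests.

  # Back-of-envelope sizing helper. All ratios are hypothetical
  # placeholders chosen only to illustrate the calculation.
  import math

  def size_control_plane(compute_nodes,
                         computes_per_api_worker=500,   # assumed ratio
                         computes_per_conductor=250,    # assumed ratio
                         computes_per_scheduler=1000):  # assumed ratio
      """Estimate controller-service counts for a given compute fleet."""
      return {
          'nova_api_workers': math.ceil(compute_nodes / computes_per_api_worker),
          'conductors': math.ceil(compute_nodes / computes_per_conductor),
          'schedulers': math.ceil(compute_nodes / computes_per_scheduler),
      }

  print(size_control_plane(5000))
  # -> {'nova_api_workers': 10, 'conductors': 20, 'schedulers': 5}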

You have to estimate, calculate, monitor, simulate, test, and perform online grey expansion of the controller and network nodes whenever you add new machines to the cloud. The difficulty is that you cannot test and verify at every size.

The feasible way to expand a large-scale cloud is to add already-tested building blocks. That means we would prefer to build a large-scale public cloud by adding tested OpenStack instances (each including controller and compute nodes) one by one, rather than growing a single OpenStack instance without constraint. This approach keeps cloud construction under control.

Building a large-scale cloud by adding tested OpenStack instances one by one leads to a tenant's resources being distributed across multiple OpenStack instances. It also brings new requirements to an OpenStack-based cloud, quite similar to those of the massive distributed edge clouds:

  • L2/L3 networking across OpenStack instances
  • Distributed quota management
  • Global resource view of the tenant
  • Volume/VM migration/backup
  • Multi-DC image import/clone/export
  • ...

* OpenStack API enabled hybrid cloud

Refer to https://wiki.openstack.org/wiki/Jacket


The detailed use cases can be found in this presentation: https://docs.google.com/presentation/d/1UQWeAMIJgJsWw-cyz9R7NvcAuSWUnKvaZFXLfRAQ6fI/edit?usp=sharing

Tricircle can also meet the demands of several working groups: Telco WG documents, Large Deployment Team Use Cases, and OPNFV Multisite Use Cases.

Overview


Tricircle enables the user to have a single management view, with only one Tricircle instance acting on behalf of all the involved OpenStack instances. Tricircle essentially serves as a central gateway for OpenStack API calls to the underlying OpenStack instances, and performs the networking automation across them.

Tricircle is the formal open source project for the OpenStack cascading solution (https://wiki.openstack.org/wiki/OpenStack_cascading_solution).

Various good sub-projects within different OpenStack projects already try to solve the same problem as Tricircle. However, when OpenStack is deployed in the real world, a suite of OpenStack projects needs to be deployed together rather than individually, which adds difficulty to a multi-OpenStack-instance deployment. On top of that, managing such a deployment can seem nearly impossible.

In Tricircle, we try to solve this problem by defining a unified approach that applies to any OpenStack project, and by providing a pluggable structure that is extensible and has minimal impact on the main in-tree code.

Tricircle could be extended to support more powerful capabilities, such as virtually splitting the central Tricircle instance into multiple micro-instances, giving the user finer granularity of tenancy and service. Tricircle also enables OpenStack-based hybrid cloud.

Architecture

The cascading solution, based on the PoC design with enhancements, is running in several production clouds such as Huawei Public Cloud in China, which gives confidence in the value of cascading. The focus here is on how to design and develop a clean cascading solution in open source.

The initial architecture in the PoC was stateful (see https://wiki.openstack.org/wiki/OpenStack_cascading_solution). The most problematic parts identified in the PoC were status synchronization for VMs, volumes, etc., UUID mapping, and coupling with existing OpenStack services such as Nova and Cinder.

Tricircle is now being developed with a stateless design to remove these challenges and to fully decouple from the OpenStack services. The improved design is discussed and developed in https://docs.google.com/document/d/18kZZ1snMOCD9IQvUKI5NVDzSASpw-QKj7l2zNqMEd3g/edit?usp=sharing

Stateless Architecture

[Figure: Tricircle improved architecture design (stateless)]


Admin API

  • exposes an API for the administrator to manage the cascading deployment
  • manages sites and availability-zone mappings
  • retrieves resource routings
  • exposes an API for maintenance (a usage sketch follows this list)
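
As a minimal sketch of driving such an API over plain HTTP: the host, port, path and payload field names below are illustrative assumptions, not the confirmed Tricircle API.

  # Hypothetical call to the Tricircle Admin API that registers a bottom
  # OpenStack site and maps it to an availability zone. URL, path and
  # field names are illustrative assumptions.
  import json
  import urllib.request

  payload = {'site': {'site_name': 'Site1', 'az_name': 'az1'}}
  req = urllib.request.Request(
      'http://tricircle-host:19999/v1.0/sites',   # assumed endpoint
      data=json.dumps(payload).encode(),
      headers={'Content-Type': 'application/json',
               'X-Auth-Token': '<keystone-token>'},
      method='POST')
  with urllib.request.urlopen(req) as resp:
      print(resp.status, resp.read().decode())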

Nova API-GW

  • a standalone web service that receives all Nova API requests and routes each request to the corresponding bottom OpenStack according to availability zone (during creation) or resource ID (during operation and query)
  • works as a stateless service, and can run as processes distributed across multiple hosts (a routing sketch follows this list)
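
A minimal sketch of the routing idea, assuming a hypothetical in-memory AZ map and routing table (in Tricircle the routing entries live in its database; all names here are illustrative):

  # Illustrative sketch: pick the bottom OpenStack endpoint by
  # availability zone on create, and by a stored routing entry on
  # later operations. All names are assumptions.
  AZ_TO_SITE = {'az1': 'http://site1:8774/v2.1',
                'az2': 'http://site2:8774/v2.1'}
  RESOURCE_ROUTING = {}  # resource id -> bottom endpoint

  def route_create(availability_zone, resource_id):
      """On creation, pick the bottom site by AZ and remember the mapping."""
      endpoint = AZ_TO_SITE[availability_zone]
      RESOURCE_ROUTING[resource_id] = endpoint
      return endpoint

  def route_operation(resource_id):
      """On later operations and queries, reuse the stored routing entry."""
      return RESOURCE_ROUTING[resource_id]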

Cinder API-GW

  • a standalone web service that receives all Cinder API requests and routes each request to the corresponding bottom OpenStack according to availability zone (during creation) or resource ID (during operation and query); the routing logic mirrors the Nova API-GW sketch above
  • works as a stateless service, and can run as processes distributed across multiple hosts

XJob

  • receives and processes cross-OpenStack functionality and other asynchronous jobs from the message bus
  • for example, when the first VM of a project is booted, the router, security group rules, FIP and other resources may not yet exist in the bottom site even though they are required; unlike the network, security group, SSH key and similar resources, which must be created before a VM boots, these resources can be created asynchronously to speed up the response to the first VM boot request
  • cross-OpenStack networking is also handled in asynchronous jobs
  • the Admin API, Nova API-GW, Cinder API-GW and Neutron Tricircle plugin can each send an asynchronous job to XJob through the message bus, using the RPC API provided by XJob (a sketch follows this list)
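
A minimal sketch of sending an asynchronous job to XJob over the message bus with oslo.messaging; the topic and the job method name ('setup_bottom_router') are assumptions for illustration.

  # Illustrative oslo.messaging cast to XJob. Topic and job method
  # name are assumptions, not the actual Tricircle RPC API.
  from oslo_config import cfg
  import oslo_messaging

  transport = oslo_messaging.get_transport(cfg.CONF)
  target = oslo_messaging.Target(topic='xjob', version='1.0')  # assumed topic
  client = oslo_messaging.RPCClient(transport, target)

  # cast() is fire-and-forget: the API-GW returns immediately while
  # XJob processes the cross-OpenStack job in the background.
  client.cast({}, 'setup_bottom_router',
              payload={'router_id': 'router-uuid', 'network_id': 'net-uuid'})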

Neutron Tricircle plugin

  • just like the OVN Neutron plugin, the Tricircle plugin serves multi-site networking, including interaction with a DCI SDN controller; it uses the ML2 mechanism driver interface to call the DCI SDN controller, especially for cross-OpenStack provider multi-segment L2 networking (a skeleton sketch follows)
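
For illustration, a skeleton of an ML2 mechanism driver that could notify a DCI SDN controller; the controller client is a hypothetical placeholder, not a real library.

  # Skeleton ML2 mechanism driver. The DCI SDN controller client is a
  # hypothetical placeholder.
  from neutron.plugins.ml2 import driver_api

  class DCISDNMechanismDriver(driver_api.MechanismDriver):
      def initialize(self):
          # Connect to the (hypothetical) DCI SDN controller here.
          self.sdn = None

      def create_network_postcommit(self, context):
          # Push the new (possibly multi-segment) provider network to the
          # SDN controller so it can stitch the L2 segments across sites.
          network = context.current
          segments = context.network_segments
          # self.sdn.create_cross_site_network(network['id'], segments)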

DB

  • Tricircle has its own database to store sites, availability-zone mappings, jobs, and resource routing tables (an illustrative model sketch follows)
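
A minimal sketch of what a resource-routing table might look like; the column names are assumptions, not the actual Tricircle schema.

  # Illustrative SQLAlchemy model for a resource-routing table.
  from sqlalchemy import Column, DateTime, Integer, String
  from sqlalchemy.ext.declarative import declarative_base

  Base = declarative_base()

  class ResourceRouting(Base):
      __tablename__ = 'resource_routings'
      id = Column(Integer, primary_key=True)
      top_id = Column(String(36), nullable=False)   # id seen by the user
      bottom_id = Column(String(36))                # id in the bottom OpenStack
      site_id = Column(String(64), nullable=False)  # which bottom site owns it
      project_id = Column(String(36))
      resource_type = Column(String(64))            # e.g. 'server', 'volume'
      created_at = Column(DateTime)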

FAQ

Q: What is the difference between Tricircle and OpenStack Cascading?

OpenStack Cascading was mainly an implementation method used in a PoC done in late 2014 and early 2015, which aimed to test the idea that multiple OpenStack instances could be deployed across multiple geo-diverse sites. After the PoC was carried out successfully, the team planned to contribute the core idea to the community.

The Tricircle project was born out of that idea, but took a different shape and focus. Unlike a typical PoC, which is full of twists and plumbing for feature enhancements, Tricircle in its earliest stage tries to build a clean architecture that is extensible, pluggable and reusable by nature.

In short, OpenStack Cascading is a specific deployment solution used for production, while Tricircle represents one type of service, like Neutron or Murano, that in the future could be applied across the OpenStack ecosystem.

Q: What is the goal of Tricircle?

In the short term, Tricircle focuses on developing a robust architecture and related features; in the long run, we hope to establish a paradigm that can be applied to the whole OpenStack community.

Q: How can I set up Tricircle by hand?

Yes. Some volunteers have successfully set up Tricircle in 3 VMs with VirtualBox on Ubuntu 14.04 LTS; a blog post describes the steps.

To do list

To do list is in the etherpad: https://etherpad.openstack.org/p/TricircleToDo

How to read the source code

To read the source code, it's much easier if you follow this blueprint:

Implement Stateless Architecture: https://blueprints.launchpad.net/tricircle/+spec/implement-stateless

This blueprint builds Tricircle from scratch.

How to contribute

  1. Clone https://github.com/openstack/tricircle
  2. Make your changes; be sure to include what changed and why
  3. Commit the change for review
  4. The changes will be reviewed and, typically, merged within a day or so.

Tricircle uses the same tools for submission and review as other OpenStack projects, so we follow the OpenStack development workflow. New contributors should complete the getting-started steps before proceeding, as a Launchpad ID and a signed contributor license agreement are required to add new entries.

The Tricircle Launchpad page can be found at https://launchpad.net/tricircle; register blueprints or report bugs there.

Community

We hold a regular weekly meeting in #openstack-meeting every Wednesday starting at 13:00 UTC.

You are also welcome to discuss issues on the openstack-dev mailing list, with [Tricircle] in the subject line. Our team members are quite responsive :)

Meeting minutes and logs

All meeting logs and minutes can be found at:
2016: http://eavesdrop.openstack.org/meetings/tricircle/2016/
2015: http://eavesdrop.openstack.org/meetings/tricircle/2015/

Team Members

Contact team members in the IRC channel: #openstack-tricircle

Current active participants

Joe Huang, Huawei

Khayam Gondal, Dell

Shinobu Kinjo, RedHat

Vega Cai, Huawei

Pengfei Shi, OMNI Lab

Bean Zhang, OMNI Lab

Yipei Niu, Huazhong University of Science and Technology

Howard Huang, Huawei