
Difference between revisions of "ThirdPartySystems/Intel-PCI-CI-internal"

(Over View)
 
(31 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 +
https://etherpad.openstack.org/p/third-party-ci-status-tracking-intel-hardware
 
== Intel PCI CI==
 
== Intel PCI CI==
Intel PCI CI is used to ensure Openstack will run properly on Intel hardware with PCI device.
+
Intel PCI CI is used to ensure OpenStack to run properly on Intel hardware with PCI device.
  
It use Gerrit Trigger to trigger local jenkins project. Deploy OpenStack on Intel hardware and do some custom Tempest tests with parameters of booting VM with PCI feature.
+
It leverages Gerrit Trigger to trigger local Jenkins project. Deploy OpenStack on Intel hardware and do some custom Tempest tests with parameters of booting VM with PCI feature.
  
=== Over View ===
 
picture: see this link https://wiki.openstack.org/w/images/0/0d/F2Xi3Ucq.1447053596.png
 
  
And this is the CI work flow: <br/>
+
=== Overview ===
1 . CI Listens a patchset action from gerrit server. <br/>
+
[[file:F2Xi3Ucq.1447053596.png]]
2.Begin to clean local server.Including execute unstack.sh, execute clean.sh, delete related logs, kill all openstack daemons,delete devstack directory.<br/>
+
And this is the CI working flow: <br/>
3.After Clean step is over,do git pull in all directories in /opt/stack/ ,make sure the source code is the latest version.<br/>
+
Structure: Jenkins, Gerrit Trigger, devstack, Tempest(not upstream version).<br/>
4.Apply patch to Nova source code. git fetch xxxx && git checkout xxxx<br/>
+
# Jenkins master server Listens a patchset action from gerrit server,assign task to a testing server. <br/>
5.Modify devstack/lib/nova,insert PCI parameters.<br/>
+
# Begin to clean local server.Including execute unstack.sh, execute clean.sh, delete related logs, kill all openstack daemons,remove mysql packages,delete devstack directory.<br/>
6.Generate local.conf file.<br/>
+
# After Clean step is over,do git pull in all directories in /opt/stack/ ,make sure the source code is the latest version.<br/>
7.Running devstack.<br/>
+
# Apply patch to Nova source code.Use command: git fetch xxxx && git checkout xxxx<br/>
8.After devstack is done,run PCI Tempest test cases.<br/>
+
# Modify devstack/lib/nova,insert PCI parameters.<br/>
9.Report results back to gerrit server.<br/>
+
# Generate local.conf file.<br/>
(15:04:55) jyuso1: The main troubles PCI CI met:<br/>
+
# Running devstack installation.<br/>
(15:05:06) jyuso1: 1. Networking is not stable enough. Git operations sometimes fail.<br/>
+
# After devstack is done,run PCI Tempest test cases.<br/>
(15:05:07) jyuso1: Our solution: add HA PyPI mirrors, including a local mirror and an internal mirror, to avoid network outages.<br/>
+
# Report results back to gerrit server.<br/>
(15:05:07) jyuso1: Add mail alerts. If something goes wrong in the network, we'll know ASAP.<br/>
+
 
(15:05:33) jyuso1: 2. Tempest keeps merging new content, so the test cases have to be updated often. Our solution: don't use the upstream Tempest directory; just maintain local Tempest code.<br/>
+
=== Issue ===
(15:05:43) jyuso1: 3. Software on the CI server is not cleaned up entirely, which may cause CI testing to get stuck (MySQL, for example). Our solution: add specific steps to clean up such software.<br/>
+
In Intel CI environment, we need a proxy to pull git repository, pip package and apt  package. Also we will push the CI test log to  an AWS log server.<br/>
 +
Any problem during the pull or push process, Our CI will report failure to the gerrit. <br/>  
 +
We have do some improvement to our CI. <br/>
 +
But still some problems need to fix. <br/>
 +
 
 +
* especially it will failed when pull git repository.(pull git repository every test, incremental pull)
 +
* some times the proxy will disconnect.
 +
* the network speed is very slow
 +
* some times testing server got a kernel panic error make system down
 +
* other software uninstall/install error occurred  some time,like mysql
 +
 
 +
=== Our plan ===
 +
[[File:Intel-CI-improvement1.png]]
 +
 
 +
* Zabbix monitor/alarm: this will be triggered automatically to notice operation[ongoing].
 +
* Proxy HA: currently we have 3 proxy server, and they can back up each other[automation switching scripts available].
 +
* Local PIP mirror and PIP repository mirror server form a active active HA mode(done): all test machine has their own local pip mirror, this will reduce testing time.
 +
* Mail alert  (done): any network issue will be sent out to operations. We have work shift between US and China.
 +
* Automation: all operation tasking is going to be automated by Ansible. which including networking fail-over(done), recovery CI machines(partly done), and any other roles in CI system like Monitor, Alarm etc.

Latest revision as of 02:34, 16 November 2015

https://etherpad.openstack.org/p/third-party-ci-status-tracking-intel-hardware

Intel PCI CI

Intel PCI CI is used to ensure that OpenStack runs properly on Intel hardware with PCI devices.

It uses Gerrit Trigger to trigger a local Jenkins project, which deploys OpenStack on Intel hardware and runs custom Tempest tests whose parameters boot a VM with the PCI feature.


Overview

[Figure: F2Xi3Ucq.1447053596.png]
The CI workflow is as follows.
Structure: Jenkins, Gerrit Trigger, devstack, and Tempest (not the upstream version).

  1. The Jenkins master server listens for patchset events from the Gerrit server and assigns the task to a testing server.
  2. Clean the local server: run unstack.sh and clean.sh, delete the related logs, kill all OpenStack daemons, remove the MySQL packages, and delete the devstack directory.
  3. After the clean step finishes, run git pull in every directory under /opt/stack/ so that the source code is at the latest version.
  4. Apply the patch to the Nova source code with the command: git fetch xxxx && git checkout xxxx
  5. Modify devstack/lib/nova to insert the PCI parameters.
  6. Generate the local.conf file.
  7. Run the devstack installation.
  8. After devstack finishes, run the PCI Tempest test cases.
  9. Report the results back to the Gerrit server.
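
Steps 2 to 4 above can be sketched as a small command planner. This is only an illustration under stated assumptions, not the scripts the CI actually runs: the helper names are hypothetical, unstack.sh and clean.sh are assumed to live in the devstack directory, and the arguments elided as "xxxx" in step 4 are passed through as opaque parameters rather than guessed.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def clean_commands(devstack_dir: str) -> List[str]:
    """Part of step 2: stop devstack, clean it, then delete its directory.

    Assumes unstack.sh and clean.sh sit in the devstack directory.
    """
    d = shlex.quote(devstack_dir)
    return [
        f"cd {d} && ./unstack.sh",
        f"cd {d} && ./clean.sh",
        f"rm -rf {d}",
    ]


def pull_commands(stack_root: str, repos: List[str]) -> List[str]:
    """Step 3: one incremental `git pull` per repository under the stack root."""
    root = PurePosixPath(stack_root)
    return [f"git -C {shlex.quote(str(root / name))} pull" for name in sorted(repos)]


def patch_command(nova_dir: str, fetch_args: str, checkout_target: str) -> str:
    """Step 4: fetch the change under test and check it out.

    fetch_args and checkout_target stand in for the values the page elides
    as "xxxx"; they are inserted as given, without quoting.
    """
    return (
        f"cd {shlex.quote(nova_dir)} && "
        f"git fetch {fetch_args} && git checkout {checkout_target}"
    )
```

Returning the commands as strings instead of running them makes the plan easy to log in the Jenkins console before anything is executed on the testing server.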

Issue

In the Intel CI environment, we need a proxy to pull git repositories, pip packages, and apt packages. We also push the CI test logs to an AWS log server.
If any problem occurs during the pull or push process, our CI reports a failure to Gerrit.
We have made some improvements to our CI, but some problems still need to be fixed:

  • Pulling git repositories is especially prone to failure (every test run does an incremental pull of each repository).
  • The proxy sometimes disconnects.
  • The network is very slow.
  • A testing server sometimes hits a kernel panic, which brings the system down.
  • Uninstalling or installing other software, such as MySQL, sometimes fails.
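
For transient failures such as the git pull and proxy problems listed above, one common mitigation is to retry the command with an increasing delay. The page does not say that the CI does this, so the sketch below is only an assumption about how it could be done; run_with_retry and its parameters are hypothetical names.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retry(
    cmd: Sequence[str],
    attempts: int = 3,
    base_delay: float = 5.0,
    runner: Callable = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run cmd until it exits with status 0 or the attempts are used up.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    runner and sleep are injectable so the logic can be tested offline.
    """
    for attempt in range(1, attempts + 1):
        result = runner(list(cmd))
        if result.returncode == 0:
            return True
        if attempt < attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    return False
```

A call such as run_with_retry(["git", "-C", "/opt/stack/nova", "pull"]) would then report a failure to Gerrit only after every attempt has failed.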

Our plan

[Figure: Intel-CI-improvement1.png]

  • Zabbix monitoring/alarms: alarms will be triggered automatically to notify operations [ongoing].
  • Proxy HA: we currently have 3 proxy servers that can back each other up [automated switching scripts available].
  • Local pip mirror and pip repository mirror server in active-active HA mode (done): every test machine has its own local pip mirror, which reduces testing time.
  • Mail alerts (done): any network issue is mailed to operations. We work in shifts between the US and China.
  • Automation: all operations tasks are going to be automated with Ansible, including network fail-over (done), recovery of CI machines (partly done), and other roles in the CI system such as monitoring and alarms.
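
The proxy HA item above mentions automated switching scripts but does not show them. As a rough sketch of what such switching could look like, the code below probes a list of proxies in order and uses the first one that accepts a TCP connection. The function names and the host names in the usage note are hypothetical, and a TCP connect is only an assumed health check.

```python
import socket
from typing import Callable, Dict, Optional, Sequence, Tuple

Proxy = Tuple[str, int]  # (host, port)


def is_reachable(proxy: Proxy, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the proxy can be opened."""
    try:
        with socket.create_connection(proxy, timeout=timeout):
            return True
    except OSError:
        return False


def pick_proxy(
    proxies: Sequence[Proxy],
    probe: Callable[[Proxy], bool] = is_reachable,
) -> Optional[Proxy]:
    """Return the first proxy that passes the probe, or None if all fail."""
    for proxy in proxies:
        if probe(proxy):
            return proxy
    return None


def proxy_env(proxy: Proxy) -> Dict[str, str]:
    """Build the environment variables that git, pip, and apt commonly honor."""
    url = f"http://{proxy[0]}:{proxy[1]}"
    return {"http_proxy": url, "https_proxy": url}
```

With three entries such as ("proxy1.example.com", 3128), ("proxy2.example.com", 3128), and ("proxy3.example.com", 3128), pick_proxy falls through to the next server whenever an earlier one is down, and a None result is the case worth raising as an alert.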