XenServer/XenServer CI/AdminTips

Status

The status report of the CI is available here: http://f0ab4dc4366795c303a0-8fd069087bab3f263c7f9ddd524fce42.r22.cf1.rackcdn.com/ci_status/results.html
The report is updated by a cron job, which is part of osci (see below).

Components

  • infrastructure.hg - the high-level installation scripts. http://hg.uk.xensource.com/openstack/infrastructure.hg
  • install-nodepool - the low-level installation scripts, used by infrastructure.hg/osci/*
  • nodepool - responsible for pre-baking nodes
  • project-config - scripts for baking the nodes
  • openstack-citrix-ci - services that run tests on the nodes
  • openstack-xenapi-testing-xva - scripts to build the Ubuntu appliance
  • xenapi-os-testing - the main entry point for the tests, plus an exclusion list
  • devstack-gate - used to run the tests

What branches are used in production

Component                                       branch/tag
citrix-openstack/install-nodepool               2014-11
citrix-openstack/nodepool                       2014-11
citrix-openstack/project-config                 2014-11
citrix-openstack/openstack-citrix-ci            2014-11
citrix-openstack/openstack-xenapi-testing-xva   1.1.4
stackforge/xenapi-os-testing                    master
citrix-openstack/devstack-gate                  master
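
For reference, a hedged sketch of fetching one of these components at its production branch/tag (assuming the citrix-openstack repositories live on GitHub, as the links elsewhere on this page suggest):

git clone https://github.com/citrix-openstack/nodepool.git
cd nodepool
git checkout 2014-11   # the branch/tag from the table above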

Description

The nodepool service launches an instance in the Rackspace cloud and runs the XenServer-related scripts from project-config/nodepool/scripts on it. The responsibilities of those scripts are:

  • Convert the instance into a XenServer
  • Install a virtual appliance inside the XenServer. The virtual appliance is created by a service running inside Citrix, using the scripts in openstack-xenapi-testing-xva; the produced appliances are stored at http://downloads.vmd.citrix.com/OpenStack/xenapi-in-the-cloud-appliances/ and in Rackspace containers as well.
  • Prepare the node for running OpenStack tests - this is "standard OpenStack magic"
  • Once preparation has finished, shut down the node

After nodepool has run those scripts, it takes a snapshot of the node so that further instances can be launched more quickly. Unfortunately, upstream nodepool is not prepared for this lifecycle, so we are running a custom version of nodepool.

The second service is citrix-ci, which gets nodes from the pool and runs https://github.com/stackforge/xenapi-os-testing/blob/master/run_tests.sh on them. This script customises the cached version of devstack-gate.

The third service is citrix-ci-gerritwatch, which listens for gerrit events and communicates with citrix-ci.

Main box

This is the instance which orchestrates the execution of tests.

  • Authentication is via ssh keys. For access, join the XenAPI team meeting on IRC.
  • All the components are deployed to this instance.
  • Created with scripts living inside infrastructure.hg/osci (a Citrix-internal repository)
  • The IP address is also stored within that repository

openstack-citrix-ci

  • source code: https://github.com/citrix-openstack/openstack-citrix-ci
  • /opt/osci/src is where sources live
  • /opt/osci/env is the python virtual environment where it's installed
  • /etc/osci/osci.config is the configuration file
  • /var/log/osci/citrix-ci-gerritwatch.log is the log file for gerritwatch
  • /var/log/osci/citrix-ci.log is the log file for osci
  • /var/log/osci/status_upload.log is the log file for the status-upload cron job
  • A separate user, osci, is used to run all the services. If you want to access the Python environment, do the following:
sudo -u osci -i
. /opt/osci/env/bin/activate
  • To update, use the scripts provided in infrastructure.hg/osci.

service: citrix-ci-gerritwatch

This service watches the gerrit stream and adds jobs to the queue

  • Logs: /var/log/osci/citrix-ci-gerritwatch.log
  • (Re)start: (re)start the service citrix-ci-gerritwatch
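
For example, a minimal sketch of restarting the service and checking its log (assuming the services are Upstart jobs, consistent with the start/restart commands used elsewhere on this page):

sudo restart citrix-ci-gerritwatch
tail -f /var/log/osci/citrix-ci-gerritwatch.log   # follow the service log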

service: citrix-ci

This service progresses jobs through the lifecycle (see below)

  • Logs: /var/log/osci/citrix-ci.log
  • (Re)start: (re)start the service citrix-ci
  • three threads:
    • main (runs through the state machine, see below)
    • collect results (a long, blocking operation, so it is split out to let the main thread keep making progress -- service net?)
    • delete node (picks up jobs once they reach Collected; deletion is a pain - it retries for up to 10 minutes, and the delete process needs fixing)
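
Similarly, a minimal sketch of restarting citrix-ci and following the state machine in its log (again assuming an Upstart-managed service):

sudo restart citrix-ci
tail -f /var/log/osci/citrix-ci.log   # follow jobs moving through the lifecycle states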

cron job: upload status

  • Uploads the CI status to swift
  • Can be disabled by touching the file /etc/osci/skip_status_update
  • To see the cron entry:
crontab -l -u osci
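
To pause and later resume the upload, a minimal sketch based on the skip file described above:

sudo touch /etc/osci/skip_status_update   # the cron job skips the upload while this file exists
sudo rm /etc/osci/skip_status_update      # remove the file to resume uploads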

Utilities

These utilities are available once you have activated the environment. Some of them are also available without activating the environment, via symlinks/wrapper scripts: osci-manage and osci-view.

  • osci-check-connection - should be removed
    • checks the connection to the node and to the xenserver, then uploads logs from both to swift
  • osci-manage
    • what the citrix-ci service runs
    • runs the main lifecycle
    • can also manually add a job to the queue
  • osci-upload
    • uploads logs from a local host directory to swift
    • called from osci-manage (via Python, not the CLI)
  • osci-watch-gerrit
    • what the citrix-ci-gerritwatch service runs
    • reads the gerrit stream and adds jobs to the DB
  • osci-create-dbschema
    • creates job DB
  • osci-run-tests - can be used to run the tests on a node
    • To print the commands that would be executed, use the "print" executor:
      osci-run-tests [print|exec] user host ref
    • To run the tests, use the "exec" executor:
      osci-run-tests exec jenkins 162.242.171.81 refs/changes/86/131386/1 openstack/nova
    • calls out to the host to run the tests
    • used by osci-manage (via Python, not the CLI)
  • osci-view
    • prints out the job DB in useful ways
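
Putting the above together, a minimal sketch of inspecting the job queue from the activated environment (osci-view list is described under Useful commands below):

sudo -u osci -i
. /opt/osci/env/bin/activate
osci-view list   # show the current job queue and states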

To report status:

  • osci-view is called and its output is uploaded to swift
  • see /opt/osci/src/openstack-citrix-ci/upload-ci-status
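
If the report needs refreshing outside the cron schedule, a hedged sketch (it assumes the script can be run stand-alone as the osci user, with no arguments):

sudo -u osci /opt/osci/src/openstack-citrix-ci/upload-ci-status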

nodepool

  • source code: https://github.com/citrix-openstack/nodepool
  • installed branch: 2014-11
  • configuration file: /etc/nodepool/nodepool.yaml
  • a separate nodepool user runs this service

service: nodepool

Provisions VMs to use in the tests

  • Logs: /var/log/nodepool/nodepool.log, /var/log/nodepool/debug.log
  • (Re)start: killall nodepool; rm /var/run/nodepool/nodepool.pid; start nodepool

Utilities

Utilities for getting information about and controlling the nodepool service.

  • To list the images:
nodepool image-list

To see the state of all the nodes:

nodepool list

To claim a node (so that nodepool does not delete or reuse it):

nodepool hold <node-id>
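
For example, a sketch of claiming a node for debugging; the node ID and IP address come from the nodepool list output, and the jenkins user is an assumption based on the osci-run-tests example above:

nodepool list                  # find the node's ID and IP address
nodepool hold 1234             # 1234 is a hypothetical node ID from the list
ssh jenkins@162.242.171.81     # hypothetical IP address, also from the list output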

project-config

These scripts are used by nodepool to prepare the nodes. Note that nodepool's configuration refers to the location of these scripts.

  • source code: https://github.com/citrix-openstack/project-config
  • branch: xenserver-ci
  • cloned to: /root/src/project-config
  • to update these scripts, you don't need to restart any services

Useful commands

  • osci-view list: Gives the current queue, what is running, etc. There shouldn't be jobs here that are older than 2 hours unless they are 'Finished'.
  • nodepool list: Gives a list of the currently available nodes. There should be some nodes that are 'Ready' or 'Building'.
  • To queue change 12345, patchset 1: eval `ssh-agent`; ssh-add ~/.ssh/citrix_gerrit; osci-manage -c 12345/1; ssh-agent -k (a spelled-out version follows below)
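
Spelled out, the queueing one-liner above (the citrix_gerrit key name is taken from that command; the change number and patchset are just examples):

eval `ssh-agent`                 # start an ssh agent for this shell
ssh-add ~/.ssh/citrix_gerrit     # load the key used to talk to gerrit
osci-manage -c 12345/1           # queue change 12345, patchset 1
ssh-agent -k                     # kill the agent again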

VM lifecycle

  • Queued -> Running: citrix-ci has got a new node from nodepool (nodepool list will show it as 'held') and runs osci-run-tests to hand over to xenapi-os-testing
  • Running -> Collecting: the job has finished; citrix-ci has changed the state to Collecting and is waiting on the log collection thread
  • Collecting -> Collected: the log collection thread has posted the logs to swift and updated the job with the logs URL
  • Collected -> Finished: citrix-ci has posted to gerrit and the job is now complete
  • <anything> -> Obsolete: a new job for the same change (a recheck or a new patchset) has been added

Code


TODO list

  • start using stackforge http://git.openstack.org/cgit/stackforge/xenapi-os-testing/
  • reduce the number of excluded tests, using the above stackforge integration
  • stop forking infra projects where possible
  • consider moving to zuul; talk to the turbo hipster folks
  • create a new environment to test out config changes
  • create a system to construct dev environments
  • trial neutron-based tests (with quark?)
  • BobBall to send johnthetubaguy an example of how to deploy the Main Box