XenServer/XenServer CI/AdminTips

= Status =
The status report of the CI is available here: http://f0ab4dc4366795c303a0-8fd069087bab3f263c7f9ddd524fce42.r22.cf1.rackcdn.com/ci_status/results.html
This report is updated by a cron job that is part of osci (see below).

= Components =
 * <tt>infrastructure.hg</tt>: contains the high-level installation scripts. http://hg.uk.xensource.com/openstack/infrastructure.hg
 * <tt>install-nodepool</tt>: contains the low-level installation scripts, used by <tt>infrastructure.hg/osci/*</tt>
 * <tt>nodepool</tt>: responsible for pre-baking nodes
 * <tt>project-config</tt>: scripts for baking the nodes
 * <tt>openstack-citrix-ci</tt>: services to run tests on the nodes
 * <tt>openstack-xenapi-testing-xva</tt>: scripts to build the Ubuntu appliance
 * <tt>xenapi-os-testing</tt>: main entry point for the tests, and an exclusion list
 * <tt>devstack-gate</tt>: used to run the tests

TODO: document which branches are used in production.
= Description =
The nodepool service launches an instance in the Rackspace cloud and runs the XenServer-related scripts from <tt>project-config/nodepool/scripts</tt>. The responsibilities of those scripts are:
 * Convert the instance to a XenServer
 * Install a virtual appliance inside the XenServer. The virtual appliance is created by a service running inside Citrix, using the scripts living at <tt>openstack-xenapi-testing-xva</tt>. The produced appliances are stored at http://downloads.vmd.citrix.com/OpenStack/xenapi-in-the-cloud-appliances/ as well as in Rackspace containers.
 * Prepare the node for running OpenStack tests - this is "standard OpenStack magic"
 * Once preparation has finished, shut down the node
After nodepool has run those scripts, it takes a snapshot of the node so that further instances can be launched more quickly. Unfortunately, upstream nodepool is not prepared for this lifecycle, so we run a custom version of nodepool.

The second service is citrix-ci, which gets nodes from the pool and runs https://github.com/stackforge/xenapi-os-testing/blob/master/run_tests.sh on them. This script customises the cached version of devstack-gate.

The third service is citrix-ci-gerritwatch, which listens for gerrit events and communicates with citrix-ci.
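The split between the watcher and the runner can be pictured with a minimal sketch: the watcher filters the gerrit event stream down to (change, patchset) jobs for the watched project. This is an illustrative simplification, not the actual osci code; the function name and the project filter are made up, though the event fields shown (<tt>type</tt>, <tt>change.project</tt>, <tt>patchSet.number</tt>) are real gerrit stream-events fields.

```python
import json

# Event types a watcher would typically react to (illustrative subset).
INTERESTING = {"patchset-created", "comment-added"}

def jobs_from_events(event_lines, project="openstack/nova"):
    """Turn a stream of gerrit JSON event lines into (change, patchset) jobs."""
    jobs = []
    for line in event_lines:
        event = json.loads(line)
        if event.get("type") not in INTERESTING:
            continue
        if event.get("change", {}).get("project") != project:
            continue
        # Real code would persist these to the job DB for citrix-ci to pick up.
        jobs.append((event["change"]["number"], event["patchSet"]["number"]))
    return jobs
```

For example, a <tt>patchset-created</tt> event for <tt>openstack/nova</tt> becomes a queued job, while a <tt>ref-updated</tt> event is ignored.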

= Main box =
This is the instance which orchestrates the execution of tests.

 * Authentication is via ssh keys. For access, join the XenAPI team meeting on IRC.
 * All the components are deployed to this instance.
 * Created with scripts living inside <tt>infrastructure.hg/osci</tt> (this is a Citrix-internal repository)
 * The IP address is also stored within that repository

= openstack-citrix-ci =

 * source code: https://github.com/citrix-openstack/openstack-citrix-ci
 * <tt>/opt/osci/src</tt> is where the sources live
 * <tt>/opt/osci/env</tt> is the python virtual environment where it is installed
 * <tt>/etc/osci/osci.config</tt> is the configuration file
 * <tt>/var/log/osci/citrix-ci-gerritwatch.log</tt> is the log file for gerritwatch
 * <tt>/var/log/osci/citrix-ci.log</tt> is the log file for osci
 * <tt>/var/log/osci/status_upload.log</tt> is the log file for the status-upload cron job
 * A separate user, <tt>osci</tt>, is used to run all the services. If you want to access the python environment, do the following:

sudo -u osci -i
. /opt/osci/env/bin/activate


 * To update, use the scripts provided at <tt>infrastructure.hg/osci</tt>.

== service: citrix-ci-gerritwatch ==
This service watches the gerrit stream and adds jobs to the queue.
 * Logs: <tt>/var/log/osci/citrix-ci-gerritwatch.log</tt>
 * (Re)start: (re)start the service <tt>citrix-ci-gerritwatch</tt>

== service: citrix-ci ==
This service progresses jobs through the lifecycle (see below).
 * Logs: <tt>/var/log/osci/citrix-ci.log</tt>
 * (Re)start: (re)start the service <tt>citrix-ci</tt>


 * The service runs three threads:
   * main: runs jobs through the state machine (see below)
   * collect results: log collection is a long, blocking operation, so it is split out so that the main thread can keep making progress
   * delete node: once a job reaches the Collected state, its node is deleted; deletion is unreliable and is retried for up to 10 minutes (the delete process needs fixing)
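Why the slow work is split onto its own thread can be shown with a minimal sketch (hypothetical names; the real implementation is in osci-manage): the main loop only hands finished jobs over via a queue, while the collector thread does the slow part, so the state machine never blocks on it.

```python
import queue
import threading

collect_queue = queue.Queue()
collected = []

def collector():
    """Drain the queue; each item stands in for a slow log upload."""
    while True:
        job = collect_queue.get()
        if job is None:          # sentinel: shut the thread down
            break
        collected.append(job)    # real code would upload logs to swift here
        collect_queue.task_done()

worker = threading.Thread(target=collector)
worker.start()

# The main loop never blocks on collection - it just hands jobs over.
for job_id in (1, 2, 3):
    collect_queue.put(job_id)

collect_queue.put(None)          # ask the collector to stop
worker.join()
```

The same pattern applies to the delete-node thread: node deletion can take minutes, so it must not hold up the main state machine.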

== cron job: upload status ==
This job runs from the osci user's crontab (see <tt>crontab -l -u osci</tt>).
 * Uploads the CI status to swift
 * Can be disabled by touching the file <tt>/etc/osci/skip_status_update</tt>
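The skip-file mechanism could be guarded along these lines (a hypothetical sketch; the actual check lives in the status-upload script, and <tt>should_upload</tt> is an invented name):

```python
import os

# Touching this file disables the status upload (path from the notes above).
SKIP_FILE = "/etc/osci/skip_status_update"

def should_upload(skip_file=SKIP_FILE):
    """Return False when the marker file exists, i.e. uploads are disabled."""
    return not os.path.exists(skip_file)
```

To re-enable the upload, simply remove the marker file again.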

== Utilities ==
These utilities are available once you have activated the environment. Some of them are also available without activating the environment, via symlinks/wrapper scripts: <tt>osci-manage</tt> and <tt>osci-view</tt>.


 * <tt>osci-check-connection</tt> - should be removed
   * checks the connection to the node and to the xenserver, then uploads logs from both to swift
 * <tt>osci-manage</tt>
   * what the citrix-ci service runs
   * runs the main lifecycle
   * can also manually add a job to the queue
 * <tt>osci-upload</tt>
   * uploads logs from a local host directory to swift
   * called from osci-manage (via python, not the cli)
 * <tt>osci-watch-gerrit</tt>
   * what the citrix-ci-gerritwatch service runs
   * reads the gerrit stream and adds jobs to the DB
 * <tt>osci-create-dbschema</tt>
   * creates the job DB
 * <tt>osci-run-tests</tt> - can be used to run the tests on a node
   * to print the commands that would be executed, use the "print" executor: <tt>osci-run-tests [print|exec] user host ref</tt>
   * to run the tests, use the "exec" executor: <tt>osci-run-tests exec jenkins 162.242.171.81 refs/changes/86/131386/1 openstack/nova</tt>
   * calls out to the host to run the tests
   * used (via python, not the cli) by osci-manage
 * <tt>osci-view</tt>
   * prints out the job DB in useful ways

To report status:
 * osci-view is called and the content is uploaded to swift
 * see <tt>/opt/osci/src/openstack-citrix-ci/upload-ci-status</tt>

= nodepool =

 * source code: https://github.com/citrix-openstack/nodepool
 * installed branch: <tt>2014-11</tt>
 * configuration file: <tt>/etc/nodepool/nodepool.yaml</tt>
 * a separate nodepool user runs this service.

== service: nodepool ==
Provisions VMs to use in the tests.
 * Logs: <tt>/var/log/nodepool/nodepool.log</tt>, <tt>/var/log/nodepool/debug.log</tt>
 * (Re)start: <tt>killall nodepool; rm /var/run/nodepool/nodepool.pid; start nodepool</tt>

== Utilities ==
These are used to get information from / control the nodepool service.
 * To list the images: <tt>nodepool image-list</tt>
 * To see how it's all doing: <tt>nodepool list</tt>
 * To claim a node: <tt>nodepool hold id</tt>

= project-config =
These scripts are used by nodepool to prepare the nodes. Note that nodepool's configuration refers to the location of these scripts.
 * source code: https://github.com/citrix-openstack/project-config
 * branch: <tt>xenserver-ci</tt>
 * cloned to: <tt>/root/src/project-config</tt>
 * to update these scripts, you don't need to restart any services

= Useful commands =

 * <tt>osci-view list</tt>: shows the current queue, what is running, etc. It shouldn't contain jobs older than 2 hours unless they are 'Finished'.
 * <tt>nodepool list</tt>: lists the currently available nodes. Some nodes should be 'Ready' or 'Building'.
 * <tt>eval `ssh-agent`; ssh-add ~/.ssh/citrix_gerrit; osci-manage -c 12345/1; ssh-agent -k</tt>: queue job 12345, patchset 1

= VM lifecycle =

 * Queued -> Running: citrix-ci has got a new node from nodepool (<tt>nodepool list</tt> will show it as 'held') and runs osci-run-tests to hand over to xenapi-os-testing
 * Running -> Collecting: the job has finished; citrix-ci has changed the state to Collecting and is waiting on the log collection thread
 * Collecting -> Collected: the log collection thread has posted the logs to swift and updated the job with the logs URL
 * Collected -> Finished: citrix-ci has posted to gerrit and the job is now complete
 * -> Obsolete: a new job for the same change (recheck or new patchset) has been added
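The transitions above can be written down as a small table, which makes illegal state changes easy to spot. This is an illustrative sketch, not the actual osci code; in particular it assumes any non-Finished state can become Obsolete, which the notes above imply but do not state outright.

```python
# Allowed state transitions for a job, taken from the lifecycle notes above.
TRANSITIONS = {
    "Queued": {"Running", "Obsolete"},
    "Running": {"Collecting", "Obsolete"},
    "Collecting": {"Collected", "Obsolete"},
    "Collected": {"Finished", "Obsolete"},
}

def advance(state, new_state):
    """Move a job to new_state, refusing illegal transitions."""
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError("illegal transition: %s -> %s" % (state, new_state))
    return new_state
```

For example, <tt>advance("Queued", "Running")</tt> succeeds, while jumping straight from Queued to Finished raises an error.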

= Code =

 * xenapi-os-testing: http://git.openstack.org/cgit/stackforge/xenapi-os-testing/ (although we are currently using https://github.com/citrix-openstack/xenapi-os-testing)
   * the actual job runner; downloaded by <tt>openstack-citrix-ci/osci/tests/test_instructions.py</tt>
 * citrix-ci: https://github.com/citrix-openstack/openstack-citrix-ci
   * the workflow manager
 * devstack-gate: TODO - move to upstream, but for now we use https://github.com/citrix-openstack/devstack-gate
 * install-nodepool: TODO - move to upstream, but for now we use https://github.com/citrix-openstack/install-nodepool

= TODO list =

 * start using the stackforge repo: http://git.openstack.org/cgit/stackforge/xenapi-os-testing/
 * reduce the number of excluded tests, using the above stackforge integration
 * stop forking infra projects, where possible
 * consider moving to zuul; talk to the turbo hipster folks
 * create a new environment to test out config changes
 * create a system to construct dev environments
 * trial neutron-based tests (with quark?)
 * BobBall to send johnthetubaguy an example of how to deploy the Main Box