
Difference between revisions of "XenServer/XenServer CI/AdminTips"

=== nodepool ===
* source code: https://github.com/citrix-openstack/nodepool
* branch: master
* installed branch: <tt>2014-11</tt>
* cloned to: /root/src/nodepool
* configuration file: <tt>/etc/nodepool/nodepool.yaml</tt>
* a separate nodepool user runs this service.
* was initially deployed with: https://github.com/citrix-openstack/install-nodepool/blob/master/inp/osci_installscript.sh
The ssh-keys given to created xenserver boxes (nodes) are configured in the jenkins job that runs the above script to create the xenserver-ci box.
==== service: nodepool ====

Revision as of 22:17, 2 December 2014


How can I get the list of job IDs (and the URLs for the logs) for a specific test case? For example, I would like to see all the jobs that failed in tempest.api.compute.servers.test_server_actions.ServerActionsTest.test_reboot_suspended_server_hard




The nodepool service launches instances in the Rackspace cloud and runs the XenServer-related scripts (https://github.com/citrix-openstack/project-config/tree/xenserver-trusty/nodepool/scripts) on them. The responsibilities of those scripts are:

After nodepool has run those scripts, it takes a snapshot of the node so that further instances can be launched more quickly. Unfortunately, upstream nodepool is not prepared for this lifecycle, so we run a custom version of nodepool.

The second service is citrix-ci, which gets nodes from the pool and runs https://github.com/stackforge/xenapi-os-testing/blob/master/run_tests.sh on them. This script customises the cached version of devstack-gate.

The third service is citrix-ci-gerritwatch, which listens for gerrit events and communicates with citrix-ci.

Main box

This is the instance which orchestrates the execution of tests.

  • Authentication via ssh keys. For access, join the XenAPI team meeting on IRC.
  • All the components are deployed to this instance.
  • Created with scripts in infrastructure.hg/osci (a Citrix-internal repository)
  • IP address is also stored within the repository


  • to update:
    • Use the scripts provided at infrastructure.hg/osci
  • configuration file: /etc/osci/osci.config

service: citrix-ci-gerritwatch

This service watches the gerrit stream and adds jobs to the queue

  • Logs: /var/log/osci/citrix-ci-gerritwatch.log
  • (Re)start: (re)start citrix-ci-gerritwatch
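
As a rough illustration of what watching the gerrit stream involves, the sketch below parses one line of gerrit stream-events output (one JSON object per line) and turns a patchset-created event into a job description. The function name, the job dictionary, and the project filter are assumptions made for illustration; they are not taken from the osci source, and the real service may react to other event types as well.

```python
import json

# Hypothetical filter: the projects this CI should react to.
WATCHED_PROJECTS = {"openstack/nova"}


def job_from_event(line):
    """Return a job dict for a patchset-created event, or None otherwise."""
    event = json.loads(line)
    if event.get("type") != "patchset-created":
        return None
    project = event["change"]["project"]
    if project not in WATCHED_PROJECTS:
        return None
    return {
        "project": project,
        "change": event["change"]["number"],
        "patchset": event["patchSet"]["number"],
        "ref": event["patchSet"]["ref"],
    }


# A trimmed-down sample event, built by hand for this example.
sample = json.dumps({
    "type": "patchset-created",
    "change": {"project": "openstack/nova", "number": "131386"},
    "patchSet": {"number": "1", "ref": "refs/changes/86/131386/1"},
})
print(job_from_event(sample))
```

Events of any other type, or for projects outside the watched set, yield None and would simply be skipped.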

service: citrix-ci

This service progresses jobs through the lifecycle (see below)

  • Logs: /var/log/osci/citrix-ci.log
  • (Re)start: (re)start citrix-ci
  • three threads:
    • main (run through state machine, see below)
    • collect results (long blocking process, so it's split out so that the main thread keeps making progress -- service net?)
    • delete node (once the job has gone to Collected; delete is a pain, it retries for up to 10 mins, need to fix the delete process!)
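
The reason for splitting result collection out of the main thread can be shown with a small sketch: the main loop hands finished jobs to a worker through a queue and carries on, while the worker does the slow part. All names here are hypothetical, and the short sleep stands in for the real log collection.

```python
import queue
import threading
import time

to_collect = queue.Queue()
collected = []


def collect_results():
    """Worker: the slow, blocking collection runs here, off the main thread."""
    while True:
        job = to_collect.get()
        if job is None:          # sentinel: shut the worker down
            break
        time.sleep(0.01)         # stand-in for a long-running log download
        collected.append(job)


worker = threading.Thread(target=collect_results)
worker.start()

# Main thread: hand jobs over and keep going without waiting for collection.
for job_id in (1, 2, 3):
    to_collect.put(job_id)

to_collect.put(None)
worker.join()
print(collected)
```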


  • osci-check-connection
    • checks the connection to the node, then to the xenserver, then that logs can be uploaded to swift from both
  • osci-manage
    • what citrix-ci service runs
    • runs main lifecycle
    • can also manually add a job to the queue
  • osci-upload
    • upload logs from local host directory up to swift
    • called from osci-manage (via python not cli)
  • osci-watch-gerrit
    • what citrix-ci-gerritwatch runs
    • reads gerrit stream and adds jobs to DB
  • osci-create-dbschema
    • creates job DB
  • osci-run-tests - can be used to run the tests on a node.
    • To print out what commands would be executed, use the "print" executor:
      osci-run-tests [print|exec] user host ref
    • To run the tests, use the "exec" executor:
      osci-run-tests exec jenkins refs/changes/86/131386/1 openstack/nova
    • calls out to host to run tests
    • used (via python not cli) by osci-manage
  • osci-view
    • prints out the job DB in useful ways
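
The "print" and "exec" executors of osci-run-tests follow a common dry-run pattern: the same list of commands is either printed or actually run. The sketch below shows the pattern only; the function names and the example commands are made up and are not the commands osci-run-tests really issues.

```python
import subprocess


def print_executor(command):
    """Dry run: show the command instead of running it."""
    line = " ".join(command)
    print(line)
    return line


def exec_executor(command):
    """Run the command and return its exit status."""
    return subprocess.call(command)


EXECUTORS = {"print": print_executor, "exec": exec_executor}


def run_all(executor_name, commands):
    """Apply the chosen executor to every command, in order."""
    executor = EXECUTORS[executor_name]
    return [executor(command) for command in commands]


# Hypothetical command list, for illustration only.
commands = [["echo", "hello"], ["echo", "world"]]
results = run_all("print", commands)
```

Keeping the command list separate from the executor means a dry run and a real run are guaranteed to cover the same steps.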

To report status:

  • osci-view is called and content uploaded to swift
  • /opt/osci/src/openstack-citrix-ci/upload-ci-status


service: nodepool

Provisions VMs to use in the tests

  • Logs: /var/log/nodepool/nodepool.log, /var/log/nodepool/debug.log
  • (Re)start: killall nodepool; rm /var/run/nodepool/nodepool.pid; start nodepool


Use the nodepool command to get information about, or to control, the nodepool service.

  • To list the images:
      nodepool image-list
  • To see how it's all doing:
      nodepool list
  • To claim a node:
      nodepool hold id


These scripts are used by nodepool to prepare the nodes. Note that nodepool's configuration refers to the location of these scripts.

Useful commands

  • osci-view list: Gives current queue, what is running etc. Shouldn't have jobs in here that are 'older' than 2 hours unless they are 'Finished'.
  • nodepool list: Gives a list of the currently available nodes. Should have some nodes that are 'Ready' or 'Building'
  • eval `ssh-agent`; ssh-add ~/.ssh/citrix_gerrit; osci-manage -c 12345/1; ssh-agent -k: Queue job 12345, patchset 1
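
The two-hour rule of thumb for osci-view list can be turned into a simple check. The sketch assumes job records are available as dictionaries with a state and a queued timestamp; that record format is hypothetical and is not what osci-view prints.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=2)


def stale_jobs(jobs, now):
    """Return jobs older than MAX_AGE that are not yet Finished."""
    return [
        job for job in jobs
        if job["state"] != "Finished" and now - job["queued"] > MAX_AGE
    ]


# Made-up sample data for the example.
now = datetime(2014, 12, 2, 22, 0)
jobs = [
    {"id": 1, "state": "Running", "queued": now - timedelta(hours=3)},
    {"id": 2, "state": "Finished", "queued": now - timedelta(hours=5)},
    {"id": 3, "state": "Queued", "queued": now - timedelta(minutes=30)},
]
stale_ids = [job["id"] for job in stale_jobs(jobs, now)]
print(stale_ids)
```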

VM lifecycle

  • Queued -> Running: citrix-ci job has got a new node from nodepool (nodepool list will show it as 'held') and runs osci-run-tests to hand over to xenapi-os-testing
  • Running -> Collecting: Job has finished; citrix-ci has changed state to Collecting - waiting on log collection thread
  • Collecting -> Collected: Log collection thread has posted logs to swift and updated job with logs URL
  • Collected -> Finished: Citrix-ci has posted to gerrit and the job is now complete
  • <anything> -> Obsolete: a new job for the same change (recheck or new patchset) has been added
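
The lifecycle above can be read as a small state machine. The following sketch encodes the listed transitions; the table and the functions are illustrative only and do not reflect how osci-manage stores or advances job state.

```python
# Normal forward transitions, in the order listed above.
NEXT_STATE = {
    "Queued": "Running",
    "Running": "Collecting",
    "Collecting": "Collected",
    "Collected": "Finished",
}


def advance(state):
    """Move a job one step along the normal lifecycle."""
    if state not in NEXT_STATE:
        raise ValueError("no forward transition from %r" % state)
    return NEXT_STATE[state]


def obsolete(state):
    """Any job becomes Obsolete when a newer job for the same change arrives."""
    return "Obsolete"


# Walk a job through the whole lifecycle.
state = "Queued"
path = [state]
while state in NEXT_STATE:
    state = advance(state)
    path.append(state)
print(" -> ".join(path))
```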


TODO list

  • start using stackforge http://git.openstack.org/cgit/stackforge/xenapi-os-testing/
  • reduce number of excluded tests, using above stackforge integration
  • stop forking infra projects, where possible
  • consider moving to zuul, talk to turbo hipster folks
  • create new environment to test out config changes
  • create system to construct dev environments
  • trial out neutron based tests (with quark?)
  • BobBall to send johnthetubaguy an example on how to deploy the Main Box