Stackmonkey

We want this tool to simulate “unexpected” negative scenarios and outages in a cloud deployment: the kinds of failures that happen completely at random.

Examples:

Physical Node level:

a. Node shutdown, reboot, or power outage. How do the other service nodes behave? What happens when a compute cluster or API server node goes down? Are requests rolled back? (Use OpenIPMI or Cobbler's power management features.)
b. Node network interface is down, the cable is unplugged, or there are network port problems (host firewall rules, port forwarding enabled/disabled, etc.). Use shell commands.
c. Node memory utilization crosses thresholds: swap space is exhausted, the kernel panics, memory leaks, or memory is unavailable to Nova services. (Use swapon/swapoff.)
d. Node CPU utilization is high, slowing down instance creation, Nova API responses, and VM instance responsiveness. Possibly run some CPU-intensive scripts on the node.
e. Node disk is full.
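
As an illustration of item d, here is a minimal sketch of a CPU-intensive script that could be run on a node. The function names `burn_cpu` and `start_cpu_load` are hypothetical and not part of any existing tool; the sketch assumes a Linux node with Python 3 and bounds the load by a duration so the node recovers on its own.

```python
import multiprocessing
import time


def burn_cpu(duration_s: float) -> int:
    """Spin in a tight loop for roughly duration_s seconds; return the loop count."""
    deadline = time.monotonic() + duration_s
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def start_cpu_load(workers: int, duration_s: float) -> list:
    """Start one busy-loop process per worker and return the process handles."""
    procs = [
        multiprocessing.Process(target=burn_cpu, args=(duration_s,))
        for _ in range(workers)
    ]
    for proc in procs:
        proc.start()
    return procs
```

Calling `start_cpu_load(multiprocessing.cpu_count(), 30.0)` and then `join()` on each returned handle would keep every core busy for about 30 seconds. Separate processes are used rather than threads because CPython threads share one interpreter lock and would not saturate more than one core.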

Virtualization level (hypervisor):

a. KVM (or other hypervisor) service/daemon crashes. How does this affect running instances, new instance creation, etc.?
b. Libvirt API calls do not respond, or functionality does not work (pause, resume, boot, etc.).
c. KVM networking issues: cannot create a bridge, /etc/network/ configuration is garbled.

Service and Process level:

a. Nova binary processes stop unexpectedly (api, network, compute, volume, etc.). Kill a Nova process.
b. Zombie processes.
c. Kill dependent processes: NTP server, RabbitMQ server, MySQL or any other database server. A dependent package is removed by mistake (greenlet, rabbitmq, etc.).
d. Nova services restart continuously in a loop.
e. Glance server goes down during instance creation.
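
Items a and c amount to finding a process and killing it without warning. A minimal sketch, assuming a POSIX host where the `pgrep` utility is installed; the helper names `find_pids` and `kill_process` are hypothetical, and the exact command-line pattern that matches a given Nova service is an assumption to verify on the target node.

```python
import os
import signal
import subprocess


def find_pids(pattern: str) -> list:
    """Return PIDs whose command line matches pattern, using `pgrep -f`."""
    result = subprocess.run(
        ["pgrep", "-f", pattern], capture_output=True, text=True, check=False
    )
    # pgrep prints one PID per line and prints nothing when no process matches.
    return [int(line) for line in result.stdout.split()]


def kill_process(pid: int, sig: int = signal.SIGKILL) -> None:
    """Send sig to pid. SIGKILL cannot be caught, so it simulates a hard crash."""
    os.kill(pid, sig)
```

For example, `for pid in find_pids("nova-compute"): kill_process(pid)` would terminate every matching process, provided the caller has permission to signal it. Passing `signal.SIGTERM` instead would exercise the graceful shutdown path rather than a crash.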

API hacking (monkey patch, etc.):

a. Randomly delete/terminate instances in Nova.
b. Try to create/spawn too many instances.
c. Send too many API requests to Compute.
d. Flood RabbitMQ.
e. Exceed the rate limit and absolute limit thresholds for POST/PUT/GET/DELETE.
f. API request data gets corrupted or garbled before reaching the API server.
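
Items c and e could be driven by a small request flooder. The sketch below is only an illustration: `flood` takes any callable that performs one request and returns a status code, and `FakeRateLimitedApi` is a hypothetical stand-in for the API server so the example runs without a deployment. The status codes 200 and 429 are placeholders, and the real service may report over-limit conditions differently.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def flood(send_request, total: int, concurrency: int) -> Counter:
    """Call send_request() total times from a thread pool; tally the status codes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send_request) for _ in range(total)]
        return Counter(future.result() for future in futures)


class FakeRateLimitedApi:
    """Hypothetical stand-in for an API server: accepts `limit` calls, then rejects."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.calls = 0
        self._lock = threading.Lock()

    def __call__(self) -> int:
        with self._lock:
            self.calls += 1
            # Placeholder codes: 200 for accepted, 429 for over the limit.
            return 200 if self.calls <= self.limit else 429


api = FakeRateLimitedApi(limit=10)
print(flood(api, total=50, concurrency=8))
```

With the fake server above, 10 of the 50 requests are accepted and 40 are rejected. Against a real deployment, `send_request` would wrap an actual HTTP call, and the tally shows at what point the limits start to apply.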

Miscellaneous:

a. Delete key-pair/cert files.
b. Corrupt key-pair/cert files.
g. Delete data in the images folder in Nova (running image data).
h. Delete data in the networks folder in Nova (running network information).
i. Volume full: write many files to the volume.
j. Delete an image in Glance.
k. Delete virtual network information in Nova-network/Quantum.
l. Corrupt the configuration data (nova.conf/keystone.conf).
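
The corruption scenarios (items b and l) could be sketched as random byte damage applied to a copy of the file. The function name `corrupt_copy` and its parameters are hypothetical. The sketch writes to a separate target path so the original stays available for restoring the node afterwards, and it uses a seeded random generator so a run that exposes a bug can be repeated.

```python
import random
import shutil
from pathlib import Path


def corrupt_copy(source: str, target: str, n_bytes: int, seed: int) -> list:
    """Copy source to target, then alter n_bytes distinct bytes of target.

    Returns the sorted byte offsets that were changed. Raises ValueError
    if n_bytes is larger than the file size.
    """
    shutil.copyfile(source, target)
    data = bytearray(Path(target).read_bytes())
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    offsets = sorted(rng.sample(range(len(data)), n_bytes))
    for offset in offsets:
        # XOR with a non-zero mask guarantees the byte actually changes.
        data[offset] ^= rng.randrange(1, 256)
    Path(target).write_bytes(bytes(data))
    return offsets
```

A call such as `corrupt_copy("nova.conf", "nova.conf.corrupt", 5, seed=1)` produces a damaged copy that can then be swapped in for the real file while the service is restarted.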