Why and how to use Intel Node Manager in Openstack
This Wiki is used to show how we can use Intel Node Manager in Openstack.
First, it will give a brief introduction to Intel Node Manager; then it will describe how we can leverage these features in Openstack.
Introduction of Node Manager
Intel Node Manager is a server management technology that allows management software to accurately monitor and control the platform’s power and thermal behaviors through industry defined standards: Intelligent Platform Management Interface(IPMI) and Datacenter Manageability Interface (DCMI).
It allows the administrator of IT to monitor the power and thermal behaviors of servers and make reasonable operation or policy according to the real-time power/thermal status of data center. Using Intel Node Manager,the power and thermal information of these servers can be used to improve overall data center efficiency and maximize overall data center usage. Data center managers can maximize the rack density with the confidence that rack power budget will not be exceeded. During a power or thermal emergency, Intel Node Manager can automatically limit server power consumption and send alert to administrators via the pre-defined policy.
The main features of Intel Node Manager can be described as following.
1) Monitor power/temperature.(Trying to implemented in Openstack and patches are ready, please see the following link in next section)
- Collect the power/thermal information via IPMI command. It also can collect data of each subsystem, such as CPU, Memory or I/O.
2) Make policy based on power/thermal.(It is not be implemented yet. You can use the 3rd party software or Intel software to do the policy stuffs)
- Pre-defined policy can be made based on power/thermal data.
- We can set power budget for the server, if power exceeds the threshold, frequency of CPU will decrease to bring power consumption down. (It is also called "Power Capping").
- Policy also can be made for the power/thermal emergency, if the data exceed threshold, the pre-defined operation will executed, such as shut down server or decrease CPU frequency or send alert, etc.
How to use Node Manager in Openstack
One of reasonable usage of Node Manager in Openstack is for Nova, nova-scheduler. For now, nova collects some information for scheduler, CPU/Memory. But that may be not enough. Power/temperature can also be used to describe the status of a compute node and Filter/Weight can be made according to the power/temperature information.
For example, when scheduler need to choose one compute node to launch new instance and the power of the nodes are collected as 100Watts, 120Watts and 150Watts. To balance the power consumption of all nodes, a filter can be made to remove the node with highest power; or to save power, remove the lowest node. That depends on users.
Another example is that users can set the power threshold for the servers in the data center, 200Watts, for example, then the power consumption higher than 200 is filtered when scheduler choose the available nodes.
Weight can be made for the scheduler via power information. To balance power consumption, higher power node gets lower weight, or higher power node gets higher weight to save power.
The temperature information can be used for the scheduler in the same manner.
To enable above described usage, we can add node manager monitors in nova-compute. We call IPMI command locally to get power and temperature. Here are the patches.
Another usage is for ceilometer. Power/temperature status of servers or whole data center can be finding out in ceilometer. (More details is needed)
Policy can be made ad load from configure file(Just take an example, other solutions can be provide either) and threshold/operations can be set. When the power/temperature exceed the threshold, alert message will be send out and showed in the UI of ceilometer or dashboard. Pre-defined operation will be adopted also.