Difference between revisions of "EfficientMetering/FutureNovaInteractionModel"

 
Latest revision as of 23:29, 17 February 2013

Future ceilometer/nova interaction model

A discussion is ongoing (http://lists.openstack.org/pipermail/openstack-dev/2012-November/002791.html) on the openstack-dev mailing list, with a view to reaching a consensus with the nova domain experts on the best (most stable/supportable) approach for ceilometer to interact with nova going forward.

1. Extend the existing os-server-diagnostics API extension to expose any additional stats that ceilo needs.

  + would allow the ceilo compute agent to be scaled independently of the nova-compute node (i.e. no need for a 1:1 correspondence)
  - the diagnostics returned are currently hypervisor-specific
  - the additional nova-api-->nova-compute RPC call would add lag and impact timeliness for metrics gathering
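
To illustrate the hypervisor-specific concern listed above, here is a minimal Python sketch of how a consumer might normalize per-hypervisor diagnostics into common meter names. The hypervisor labels, key names, mapping table and the `normalize_diagnostics` helper are all hypothetical, chosen for illustration only; none of them is taken from nova or from any hypervisor driver.

```python
# Hypothetical mapping from hypervisor-specific keys to common meter names.
# Neither the hypervisor labels nor the keys come from a real driver.
KEY_MAPS = {
    "hypervisor_a": {"cpu_time_ns": "cpu", "mem_kb": "memory"},
    "hypervisor_b": {"cpuUsage": "cpu", "memUsage": "memory"},
}


def normalize_diagnostics(hypervisor, raw):
    """Translate a raw diagnostics dict into common meter names.

    Keys without a known mapping are dropped rather than guessed at.
    """
    try:
        key_map = KEY_MAPS[hypervisor]
    except KeyError:
        raise ValueError("no mapping for hypervisor %r" % hypervisor)
    return {key_map[k]: v for k, v in raw.items() if k in key_map}
```

A table like this would have to be maintained for every supported hypervisor, which is the maintenance cost this drawback points at.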

2. Call the nova get_diagnostics RPC directly (as per the experimental patch proposed by Yunhong Jiang, https://review.openstack.org/15952), or use a new RPC message specifically designed for this purpose.

  +/- as for #1, but also removes the lag involved in an additional hop between nova services
  - calling RPC directly would expose ceilo to a much less stable (i.e. rapidly rev'd) API than would be the case for #1

3. Have nova itself emit metering messages directly onto the ceilo message bus, encompassing both lifecycle events and usage stats, to be picked up and persisted by the ceilo collector or other agent.

  - leaks ceilo concerns into nova
  - requires message bus usage, probably inappropriate for time-sensitive measurements feeding into near-realtime metrics

4. Invert control and have the nova compute service itself call into a ceilo-provided API that abstracts the conduit used for publication (this could be the message bus, UDP, or a direct call to a CW API).

  - a loaded nova compute service may fall behind in this periodic task, especially if the reporting cadence is configured high
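
As a rough sketch of what "abstracting the conduit" in option 4 could look like, the Python below defines a small publisher interface with two interchangeable implementations. Every name here (`Publisher`, `UDPPublisher`, `InMemoryPublisher`, `report_usage`) and the JSON sample layout are assumptions made for illustration; this is not an existing ceilometer or nova API.

```python
import json
import socket


class Publisher:
    """Hypothetical interface hiding the conduit used for publication."""

    def publish(self, sample):
        raise NotImplementedError


class UDPPublisher(Publisher):
    """Fire-and-forget publication of JSON-encoded samples over UDP."""

    def __init__(self, host, port):
        self._addr = (host, port)
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def publish(self, sample):
        self._sock.sendto(json.dumps(sample).encode("utf-8"), self._addr)


class InMemoryPublisher(Publisher):
    """Stand-in for a message-bus conduit; it just records samples."""

    def __init__(self):
        self.samples = []

    def publish(self, sample):
        self.samples.append(sample)


def report_usage(publisher, instance_id, cpu_time):
    """What the compute service would call; it never sees the conduit."""
    publisher.publish({"instance": instance_id, "cpu_time": cpu_time})
```

The caller only depends on `publish()`, so swapping the message bus for UDP or a direct API call would be a configuration choice rather than a code change in the compute service.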

4a. Rename ceilometer-compute-agent to nova-compute-metering and move it into nova along with its pollster. Have it use the multi-publisher code from Ceilometer so that it can publish to a variety of destinations (ceilometer-collector, CW…) according to configuration, polling on an interval that is configured via the publisher (as already discussed on the multi-publisher blueprint, https://blueprints.launchpad.net/ceilometer/+spec/multi-publisher).

  + no request/reply round trip (unlike options #1 and #2)
  + maintained by nova, so doesn't break
  + no need for hypervisor-specific code, as it is possible to abstract
  + no lag
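
The polling-and-fan-out arrangement described in option 4a might be sketched as follows. This is a simplified illustration in Python, not the multi-publisher code itself: the `PollingAgent` class is hypothetical, and it uses a single agent-wide interval where the proposal has the interval configured via each publisher.

```python
import time


class PollingAgent:
    """Polls a pollster and fans samples out to every configured publisher."""

    def __init__(self, pollster, publishers, interval, sleep=time.sleep):
        self._pollster = pollster      # callable returning a list of samples
        self._publishers = publishers  # objects with a publish(sample) method
        self._interval = interval      # seconds between polling cycles
        self._sleep = sleep            # injectable so tests need not wait

    def run_once(self):
        """Run one polling cycle: every sample goes to every publisher."""
        for sample in self._pollster():
            for publisher in self._publishers:
                publisher.publish(sample)

    def run(self, cycles):
        """Run a fixed number of cycles, sleeping between them."""
        for i in range(cycles):
            self.run_once()
            if i < cycles - 1:
                self._sleep(self._interval)
```

Because samples are pushed out as soon as they are polled, there is no request/reply exchange between the metering daemon and its consumers.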

5. nova packages a consumable library layered over the hypervisor driver, that just exposes the diagnostics available from libvirt et al. The ceilo compute agent continues to exist under the ceilo umbrella, but talks to the hypervisor directly via this stable, versioned nova library.

  + no remote calls required from ceilo-->nova-{api|compute}
  - needs an independent versioning scheme
  - still stuck in the "implicit trust" model?
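
One way to picture the "stable, versioned library" of option 5, and the independent versioning scheme it would need, is the Python sketch below. The `API_VERSION` constant, `require_version` and the `Diagnostics` facade are hypothetical names; the driver object is a stand-in that is only assumed to offer a `get_diagnostics(instance_name)` method returning a mapping.

```python
API_VERSION = (1, 0)  # hypothetical (major, minor) version of the library


def require_version(major, minor=0):
    """Fail fast if the library cannot satisfy the caller's expectations.

    The major version must match exactly; the library's minor version
    must be at least the one requested.
    """
    if major != API_VERSION[0] or minor > API_VERSION[1]:
        raise RuntimeError(
            "need API %d.%d, library provides %d.%d"
            % ((major, minor) + API_VERSION))


class Diagnostics:
    """Thin facade over a hypervisor driver (a stand-in object here)."""

    def __init__(self, driver):
        self._driver = driver

    def get(self, instance_name):
        # Copy the result so callers cannot mutate driver-owned state.
        return dict(self._driver.get_diagnostics(instance_name))
```

An agent would call `require_version(1)` at start-up and then use `Diagnostics(driver).get(name)` locally, with no remote call into nova-api or nova-compute.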

The discussion has not yet reached a definitive conclusion, but there was clear push-back (http://lists.openstack.org/pipermail/openstack-dev/2012-November/002806.html) from the nova domain expert on direct use of nova RPC by the ceilo agent (as this is considered an internal API). We await further feedback from the nova team on their attitude to accepting the ceilometer compute agent into nova as a separate daemon to run on nova compute nodes.