Difference between revisions of "EfficientMetering/FutureNovaInteractionModel"

Revision as of 16:49, 20 November 2012

Future ceilometer/nova interaction model

A discussion is [http://lists.openstack.org/pipermail/openstack-dev/2012-November/002791.html ongoing] on the openstack-dev mailing list, with a view to reaching a consensus with the nova domain experts on the best (most stable/supportable) approach for ceilometer to interact with nova going forward.

1. Extend the existing os-server-diagnostics API extension to expose any additional stats that ceilo needs.

  + would allow the ceilo compute agent to be scaled independently of the nova-compute node (i.e. no need for a 1:1 correspondence)
  - the diagnostics returned are currently hypervisor-specific
  - the additional nova-api-->nova-compute RPC call would add lag and impact timeliness for metrics gathering
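
To illustrate the "hypervisor-specific" drawback, here is a minimal sketch of the kind of normalization ceilo would need on top of the diagnostics payload. The key patterns (`cpu0_time`, `vda_read`, `vnet0_rx`, `memory`) are an assumption about what a libvirt-backed call might return, and `normalize_libvirt_diagnostics` and the output field names are hypothetical; other hypervisors would need their own mapping.

```python
import re


def normalize_libvirt_diagnostics(raw):
    """Fold libvirt-style diagnostics keys into a hypervisor-neutral dict.

    The key patterns below are assumed for illustration only; they are
    not a documented contract of the diagnostics extension.
    """
    stats = {
        "cpu_time_ns": 0,
        "memory_kb": None,
        "disk_read_bytes": 0,
        "net_rx_bytes": 0,
    }
    for key, value in raw.items():
        if re.fullmatch(r"cpu\d+_time", key):
            stats["cpu_time_ns"] += value      # sum over all vCPUs
        elif key == "memory":
            stats["memory_kb"] = value
        elif re.fullmatch(r"[a-z]+_read", key):
            stats["disk_read_bytes"] += value  # sum over all disks
        elif re.fullmatch(r"[a-z]+\d+_rx", key):
            stats["net_rx_bytes"] += value     # sum over all interfaces
        # Anything else is ignored: it has no neutral equivalent here.
    return stats


sample = {
    "cpu0_time": 100,
    "cpu1_time": 50,
    "memory": 524288,
    "vda_read": 4096,
    "vnet0_rx": 2048,
    "vda_errors": -1,
}
normalized = normalize_libvirt_diagnostics(sample)
```

Each additional hypervisor would add another function of this shape, which is the maintenance cost the second bullet points at.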

2. Call the nova get_diagnostics RPC directly (as per the experimental [https://review.openstack.org/15952 patch] proposed by Yunhong Jiang), or use a new RPC message specifically designed for this purpose.

  +/- as for #1, but also removes the lag involved in an additional hop between nova services
  - calling RPC directly would expose ceilo to a much less stable (i.e. rapidly rev'd) API than would be the case for #1
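
The stability concern can be made concrete with a small sketch of a major/minor compatibility check of the kind commonly used for versioned RPC interfaces. That nova's RPC layer applies exactly this rule is an assumption here, and `version_is_compatible` is an illustrative helper, not a nova function.

```python
def version_is_compatible(implemented, requested):
    """Return True if a caller pinned to `requested` can talk to a
    service implementing `implemented`.

    Assumed rule: the major versions must match, and the service must
    implement at least the requested minor version.
    """
    imp_major, imp_minor = (int(part) for part in implemented.split("."))
    req_major, req_minor = (int(part) for part in requested.split("."))
    return imp_major == req_major and req_minor <= imp_minor


# A ceilo agent written against RPC version 2.3:
works_after_minor_bump = version_is_compatible("2.10", "2.3")
works_after_major_bump = version_is_compatible("3.0", "2.3")
```

Under this rule, an agent outside the nova tree keeps working across minor revisions but breaks on every major revision of the internal RPC API, which is more frequent than a revision of the public REST extension.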

3. Have nova itself emit metering messages directly onto the ceilo message bus, encompassing both lifecycle events and usage stats, to be picked up and persisted by the ceilo collector or other agent.

  - leaks ceilo concerns into nova
  - requires message bus usage, probably inappropriate for time-sensitive measurements feeding into near-realtime metrics.
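
A rough sketch of this flow, with an in-process `queue.Queue` standing in for the real message bus. The message fields and the `make_metering_message` and `collect` helpers are hypothetical; they are not ceilometer's actual wire format.

```python
import json
import queue


def make_metering_message(instance_id, counter_name, volume, timestamp):
    """Build a metering message as nova might emit it (assumed fields)."""
    return json.dumps(
        {
            "resource_id": instance_id,
            "counter_name": counter_name,
            "counter_volume": volume,
            "timestamp": timestamp,
        },
        sort_keys=True,
    )


def collect(bus, store):
    """Drain the bus and persist each message, as a collector would."""
    while not bus.empty():
        store.append(json.loads(bus.get()))


bus = queue.Queue()  # stand-in for the ceilo message bus
bus.put(make_metering_message("instance-0001", "cpu", 150, "2012-11-20T16:49:00Z"))

persisted = []
collect(bus, persisted)
```

Note that nova has to know the message schema in order to emit it, which is the "leaks ceilo concerns into nova" point above.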

4. Invert control and have the nova compute service itself call into a ceilo-provided API that abstracts the conduit used for publication (could be via the message bus, or UDP, or a direct call to a CW API).

  - a loaded nova compute service may fall behind in this periodic task, especially if the reporting cadence is configured high
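
One way to picture the "abstracts the conduit" idea is a small publisher interface that nova calls without knowing the transport. This is a sketch only: the class names, the `get_publisher` factory and its `conduit` values are all hypothetical, and an in-memory publisher stands in for the message-bus and CW variants.

```python
import abc
import json
import socket


class Publisher(abc.ABC):
    """What nova-compute would call; the transport is hidden behind it."""

    @abc.abstractmethod
    def publish(self, sample):
        """Send one sample (a JSON-serializable dict) to its destination."""


class UDPPublisher(Publisher):
    """Fire-and-forget datagrams: low latency, no delivery guarantee."""

    def __init__(self, host, port):
        self._addr = (host, port)
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def publish(self, sample):
        self._sock.sendto(json.dumps(sample).encode("utf-8"), self._addr)


class InMemoryPublisher(Publisher):
    """Stand-in for a bus- or CW-backed publisher; keeps samples in a list."""

    def __init__(self):
        self.samples = []

    def publish(self, sample):
        self.samples.append(sample)


def get_publisher(conduit, **options):
    """Pick a publisher from configuration (hypothetical option names)."""
    if conduit == "udp":
        return UDPPublisher(options["host"], options["port"])
    if conduit == "memory":
        return InMemoryPublisher()
    raise ValueError("unknown conduit: %s" % conduit)


publisher = get_publisher("memory")
publisher.publish({"resource_id": "instance-0001", "counter_name": "cpu", "counter_volume": 150})
```

The calling code in nova stays the same whichever conduit is configured, but the call still runs inside the nova compute service, so the drawback above about a loaded service falling behind is unchanged.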

4a. Rename ceilometer-compute-agent to nova-compute-metering and move it into nova along with its pollster. Have it use the multi-publisher code from Ceilometer so that it can publish to a variety of destinations (ceilometer-collector, CW…) according to configuration, and poll on an interval that is configured via the publisher (as already discussed on the [https://blueprints.launchpad.net/ceilometer/+spec/multi-publisher multi-publisher] blueprint).

  + no request/reply exchange (as required by options #1 and #2)
  + maintained within nova, so it doesn't break when nova changes
  + no need for hypervisor-specific code, possible to abstract
  + no lag
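
The polling agent described in 4a can be sketched as a loop that runs every pollster and fans each sample out to all configured publishers. This is a simplification under stated assumptions: `poll_once` and `run` are hypothetical names, pollsters and publishers are plain callables, and a single shared interval is used rather than one configured per publisher.

```python
import time


def poll_once(pollsters, publishers):
    """Run every pollster, then hand each sample to every publisher."""
    samples = [sample for pollster in pollsters for sample in pollster()]
    for publish in publishers:
        for sample in samples:
            publish(sample)
    return len(samples)


def run(pollsters, publishers, interval, cycles, sleep=time.sleep):
    """Poll `cycles` times, aiming for one poll every `interval` seconds."""
    for _ in range(cycles):
        started = time.monotonic()
        poll_once(pollsters, publishers)
        elapsed = time.monotonic() - started
        # Sleep only for what is left of the interval, never a negative time.
        sleep(max(0.0, interval - elapsed))


def fake_cpu_pollster():
    """Stand-in for a real pollster that would query the hypervisor."""
    return [{"resource_id": "instance-0001", "counter_name": "cpu", "counter_volume": 150}]


to_collector, to_cw = [], []  # stand-ins for ceilometer-collector and CW
run([fake_cpu_pollster], [to_collector.append, to_cw.append],
    interval=0.0, cycles=2, sleep=lambda seconds: None)
```

Because the daemon runs on the compute node next to the hypervisor, there is no request/reply hop before the samples reach the publishers.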

5. nova packages a consumable library layered over the hypervisor driver, that just exposes the diagnostics available from libvirt et al. The ceilo compute agent continues to exist under the ceilo umbrella, but talks to the hypervisor directly via this stable, versioned nova library.

  + no remote calls required from ceilo-->nova-{api|compute}
  - needs an independent versioning scheme
  - still stuck in the "implicit trust" model?
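
As a sketch of what such a library's surface might look like, the following defines a small versioned inspector interface. Everything here is hypothetical (`API_VERSION`, `Inspector`, `CPUStats`, `require_api`); a fake driver replaces the real hypervisor binding so the example runs on its own.

```python
import abc
import collections

# Version of the library's own interface, independent of nova releases.
API_VERSION = (1, 0)

CPUStats = collections.namedtuple("CPUStats", ["number", "time_ns"])


class Inspector(abc.ABC):
    """Hypervisor-neutral read-only view that the ceilo agent would code against."""

    @abc.abstractmethod
    def inspect_cpus(self, instance_name):
        """Return a CPUStats tuple for the named instance."""


class FakeInspector(Inspector):
    """Stand-in for a libvirt-backed driver; serves canned data."""

    def __init__(self, data):
        self._data = data

    def inspect_cpus(self, instance_name):
        number, time_ns = self._data[instance_name]
        return CPUStats(number=number, time_ns=time_ns)


def require_api(minimum):
    """Fail fast if the installed library cannot satisfy the caller."""
    if API_VERSION[0] != minimum[0] or API_VERSION < minimum:
        raise RuntimeError(
            "library API %s does not satisfy required %s" % (API_VERSION, minimum)
        )


require_api((1, 0))
inspector = FakeInspector({"instance-0001": (2, 150)})
cpu = inspector.inspect_cpus("instance-0001")
```

The explicit `API_VERSION` check is where the "independent versioning scheme" cost shows up: the library needs its own compatibility promise, separate from nova's release cycle.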

The discussion has not yet reached a definitive conclusion, but there was definite push-back from the nova domain expert on direct use of nova RPC by the ceilo agent, as this is considered an internal API. We await further feedback from the nova team on whether they would accept the ceilometer compute agent into nova as a separate daemon running on nova compute nodes.