
Difference between revisions of "Meetings/Neutron-DVR"

 
(168 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 +
The OpenStack Networking L3 DVR Sub-team holds public meetings as advertised on [http://eavesdrop.openstack.org/#Neutron_Distributed_Virtual_Router_Meeting OpenStack IRC Meetings Calendar]. If you are unable to attend, please check the most recent [http://eavesdrop.openstack.org/meetings/neutron_dvr/ logs.]
  
 
= Meetings =
 
= Meetings =
  
* Weekly on Wednesday at 1500 UTC
+
* Weekly on Thursday combined with L3 Meeting at 1500 UTC
* IRC channel: <code><nowiki>#openstack-meeting-alt</nowiki></code> on freenode
+
* IRC channel: <code><nowiki>#openstack-meeting-3</nowiki></code> on freenode
 
* Chair: Brian Haley (haleyb), Swaminathan Vasudevan (Swami)
 
* Chair: Brian Haley (haleyb), Swaminathan Vasudevan (Swami)
 
* Meetings, with their notes and logs, will be found under http://eavesdrop.openstack.org/meetings/neutron_dvr
 
* Meetings, with their notes and logs, will be found under http://eavesdrop.openstack.org/meetings/neutron_dvr
  
 
= Agenda =
 
= Agenda =
== Meeting December 9th, 2015 ==
+
 
 +
== Meeting June 29th, 2017 ==
  
 
=== Announcements (haleyb) ===
 
=== Announcements (haleyb) ===
 +
* When adding items below, I'd like feedback on whether each is a MUST FIX, SHOULD FIX, or GOOD TO HAVE, and whether it is NEEDS TRIAGE or TRIAGED
 +
 +
== Topics for Discussion ==
  
 
=== Bugs (Swami) ===
 
=== Bugs (Swami) ===
  
== New Bugs this week ==
+
All DVR bugs should be tagged and listed here: https://bugs.launchpad.net/neutron/+bugs?field.tag=l3-dvr-backlog
 +
 
 +
==== Bugs That Need to be Closed ====
 +
* None
 +
 
 +
==== New Bugs this week ====
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1701288 (NEEDS TRIAGE) - rpc loop timeout when l2pop calls _create_agent_fdb in large scale deployments.
 +
* https://bugs.launchpad.net/neutron/+bug/1698517 (CONFIRMED) (MUST FIX) - Issue is only seen in Newton and Ocata and not in master.
 +
** https://review.openstack.org/#/c/475108/
 +
* https://bugs.launchpad.net/neutron/+bug/1695140 (NEEDS TRIAGE) - Floatingip VM cannot ping to non floatingip VM. ( Might be due to the regression that was introduced).
 +
 
 +
==== High Priority Bugs in progress ====
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1682228 - Cross Address scopes traffic with floatingip limitation in DVR
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1682345 - Neutron Metering agent RPC need to send notification to specific hosts that are configured
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1672345 (NEEDS TRIAGE) - Symptoms look like it is a duplicate of 'allowed-address-pairs'
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1657981 (Needs Triage) - FloatingIP not reachable after the compute node restarts while the agent is down
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1644231 (TRIAGED) - fip router config is not created if the vm ports attached to FIPs have no device_owner
 +
** Needs help message update saying "compute:" prefix needed in ports manually-created for instances
 +
** https://review.openstack.org/#/c/425919/ (NEEDS REVIEW)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1606741 (SHOULD FIX) - Metadata error with dvr_snat on compute hosts.
 +
** Lowered to Medium
 +
** https://review.openstack.org/352686 (Expired)
 +
 
 +
==== Categorized Bugs ====
 +
 
 +
==== RFE ====
 +
* https://bugs.launchpad.net/neutron/+bug/1563879 (RFE) - DVR should route packets to Instances behind the L2 Gateway
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1577488 (RFE) - "Fast exit" for compute node egress flows when using DVR
 +
** https://review.openstack.org/#/c/283757/ (Merged)
 +
** https://review.openstack.org/#/c/355062/ (Needs Review)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1583694 (RFE) - DVR support for Allowed_address_pair port that are bound to multiple ACTIVE VM ports
 +
** https://review.openstack.org/#/c/437970/ (Needs review) - Server side patch
 +
** https://review.openstack.org/#/c/437986/ (Needs review) - Agent side patch
 +
 
 +
==== Existing Functionality Broken Bugs ====
 +
* https://bugs.launchpad.net/neutron/+bug/1526855 - (SHOULD FIX) - VMs fail to get metadata in large scale environments
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1447227 (GOOD TO HAVE) - Connecting two or more distributed routers to a subnet. (No one is working on it)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1552070 (GOOD TO HAVE) - Functional test changes to address dual stack for DVR
 +
 
 +
==== Scale and Performance Impact Bugs ====
 +
 
 +
==== New Features Bugs ====
 +
 
 +
==== Refactor or Cleanup Bugs ====
 +
* https://bugs.launchpad.net/neutron/+bug/1538369 - Refactor request (Not a real bug)
 +
 
 +
==== WishList Bugs ====
 +
* https://bugs.launchpad.net/neutron/+bug/1518819 - Ability to specify a gateway when adding a subnet without cidr (GOOD TO HAVE)
 +
* https://bugs.launchpad.net/neutron/+bug/1450067 (Low) - ML2 and L3 plugin exposes dvr extension even if OVS is unused. (No owners yet) (GOOD TO HAVE)
 +
* https://bugs.launchpad.net/neutron/+bug/1476469 (TRIAGE ME) - with DVR, a VM can't use floating IP and VPN at the same time
 +
** This is not a bug, since floating IP can't be used with the VPN
 +
 
 +
==== WatchList Bugs ====
 +
 
 +
==== Bugs Closed Recently====
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1632540 (NEEDS TRIAGE) - L3 agent logs fills up quick (stable/mitaka tag 8.1.2 - need more info on reproducer)
 +
** https://review.openstack.org/434863 (Merged) - this will fix some of the issue
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1504039 - LinuxBridge DVR (Spec & code) - Expired
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1620824 (TRIAGED) - Neutron DVR(SNAT) steals FIP traffic
 +
** https://review.openstack.org/#/c/366297/ (Merged) (Currently not using reference L2pop and using tcp_loose=0)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1612804 (MUST FIX) - test_shelve_instance fails with sshtimeout (One instance of failure seen in gate)
 +
** Closed since not seen recently
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1597461 (FIXED) - Two masters after reboot of controller when HA enabled. Also seen with DVR
 +
** https://review.openstack.org/357458 (merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1675187 (MUST FIX) - Floating IPs not removed on rfp interface in qrouter
 +
** https://review.openstack.org/#/c/451859/ (Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1647432 (SHOULD FIX) - Multiple SIGHUPs to keepalived might trigger re-election
 +
** https://review.openstack.org/#/c/407099/ (Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1653633 (EXPIRED) - fwaas v1 with DVR: l3 agent can't restore the NAT rules for floatingIP (Can't be reproduced in the master branch)
 +
** not sure it's FWaaS-specific
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1668277 (SHOULD FIX) - When enable l2pop, after deleting the DVR port, the associated flow entries still exists
 +
** https://review.openstack.org/#/c/438516 (Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1629539 (NEEDS TRIAGE) - Broken distributed virtual router w/ lbaas v1
 +
** Against Mitaka, expired
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1658060 (Needs triage) - FirewallNotFound exceptions when deleting the firewall in FWaaS-DVR
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1541406 - IPv6 Prefix Delegation does not work with DVR (SHOULD FIX)
 +
** https://review.openstack.org/#/c/277657/ (MERGED)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1571676 (Re-opened) - After binding a floating IP to VM, the static route can't work in DVR
 +
** https://review.openstack.org/#/c/308068/ (Patch Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1557290 (SHOULD FIX) - DVR not able to forward traffic to private IP from FIP namespace
 +
** Duplicate of Fast-exit RFE
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1654991 (SHOULD FIX) Allow all migration of routers
 +
** https://review.openstack.org/#/c/376550/ (Patch merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1506567 (SHOULD FIX) - No information from metering agent
 +
** https://review.openstack.org/377108 (Patch merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1644415 (SHOULD FIX) - DVR-SNAT mode with HA does not clean up the fip route rule in namespace
 +
** https://review.openstack.org/#/c/404571/ (Patch merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1414559 (OVS drops RARP packets)
 +
** in neutron https://review.openstack.org/#/c/246898/ (MERGED) Notify vif plugged event
 +
** and nova https://review.openstack.org/#/c/246910/ (not making progress) Notify vif plugged event
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1403455 (MUST FIX) - neutron netns cleanup does not cleanup all subprocesses
 +
** https://review.openstack.org/402140
 +
 
 +
* Live migration
 +
** https://review.openstack.org/#/c/286855/ (Tempest change) - Needs work - must merge before Nova patch
 +
** https://review.openstack.org/275420 (Merged) - Neutron Server side DVR change
 +
** https://review.openstack.org/#/c/260738/ (Merged) - L3 Agent side patch
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1631513 (MUST FIX) - Race condition in update_gateway_port when two simultaneous router update occurs for the same router
 +
** https://review.openstack.org/#/c/385617/ (Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1641535 (MUST FIX) - FIP failed to remove in router's standby node
 +
** https://review.openstack.org/#/c/397092/
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1612192 (MUST FIX) - L3 DVR: Unable to complete operation on subnet (Related to HA)
 +
** Not sure this is critical any more, as it's mostly seen in the SFC and OVN jobs, ~0 in the dvr-multinode-full job
 +
 
 +
 
 +
*  https://bugs.launchpad.net/neutron/+bug/1593354 (MUST FIX) - SNAT HA failed due to missing NAT rule and sg- interface. (May be a duplicate)
 +
** Not reproducible with Newton, might be only Mitaka (Not Seen in Mitaka so closed)
  
* https://bugs.launchpad.net/neutron/+bug/1524291 - code duplication (low)
+
* https://bugs.launchpad.net/neutron/+bug/1531065 - (Expired) duplicate fetch subnet_id in get_subnet_for_dvr (GOOD TO HAVE)
* https://bugs.launchpad.net/neutron/+bug/1524020 - DVR ARP update call count can be reduced (medium)
 
* https://bugs.launchpad.net/neutron/+bug/1522824 - test_shelve_instance test failure
 
** https://review.openstack.org/#/c/253569/ - Set PENDING_BUILD status for ports after update in DB - needs another +2
 
  
== Bugs from last week ==
+
* https://bugs.launchpad.net/neutron/+bug/1585165 (SHOULD FIX) - Floating ip not reachable after VM live migration (Seen in Liberty with DVR) - Seems to me like a duplicate but should triage it more on the symptoms.
  
* https://bugs.launchpad.net/neutron/+bug/1521815 - intermittent failure, hard to reproduce
+
* https://bugs.launchpad.net/neutron/+bug/1625333 (TRIAGED) - FloatingIP GARP fails. Log seen in the l3 agent and VM not able to ping
* https://bugs.launchpad.net/neutron/+bug/1521846 - metering failure, fixed
+
** Need a reproducer, only on RHEL
* https://bugs.launchpad.net/neutron/+bug/1521524 - metadata failure, fixed
 
* https://bugs.launchpad.net/neutron/+bug/1521820 - "leaking" of FIP namespace in tests, fixed
 
  
== Categorized Bugs ==
+
* https://bugs.launchpad.net/neutron/+bug/1580648 (FIXED) - Two HA routers in master state during functional test
== Gate Test Failures ==
+
** Patch already pushed in but the issue is still seen. (Not able to reproduce)
**https://bugs.launchpad.net/neutron/+bug/1522824 ( High)
 
**https://bugs.launchpad.net/neutron/+bug/1521815 ( Low)
 
**https://bugs.launchpad.net/neutron/+bug/1450604 (Medium)
 
  
== Existing Functionality Broken Bugs ==
+
* https://bugs.launchpad.net/neutron/+bug/1602614 (MUST FIX) - DVR+L3 HA Loss during failover is higher
**https://bugs.launchpad.net/neutron/+bug/1456073 ( High)
 
**https://bugs.launchpad.net/neutron/+bug/1462154 (High)
 
**https://bugs.launchpad.net/neutron/+bug/1445255 ( Low)
 
**https://bugs.launchpad.net/neutron/+bug/1499785 ( Low)
 
**https://bugs.launchpad.net/neutron/+bug/1499787 (Low)
 
  
== Scale and Performance Impact Bugs ==
+
* https://bugs.launchpad.net/neutron/+bug/1456073 (High) - Block Migration with FIP breaks with DVR (MUST FIX)
**https://bugs.launchpad.net/neutron/+bug/1513678 (High)
+
** https://review.openstack.org/275073 (Nova side change) - Needs review
**https://bugs.launchpad.net/neutron/+bug/1524020 ( Medium)
 
  
== New Features Bugs ==
+
* https://bugs.launchpad.net/neutron/+bug/1462154 (With DVR Pings to floating IPs replied with fixed-ips if VMs are on the same network)
**https://bugs.launchpad.net/neutron/+bug/1365473 (WishList ) - Needs attention
+
** https://review.openstack.org/246855 (Merged) (Reverted)
  **https://review.openstack.org/#/c/143169/ ( Needs core blessings)
 
**https://bugs.launchpad.net/neutron/+bug/1450067 ( Low)
 
  
== Refactor or Cleanup Bugs ==
+
=== Performance/Scalability ===
**https://bugs.launchpad.net/neutron/+bug/1524291
 
====High Priority in-progress====
 
  
* https://bugs.launchpad.net/neutron/+bug/1462154 With DVR Pings to floating IPs replied with fixed-ips
+
=== Gate (haleyb) ===
** https://review.openstack.org/246855
+
==== Failures ====
** https://review.openstack.org/246894
+
Neutron Failure Rate dashboard - http://grafana.openstack.org/dashboard/db/neutron-failure-rate
** Doesn't seem to be making progress?
+
* The DVR/LinuxBridge/Multi-node failure rate was high yesterday for the non-voting jobs (April 4th)
* https://bugs.launchpad.net/neutron/+bug/1505575 Fatal memory consumption by neutron-server with DVR at scale
+
** Need to investigate why to make sure nothing is lurking (might cover in L3 meeting as well)
** https://review.openstack.org/#/c/234067/ (sync in chunks)
 
* https://bugs.launchpad.net/neutron/+bug/1513678 At scale router scheduling takes a long time with DVR routers with multiple compute nodes hosting thousands of VMs
 
** https://review.openstack.org/241843 already merged
 
** two other related changes
 
  
====Other in-progress====
+
* Bugs in Nova:
 +
** https://bugs.launchpad.net/nova/+bug/1524898 (Volume based live migration aborted unexpectedly)
 +
** https://bugs.launchpad.net/nova/+bug/1535232 (live-migration ci failure on nfs shared storage)
  
* https://bugs.launchpad.net/neutron/+bug/1508869 - handle port host change (live migration)
+
==== Jobs ====
** https://review.openstack.org/#/c/238478/
+
* DVR+HA multinode job is now run for all patches, currently non-voting
 +
** https://review.openstack.org/#/c/455406/
 +
** Will monitor and make sure it's stable
  
* https://bugs.launchpad.net/neutron/+bug/1456073 - Live Migration Bug
+
=== Stable backports (haleyb) ===
** related bug https://bugs.launchpad.net/neutron/+bug/1414559 with fixes:
+
==== General information ====
*** in neutron https://review.openstack.org/#/c/246898/
+
Ihar created a page tracking all the potential backports from Mitaka to the stable releases. I have been going through it with the help of Swami to get stable/liberty, well, more stable. Bugs are removed from the list as they merge.
*** and nova https://review.openstack.org/#/c/246910/
+
* https://etherpad.openstack.org/p/stable-bug-candidates-from-master
  
=== Discuss a plan to address the gate failures ===
+
We need to continue to be aggressive at proactively backporting fixes to the stable branches
* Current failure rate: https://goo.gl/L1WODG
+
* http://docs.openstack.org/project-team-guide/stable-branches.html#proactive-backports
** Multinode was broken by https://review.openstack.org/#/c/233711/ and  fixed by https://review.openstack.org/#/c/245697/
 
** All the jobs except multinode-dvr appear to have stabilized at similar failure rates. regXboi to consult with armax, mestery and dougwig about returning these jobs to voting next week.
 
* Debug the gate in the current failure state (Identify owners who can own this)
 
* https://bugs.launchpad.net/neutron/+bug/1515360 - Tempest "SSHTimeout" failures
 
** https://review.openstack.org/#/c/247748/ (add debugging code to l3-agent)
 
*https://drive.google.com/file/d/0B4kh-7VVPWlPMkdrQWFFdjNsSnM/view?usp=sharing - Logstash failures captured. Any logstash errors can be captured here.
 
  
=== Performance/Scalability (obondarev) ===
+
This is a list of bugs that have a fix committed to the master branch and are tagged with 'neutron-proactive-backport-potential+l3-dvr-backlog'
* https://blueprints.launchpad.net/neutron/+spec/improve-dvr-l3-agent-binding
+
* https://googl/sx0KL5
** Started https://review.openstack.org/#/c/254837/  
 
  
 
=== Open Discussion ===
 
=== Open Discussion ===
 +
* ???
  
 
== Meeting commands ==
 
== Meeting commands ==
 
<nowiki>/join #openstack-meeting-alt</nowiki><br />
 
<nowiki>/join #openstack-meeting-alt</nowiki><br />
 
<nowiki>#startmeeting neutron_dvr</nowiki><br />
 
<nowiki>#startmeeting neutron_dvr</nowiki><br />
 +
<nowiki>#chair Swami</nowiki><br />
 
<nowiki>#topic Announcements</nowiki><br />
 
<nowiki>#topic Announcements</nowiki><br />
 
<nowiki>#undo topic</nowiki><br />
 
<nowiki>#undo topic</nowiki><br />
 
<nowiki>#link https://wiki.openstack.org/wiki/Meetings/Neutron-DVR</nowiki><br />
 
<nowiki>#link https://wiki.openstack.org/wiki/Meetings/Neutron-DVR</nowiki><br />
<nowiki> #action haleyb will get something specific done this week</nowiki><br />
+
<nowiki>#action haleyb will get something specific done this week</nowiki><br />
<nowiki> #chair Swami</nowiki>
 
<br />
 
 
...
 
...
 
<br />
 
<br />
 
<nowiki>#endmeeting</nowiki><br />
 
<nowiki>#endmeeting</nowiki><br />

Latest revision as of 14:55, 29 June 2017

The OpenStack Networking L3 DVR Sub-team holds public meetings as advertised on OpenStack IRC Meetings Calendar. If you are unable to attend, please check the most recent logs.

Meetings

  • Weekly on Thursday combined with L3 Meeting at 1500 UTC
  • IRC channel: #openstack-meeting-3 on freenode
  • Chair: Brian Haley (haleyb), Swaminathan Vasudevan (Swami)
  • Meetings, with their notes and logs, will be found under http://eavesdrop.openstack.org/meetings/neutron_dvr

Agenda

Meeting June 29th, 2017

Announcements (haleyb)

  • When adding items below, I'd like feedback on whether each is a MUST FIX, SHOULD FIX, or GOOD TO HAVE, and whether it is NEEDS TRIAGE or TRIAGED
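
As an illustration of how the labels above could be checked mechanically, here is a minimal Python sketch that pulls the bug number and any of the five labels out of one agenda line. The function name `parse_agenda_line` and the returned dictionary layout are hypothetical, chosen only for this example; the label list is taken from the announcement, and matching is case-insensitive because the agenda mixes spellings such as "NEEDS TRIAGE" and "Needs Triage".

```python
import re

# Labels named in the announcement; anything else in parentheses
# (for example "CONFIRMED" or "Merged") is ignored by this sketch.
LABELS = ("MUST FIX", "SHOULD FIX", "GOOD TO HAVE", "NEEDS TRIAGE", "TRIAGED")

# Launchpad bug links as they appear on this page, e.g.
# https://bugs.launchpad.net/neutron/+bug/1701288
BUG_URL = re.compile(r"https://bugs\.launchpad\.net/(\w+)/\+bug/(\d+)")
LABEL = re.compile(r"\((" + "|".join(LABELS) + r")\)", re.IGNORECASE)


def parse_agenda_line(line):
    """Return project, bug number and labels for one agenda line.

    Returns None when the line carries no Launchpad bug link.
    (Hypothetical helper, not part of any OpenStack tooling.)
    """
    match = BUG_URL.search(line)
    if match is None:
        return None
    labels = [label.upper() for label in LABEL.findall(line)]
    return {
        "project": match.group(1),
        "bug": int(match.group(2)),
        "labels": labels,
    }
```

A line with an empty `labels` list is one that still needs a priority or triage label added by whoever put it on the agenda.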

Topics for Discussion

Bugs (Swami)

All DVR bugs should be tagged and listed here: https://bugs.launchpad.net/neutron/+bugs?field.tag=l3-dvr-backlog
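
For anyone scripting against the tag listing, the link above can be rebuilt from its parts. The following is a small Python sketch under stated assumptions: the helper name `tag_query_url` is hypothetical, and the optional `field.tags_combinator` parameter reflects how Launchpad search URLs appear to combine several tags, which should be verified against Launchpad before relying on it. Only the single-tag form is confirmed by the link on this page.

```python
from urllib.parse import urlencode

# Base of the bug listing linked above.
BASE = "https://bugs.launchpad.net/neutron/+bugs"


def tag_query_url(*tags, combinator=None):
    """Build a tag-filtered bug listing URL (hypothetical helper).

    Several tags are joined with spaces, which urlencode() writes as "+".
    `combinator` is an assumed Launchpad parameter (for example "ALL").
    """
    params = {"field.tag": " ".join(tags)}
    if combinator is not None:
        params["field.tags_combinator"] = combinator
    return BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Reproduces the link given in the agenda.
    print(tag_query_url("l3-dvr-backlog"))
```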

Bugs That Need to be Closed

  • None

New Bugs this week

High Priority Bugs in progress

Categorized Bugs

RFE

Existing Functionality Broken Bugs

Scale and Performance Impact Bugs

New Features Bugs

Refactor or Cleanup Bugs

WishList Bugs

WatchList Bugs

Bugs Closed Recently

  • https://bugs.launchpad.net/neutron/+bug/1612192 (MUST FIX) - L3 DVR: Unable to complete operation on subnet (Related to HA)
    • Not sure this is critical any more, as it's mostly seen in the SFC and OVN jobs, ~0 in the dvr-multinode-full job

Performance/Scalability

Gate (haleyb)

Failures

Neutron Failure Rate dashboard - http://grafana.openstack.org/dashboard/db/neutron-failure-rate

  • The DVR/LinuxBridge/Multi-node failure rate was high yesterday for the non-voting jobs (April 4th)
    • Need to investigate why to make sure nothing is lurking (might cover in L3 meeting as well)

Jobs

Stable backports (haleyb)

General information

Ihar created a page tracking all the potential backports from Mitaka to the stable releases. I have been going through it with the help of Swami to get stable/liberty, well, more stable. Bugs are removed from the list as they merge.

We need to continue to be aggressive at proactively backporting fixes to the stable branches

This is a list of bugs that have a fix committed to the master branch and are tagged with 'neutron-proactive-backport-potential+l3-dvr-backlog'

Open Discussion

  •  ???

Meeting commands

/join #openstack-meeting-3
#startmeeting neutron_dvr
#chair Swami
#topic Announcements
#undo topic
#link https://wiki.openstack.org/wiki/Meetings/Neutron-DVR
#action haleyb will get something specific done this week
...
#endmeeting