- 1 Meetings
- 2 Agenda
- 2.1 Meeting August 31st, 2016
- 2.2 Topics for Discussion
- 2.2.1 Bugs (Swami)
- 2.2.1.1 New Bugs this week
- 2.2.1.2 Categorized Bugs
- 2.2.1.3 Gate Test Failures
- 2.2.1.4 RFE
- 2.2.1.5 Existing Functionality Broken Bugs
- 2.2.1.6 Scale and Performance Impact Bugs
- 2.2.1.7 New Features Bugs
- 2.2.1.8 Refactor or Cleanup Bugs
- 2.2.1.9 WishList Bugs
- 2.2.1.10 WatchList Bugs
- 2.2.1.11 Bugs Closed Recently
- 2.2.2 Performance/Scalability
- 2.2.3 Gate failures (haleyb)
- 2.2.4 Stable backports (haleyb)
- 2.2.5 Open Discussion
- 2.3 Meeting commands
- Weekly on Wednesday at 1500 UTC
- IRC channel:
- Chair: Brian Haley (haleyb), Swaminathan Vasudevan (Swami)
- Meetings, with their notes and logs, will be found under http://eavesdrop.openstack.org/meetings/neutron_dvr
Meeting August 31st, 2016
- When adding items below I'd like to try to get feedback on whether they are a MUST FIX, SHOULD FIX, or GOOD TO HAVE
Topics for Discussion
All DVR bugs should be tagged and listed here: https://bugs.launchpad.net/neutron/+bugs?field.tag=l3-dvr-backlog
New Bugs this week
No new bugs this week.
- https://bugs.launchpad.net/neutron/+bug/1612192 (MUST FIX) - L3 DVR: Unable to complete operation on subnet (Related to HA)
- https://bugs.launchpad.net/neutron/+bug/1612804 (MUST FIX) - test_shelve_instance fails with SSHTimeout (one instance of failure seen in gate)
- https://bugs.launchpad.net/neutron/+bug/1403455 (MUST FIX) - neutron netns cleanup does not clean up all subprocesses.
- https://bugs.launchpad.net/neutron/+bug/1597461 (MUST FIX) - Two masters after reboot of controller when HA enabled. Also seen with DVR
- https://bugs.launchpad.net/neutron/+bug/1602320 (SHOULD FIX) - Keepalive process kill vrrp child process with l3/dvr/ha
- https://bugs.launchpad.net/neutron/+bug/1606741 (SHOULD FIX) - Metadata error with dvr_snat on compute hosts.
- https://review.openstack.org/352686 (Needs review)
- https://bugs.launchpad.net/neutron/+bug/1595043 (SHOULD FIX) - Generic DVR portbinding useful for HA ports.
- https://review.openstack.org/#/c/255237 (Needs review)
- https://bugs.launchpad.net/neutron/+bug/1602614 (MUST FIX) - DVR+L3 HA Loss during failover is higher (Need to triage)
- https://bugs.launchpad.net/neutron/+bug/1593354 (MUST FIX) - SNAT HA failed due to missing NAT rule and sg- interface. (May be a duplicate)
- https://bugs.launchpad.net/neutron/+bug/1596473 (MUST FIX) - Packet loss with DVR and IPv6 (Need to triage)
- https://bugs.launchpad.net/neutron/+bug/1506567 (SHOULD FIX) - No information from metering agent
Gate Test Failures
More functional test failures seen in the gate.
RFE
- https://bugs.launchpad.net/neutron/+bug/1563879 (RFE)
- https://bugs.launchpad.net/neutron/+bug/1557290 (RFE)
- https://bugs.launchpad.net/neutron/+bug/1577488 (RFE)
- https://bugs.launchpad.net/neutron/+bug/1583694 (RFE) (This RFE may not move forward in the newton time frame, but we might have further discussion during the mid-cycle meetup)
Existing Functionality Broken Bugs
- https://bugs.launchpad.net/neutron/+bug/1590041 (SHOULD FIX) - Regression, snat_namespace object creation ahead of time does not solve some external_gateway_update condition.
- https://bugs.launchpad.net/neutron/+bug/1583266 (SHOULD FIX) - watch_log_file=True causes fip to slow down. Symptom seen more with DVR routers.
- https://bugs.launchpad.net/neutron/+bug/1585165 (SHOULD FIX) - Floating ip not reachable after VM live migration (Seen in Liberty with DVR) - Seems to me like a duplicate but should triage it more on the symptoms.
- https://bugs.launchpad.net/neutron/+bug/1571676 (MUST FIX) - Static routes not added to FloatingIP Namespace.
- This approach can't really work because adding a static route in the FIP namespace for one tenant could impact traffic for another tenant.
- https://bugs.launchpad.net/neutron/+bug/1564776 - (MUST FIX) (Snat Namespace error when namespace deleted)
- https://bugs.launchpad.net/neutron/+bug/1564757 - (MUST FIX) (FloatingIPError while agent restarts)
- https://bugs.launchpad.net/neutron/+bug/1526855 - (SHOULD FIX) (Metadata error)
- https://bugs.launchpad.net/neutron/+bug/1541406 - IPv6 Prefix Delegation does not work with DVR (SHOULD FIX)
- https://review.openstack.org/#/c/277657/ (Patch) - Needs update to create a better abstraction that allows clients to get the namespace for the right part of the router.
- https://bugs.launchpad.net/neutron/+bug/1456073 (High) - Block Migration with FIP breaks with DVR (MUST FIX)
- https://review.openstack.org/275073 (Nova side change) - Needs review
- https://review.openstack.org/#/c/286855/ (Tempest change) - Needs work - must merge before Nova patch
- https://review.openstack.org/275420 (Merged) - Neutron Server side DVR change
- https://review.openstack.org/#/c/260738/ (Merged) - L3 Agent side patch
- https://bugs.launchpad.net/neutron/+bug/1414559 (OVS drops RARP packets)
- https://bugs.launchpad.net/neutron/+bug/1557290 - (MUST FIX) DVR FIP agent gateway does not pass traffic directed at fixed IP
- https://bugs.launchpad.net/neutron/+bug/1447227 (Low) - Connecting two or more distributed routers to a subnet. (No one is working on it) (GOOD TO HAVE)
- https://bugs.launchpad.net/neutron/+bug/1552070 (GOOD TO HAVE) - Functional test changes to address dual stack for DVR
Scale and Performance Impact Bugs
New Features Bugs
- https://bugs.launchpad.net/neutron/+bug/1557290 (SHOULD FIX) - DVR not able to forward traffic to private IP from FIP namespace
- https://bugs.launchpad.net/neutron/+bug/1504039 - LinuxBridge DVR (Spec & code)
Refactor or Cleanup Bugs
- https://bugs.launchpad.net/neutron/+bug/1531065 - (Low) duplicate fetch subnet_id in get_subnet_for_dvr (GOOD TO HAVE)
- https://review.openstack.org/#/c/263563/ - (Needs review)
- https://bugs.launchpad.net/neutron/+bug/1538369 - Refactor request (Not a real bug)
- https://bugs.launchpad.net/neutron/+bug/1518819 - Ability to specify a gateway when adding a subnet without cidr (GOOD TO HAVE)
- https://bugs.launchpad.net/neutron/+bug/1529439 - unify validate_agent_router_combination exceptions for dvr agent_mode (GOOD TO HAVE)
- https://bugs.launchpad.net/neutron/+bug/1450067 (Low) - ML2 and L3 plugin exposes dvr extension even if ovs is unused. (No owners yet) (GOOD TO HAVE)
WatchList Bugs
- https://bugs.launchpad.net/neutron/+bug/1444014 (Incomplete) (StaleDataError) - on Watch
- https://bugs.launchpad.net/neutron/+bug/1521815 (Low) - (DVR functional Job failure) - on Watch
Bugs Closed Recently
- https://bugs.launchpad.net/neutron/+bug/1599287 (MUST FIX) - Cleanup of ip rule and tables for stale snat
- https://review.openstack.org/337855 (merged)
- https://bugs.launchpad.net/neutron/+bug/1609540 (MUST FIX) - CSNAT port fails due to no fixed ips. (partial workaround proposed).
- https://review.openstack.org/350783 (Merged)
- https://bugs.launchpad.net/neutron/+bug/1602794 (MUST FIX) - Itemallocator class can throw a ValueError
- https://bugs.launchpad.net/neutron/+bug/1599089 (MUST FIX) - Floatingip move within the same node does not handle fixed ip properly
- https://bugs.launchpad.net/neutron/+bug/1597561 (MUST FIX) - Occurrences of Duplicate fg port in fip namespace
- https://bugs.launchpad.net/neutron/+bug/1569918 (MUST FIX) (Allowed_address_pair delayed FIP association)
- https://bugs.launchpad.net/neutron/+bug/1462154 (With DVR Pings to floating IPs replied with fixed-ips if VMs are on the same network)
- https://review.openstack.org/246855 (Merged) (Reverted)
Gate failures (haleyb)
- Current failure rate: https://goo.gl/L1WODG
- Functional test failures are seen in the check queue; the root cause is yet to be identified.
- Single-node check job failure rates are higher than those of the neutron full job.
- A VM failing to get a DHCP address is the typical problem.
- Bugs in Nova:
- Old bug to help in debugging the gate
- https://bugs.launchpad.net/neutron/+bug/1515360 - Tempest "SSHTimeout" failures (only for debugging)
Stable backports (haleyb)
Ihar created a page tracking all the potential backports from Mitaka to the stable releases. I have been going through it with the help of Swami to get stable/liberty, well, more stable. Bugs are removed from the list as they merge.
We need to continue to be aggressive about proactively backporting fixes to the stable branches.
This is a list of bugs that have a fix committed to the master branch and are tagged with 'neutron-proactive-backport-potential+l3-dvr-backlog' (the wiki thinks the link is spam so I had to put a space in it)
- https://goo. gl/sx0KL5
Wanted to talk about some related patches, so that we can come up with a good answer going forward. These two do similar things around the ip_lib.get_devices() code:
- https://review.openstack.org/#/c/348372 (L3 agent: check router namespace existence before delete)
- https://review.openstack.org/#/c/309050/ (Check if namespace exists before getting devices)
And this is a change that also checks for existence so as not to throw exceptions:
- https://review.openstack.org/#/c/326729/ (DVR: Clean stale snat-ns by checking its existence when agent restarts)
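As a starting point for that discussion, here is a minimal sketch of the pattern the three patches share: check that the namespace exists before operating on it, and still tolerate it disappearing between the check and the call. This is not the code from any of the reviews above. The helper name get_devices_safely and the injected callables list_namespaces and list_devices are hypothetical stand-ins for whatever ip_lib provides, and the RuntimeError is an assumed failure type.

```python
from typing import Callable, Iterable, List


def get_devices_safely(
    namespace: str,
    list_namespaces: Callable[[], Iterable[str]],
    list_devices: Callable[[str], List[str]],
) -> List[str]:
    """Return the devices in `namespace`, or [] if the namespace is gone.

    Both callables are hypothetical stand-ins for the real lookups.
    """
    # Check existence first so the common "already deleted" case
    # returns quietly instead of raising.
    if namespace not in set(list_namespaces()):
        return []
    try:
        return list_devices(namespace)
    except RuntimeError:
        # The namespace can still vanish between the check and the
        # call, so the lookup itself is guarded as well (the exception
        # type here is an assumption).
        return []
```

If a single helper along these lines lived in one place, the router delete path and the stale snat-ns cleanup could both call it instead of each patch adding its own check.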
Need to escalate the nova patch for the live migration. Since we have the tempest test running right now, we need someone from the nova team to take a look at the nova live migration patch.
#action haleyb will get something specific done this week