
Difference between revisions of "Meetings/Neutron-DVR"

 
(28 intermediate revisions by 2 users not shown)
Line 3: Line 3:
 
= Meetings =
 
  
* Weekly on Wednesday at 1500 UTC
+
* Weekly on Thursday combined with L3 Meeting at 1500 UTC
* IRC channel: <code><nowiki>#openstack-meeting-alt</nowiki></code> on freenode
+
* IRC channel: <code><nowiki>#openstack-meeting-3</nowiki></code> on freenode
 
* Chair: Brian Haley (haleyb), Swaminathan Vasudevan (Swami)
 
* Meetings, with their notes and logs, will be found under http://eavesdrop.openstack.org/meetings/neutron_dvr
Line 10: Line 10:
 
= Agenda =
 
  
== Meeting January 11th, 2017 ==
+
== Meeting June 29th, 2017 ==
  
 
=== Announcements (haleyb) ===
 
Line 25: Line 25:
  
 
==== New Bugs this week ====
 
* https://bugs.launchpad.net/neutron/+bug/1653633 (INCOMPLETE) - fwaas v1 with DVR: l3 agent can't restore the NAT rules for floatingIP (Can't be reproduced in the master branch)
+
 
** not sure it's FWaaS-specific
+
* https://bugs.launchpad.net/neutron/+bug/1701288 (NEEDS TRIAGE) - rpc loop timeout when l2pop calls _create_agent_fdb in large scale deployments.
 +
* https://bugs.launchpad.net/neutron/+bug/1698517 (CONFIRMED) (MUST FIX) - Issue is only seen in Newton and Ocata and not in master.
 +
** https://review.openstack.org/#/c/475108/
 +
* https://bugs.launchpad.net/neutron/+bug/1695140 (NEEDS TRIAGE) - Floatingip VM cannot ping to non floatingip VM. (Might be due to a regression that was introduced.)
  
 
==== High Priority Bugs in progress ====
 
* https://bugs.launchpad.net/neutron/+bug/1647432 (SHOULD FIX) - Multiple SIGHUPs to keepalived might trigger re-election
 
** https://review.openstack.org/#/c/407099/
 
  
* https://bugs.launchpad.net/neutron/+bug/1644231 (TRIAGED) - fip router config is not created if the vm ports attached to FIPs have no device_owner
+
* https://bugs.launchpad.net/neutron/+bug/1682228 - Cross Address scopes traffic with floatingip limitation in DVR
** Needs doc update saying "compute:" prefix needed in ports manually-created for instances
 
  
* https://bugs.launchpad.net/neutron/+bug/1644415 (SHOULD FIX) - DVR-SNAT mode with HA does not clean up the fip route rule in namespace
+
* https://bugs.launchpad.net/neutron/+bug/1682345 - Neutron Metering agent RPC need to send notification to specific hosts that are configured
** https://review.openstack.org/#/c/404571/ (Needs review)
 
  
* https://bugs.launchpad.net/neutron/+bug/1632540 (NEEDS TRIAGE) - L3 agent logs fills up quick (stable/mitaka tag 8.1.2 - need more info on reproducer)
+
* https://bugs.launchpad.net/neutron/+bug/1672345 (NEEDS TRIAGE) - Symptoms look like it is a duplicate of 'allowed-address-pairs'
  
* https://bugs.launchpad.net/neutron/+bug/1629539 (NEEDS TRIAGE) - Reported against Mitaka and LBaaSv1
+
* https://bugs.launchpad.net/neutron/+bug/1657981 (Needs Triage) - FloatingIP not reachable after the compute node restarts while the agent is down
  
* https://bugs.launchpad.net/neutron/+bug/1612804 (MUST FIX) - test_shelve_instance fails with sshtimeout (One instance of failure seen in gate)
+
* https://bugs.launchpad.net/neutron/+bug/1644231 (TRIAGED) - fip router config is not created if the vm ports attached to FIPs have no device_owner
** Lowered to High since not seen recently
+
** Needs help message update saying "compute:" prefix needed in ports manually-created for instances
 +
** https://review.openstack.org/#/c/425919/ (NEEDS REVIEW)
  
 
* https://bugs.launchpad.net/neutron/+bug/1606741 (SHOULD FIX) - Metadata error with dvr_snat on compute hosts.
 
** Lowered to Medium
** https://review.openstack.org/352686 (Expired)
 
* https://bugs.launchpad.net/neutron/+bug/1506567 (SHOULD FIX) - No information from metering agent
 
** https://review.openstack.org/377108 (Needs review)
 
 
* https://bugs.launchpad.net/neutron/+bug/1571676 (Re-opened) - After binding a floating IP to VM, the static route can't work in DVR
 
** https://review.openstack.org/#/c/308068/ (Needs review)
 
  
 
==== Categorized Bugs ====
 
Line 59: Line 53:
 
==== RFE ====
 
* https://bugs.launchpad.net/neutron/+bug/1563879 (RFE) - DVR should route packets to Instances behind the L2 Gateway
* https://bugs.launchpad.net/neutron/+bug/1557290 (RFE) - DVR FIP agent gateway does not pass traffic directed at fixed IP
 
  
 
* https://bugs.launchpad.net/neutron/+bug/1577488 (RFE) - "Fast exit" for compute node egress flows when using DVR
 
** https://review.openstack.org/#/c/283757/ (Needs Review)
+
** https://review.openstack.org/#/c/283757/ (Merged)
 
** https://review.openstack.org/#/c/355062/ (Needs Review)
 
  
 
* https://bugs.launchpad.net/neutron/+bug/1583694 (RFE) - DVR support for Allowed_address_pair port that are bound to multiple ACTIVE VM ports
 
** https://review.openstack.org/#/c/320669/ (WIP)(Server side patch)
+
** https://review.openstack.org/#/c/437970/ (Needs review) - Server side patch
** https://review.openstack.org/#/c/323618/ (WIP)(Agent side patch)
+
** https://review.openstack.org/#/c/437986/ (Needs review) - Agent side patch
  
 
==== Existing Functionality Broken Bugs ====
 
* https://bugs.launchpad.net/neutron/+bug/1526855 - (SHOULD FIX) - VMs fail to get metadata in large scale environments
 
* https://bugs.launchpad.net/neutron/+bug/1541406 - IPv6 Prefix Delegation does not work with DVR (SHOULD FIX)
 
** https://review.openstack.org/#/c/277657/ (Patch) - Needs update to create a better abstraction that allows clients to get the namespace for the right part of the router.
 
 
* https://bugs.launchpad.net/neutron/+bug/1414559 (OVS drops RARP packets)
 
** in neutron https://review.openstack.org/#/c/246898/ (MERGED) Notify vif plugged event
 
** and nova https://review.openstack.org/#/c/246910/ (not making progress) Notify vif plugged event
 
  
 
* https://bugs.launchpad.net/neutron/+bug/1447227 (GOOD TO HAVE) - Connecting two or more distributed routers to a subnet. (No one is working on it)
 
Line 86: Line 72:
  
 
==== New Features Bugs ====
 
* https://bugs.launchpad.net/neutron/+bug/1557290 (SHOULD FIX) - DVR not able to forward traffic to private IP from FIP namespace
 
* https://bugs.launchpad.net/neutron/+bug/1504039 - LinuxBridge DVR (Spec & code)
 
  
 
==== Refactor or Cleanup Bugs ====
 
Line 94: Line 78:
 
==== WishList Bugs ====
 
* https://bugs.launchpad.net/neutron/+bug/1518819 - Ability to specify a gateway when adding a subnet without cidr (GOOD TO HAVE)
* https://bugs.launchpad.net/neutron/+bug/1450067 (Low) - ML2 and L3 plugin exposes dvr extension even if ovs is unused. (No owners yet) (GOOD TO HAVE)
+
* https://bugs.launchpad.net/neutron/+bug/1450067 (Low) - ML2 and L3 plugin exposes dvr extension even if OVS is unused. (No owners yet) (GOOD TO HAVE)
 
* https://bugs.launchpad.net/neutron/+bug/1476469 (TRIAGE ME) - with DVR, a VM can't use floating IP and VPN at the same time
 
** This is not a bug, since floating IP can't be used with the VPN
 +
 +
==== WatchList Bugs ====
 +
 +
==== Bugs Closed Recently ====
 +
 +
* https://bugs.launchpad.net/neutron/+bug/1632540 (NEEDS TRIAGE) - L3 agent logs fills up quick (stable/mitaka tag 8.1.2 - need more info on reproducer)
 +
** https://review.openstack.org/434863 (Merged) - this will fix some of the issue
 +
 +
* https://bugs.launchpad.net/neutron/+bug/1504039 - LinuxBridge DVR (Spec & code) - Expired
  
 
* https://bugs.launchpad.net/neutron/+bug/1620824 (TRIAGED) - Neutron DVR(SNAT) steals FIP traffic
 
** https://review.openstack.org/#/c/366297/ (needs review) (Currently not using reference L2pop and using tcp_loose=0).
+
** https://review.openstack.org/#/c/366297/ (Merged) (Currently not using reference L2pop and using tcp_loose=0)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1612804 (MUST FIX) - test_shelve_instance fails with sshtimeout (One instance of failure seen in gate)
 +
** Closed since not seen recently
  
==== WatchList Bugs ====
+
* https://bugs.launchpad.net/neutron/+bug/1597461 (FIXED) - Two masters after reboot of controller when HA enabled. Also seen with DVR
 +
** https://review.openstack.org/357458 (merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1675187 (MUST FIX) - Floating IPs not removed on rfp interface in qrouter
 +
** https://review.openstack.org/#/c/451859/ (Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1647432 (SHOULD FIX) - Multiple SIGHUPs to keepalived might trigger re-election
 +
** https://review.openstack.org/#/c/407099/ (Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1653633 (EXPIRED) - fwaas v1 with DVR: l3 agent can't restore the NAT rules for floatingIP (Can't be reproduced in the master branch)
 +
** not sure it's FWaaS-specific
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1668277 (SHOULD FIX) - When enable l2pop, after deleting the DVR port, the associated flow entries still exists
 +
** https://review.openstack.org/#/c/438516 (Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1629539 (NEEDS TRIAGE) - Broken distributed virtual router w/ lbaas v1
 +
** Against Mitaka, expired
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1658060 (Needs triage) - FirewallNotFound exceptions when deleting the firewall in FWaaS-DVR
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1541406 - IPv6 Prefix Delegation does not work with DVR (SHOULD FIX)
 +
** https://review.openstack.org/#/c/277657/ (MERGED)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1571676 (Re-opened) - After binding a floating IP to VM, the static route can't work in DVR
 +
** https://review.openstack.org/#/c/308068/ (Patch Merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1557290 (SHOULD FIX) - DVR not able to forward traffic to private IP from FIP namespace
 +
** Duplicate of Fast-exit RFE
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1654991 (SHOULD FIX) Allow all migration of routers
 +
** https://review.openstack.org/#/c/376550/ (Patch merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1506567 (SHOULD FIX) - No information from metering agent
 +
** https://review.openstack.org/377108 (Patch merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1644415 (SHOULD FIX) - DVR-SNAT mode with HA does not clean up the fip route rule in namespace
 +
** https://review.openstack.org/#/c/404571/ (Patch merged)
 +
 
 +
* https://bugs.launchpad.net/neutron/+bug/1414559 (OVS drops RARP packets)
 +
** in neutron https://review.openstack.org/#/c/246898/ (MERGED) Notify vif plugged event
 +
** and nova https://review.openstack.org/#/c/246910/ (not making progress) Notify vif plugged event
  
==== Bugs Closed Recently ====
 
 
* https://bugs.launchpad.net/neutron/+bug/1403455 (MUST FIX) - neutron netns cleanup does not cleanup all subprocesses
 
** https://review.openstack.org/402140
Line 113: Line 148:
  
 
* https://bugs.launchpad.net/neutron/+bug/1631513 (MUST FIX) - Race condition in update_gateway_port when two simultaneous router update occurs for the same router
 
** https://review.openstack.org/#/c/385617/ ( Merged)
+
** https://review.openstack.org/#/c/385617/ (Merged)
  
 
* https://bugs.launchpad.net/neutron/+bug/1641535 (MUST FIX) - FIP failed to remove in router's standby node
 
Line 136: Line 171:
 
* https://bugs.launchpad.net/neutron/+bug/1580648 (FIXED) - Two HA routers in master state during functional test
 
** Patch already pushed in but the issue is still seen. (Not able to reproduce)
 
* https://bugs.launchpad.net/neutron/+bug/1602320 (FIXED) - Keepalive process kill vrrp child process with l3/dvr/ha
 
** https://review.openstack.org/#/c/342730/
 
  
 
* https://bugs.launchpad.net/neutron/+bug/1602614 (MUST FIX) - DVR+L3 HA Loss during failover is higher
 
Line 144: Line 176:
 
* https://bugs.launchpad.net/neutron/+bug/1456073 (High) - Block Migration with FIP breaks with DVR (MUST FIX)
 
** https://review.openstack.org/275073 (Nova side change) - Needs review
 
* https://bugs.launchpad.net/neutron/+bug/1597461 (FIXED) - Two masters after reboot of controller when HA enabled. Also seen with DVR
 
** https://review.openstack.org/357458 (merged)
 
  
 
* https://bugs.launchpad.net/neutron/+bug/1462154 (With DVR Pings to floating IPs replied with fixed-ips if VMs are on the same network)
 
Line 156: Line 185:
 
==== Failures ====
 
Neutron Failure Rate dashboard - http://grafana.openstack.org/dashboard/db/neutron-failure-rate
* DVR multinode failure rates have lowered - very infrequent in gate queue, 15% in check queue
+
* The DVR/LinuxBridge/Multi-node failure rate was high yesterday for the non-voting jobs (April 4th)
** Typical problem: VM failing to get a DHCP address
+
** Need to investigate why to make sure nothing is lurking (might cover in L3 meeting as well)
 +
 
 
* Bugs in Nova:
 
** https://bugs.launchpad.net/nova/+bug/1524898 (Volume based live migration aborted unexpectedly)
Line 163: Line 193:
  
 
==== Jobs ====
 
* DVR-multinode job has been pretty stable, let's make it voting
+
* DVR+HA multinode job is now run for all patches, currently non-voting
** https://review.openstack.org/410973
+
** https://review.openstack.org/#/c/455406/
 +
** Will monitor and make sure it's stable
  
 
=== Stable backports (haleyb) ===
 
Line 176: Line 207:
 
This is a list of bugs with a fix that has been committed to the master branch, and are tagged with 'neutron-proactive-backport-potential+l3-dvr-backlog'
 
* https://googl/sx0KL5
 
==== L3 scheduler backports (jschwarz) ====
 
Potential candidates for backport to Newton branch for the L3 scheduler:
 
* https://review.openstack.org/#/c/317949/
 
* https://review.openstack.org/#/c/417089/
 
* https://review.openstack.org/#/c/417854/
 
* https://review.openstack.org/#/c/357966/
 
* https://review.openstack.org/#/c/418777/
 
  
 
=== Open Discussion ===
 
* Allow all migration of routers - does this impact VPNaaS?
+
* ???
** https://review.openstack.org/#/c/376550/
 
  
 
== Meeting commands ==
 

Latest revision as of 14:55, 29 June 2017

The OpenStack Networking L3 DVR Sub-team holds public meetings as advertised on the OpenStack IRC Meetings Calendar. If you are unable to attend, please check the most recent logs.

Meetings

  • Weekly on Thursday combined with L3 Meeting at 1500 UTC
  • IRC channel: #openstack-meeting-3 on freenode
  • Chair: Brian Haley (haleyb), Swaminathan Vasudevan (Swami)
  • Meetings, with their notes and logs, will be found under http://eavesdrop.openstack.org/meetings/neutron_dvr

Agenda

Meeting June 29th, 2017

Announcements (haleyb)

  • When adding items below, I'd like feedback on whether each is a MUST FIX, SHOULD FIX, or GOOD TO HAVE, and whether it NEEDS TRIAGE or is already TRIAGED

Topics for Discussion

Bugs (Swami)

All DVR bugs should be tagged and listed here: https://bugs.launchpad.net/neutron/+bugs?field.tag=l3-dvr-backlog
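
For anyone scripting against this list, the tag query above can be rebuilt programmatically. The following is a minimal sketch, assuming only the URL shape visible in the link above; the helper name tag_search_url and the LAUNCHPAD_BUGS template are hypothetical and not part of any Launchpad client library.

```python
from urllib.parse import urlencode

# URL template inferred from the link in this page (an assumption, not an official API).
LAUNCHPAD_BUGS = "https://bugs.launchpad.net/{project}/+bugs"


def tag_search_url(project: str, tag: str) -> str:
    """Return the Launchpad bug-list URL for one project filtered by a single tag."""
    query = urlencode({"field.tag": tag})
    return LAUNCHPAD_BUGS.format(project=project) + "?" + query


if __name__ == "__main__":
    # Rebuilds the DVR backlog link used by the sub-team.
    print(tag_search_url("neutron", "l3-dvr-backlog"))
```

Swapping the tag argument gives the equivalent list for any other single tag; combining several tags is not covered by this sketch.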

Bugs That Need to be Closed

  • None

New Bugs this week

High Priority Bugs in progress

Categorized Bugs

RFE

Existing Functionality Broken Bugs

Scale and Performance Impact Bugs

New Features Bugs

Refactor or Cleanup Bugs

WishList Bugs

WatchList Bugs

Bugs Closed Recently

  • https://bugs.launchpad.net/neutron/+bug/1612192 (MUST FIX) - L3 DVR: Unable to complete operation on subnet (Related to HA)
    • Not sure this is critical any more, as it's mostly seen in the SFC and OVN jobs, ~0 in the dvr-multinode-full job

Performance/Scalability

Gate (haleyb)

Failures

Neutron Failure Rate dashboard - http://grafana.openstack.org/dashboard/db/neutron-failure-rate

  • The DVR/LinuxBridge/Multi-node failure rate was high yesterday for the non-voting jobs (April 4th)
    • Need to investigate why to make sure nothing is lurking (might cover in L3 meeting as well)

Jobs

Stable backports (haleyb)

General information

Ihar created a page tracking all the potential backports from Mitaka to the stable releases. I have been going through it with the help of Swami to get stable/liberty, well, more stable. Bugs are removed from the list as their fixes merge.

We need to keep proactively backporting fixes to the stable branches.

This is a list of bugs whose fix has been committed to the master branch and that are tagged with 'neutron-proactive-backport-potential+l3-dvr-backlog'

Open Discussion

  •  ???

Meeting commands

/join #openstack-meeting-3
#startmeeting neutron_dvr
#chair Swami
#topic Announcements
#undo topic
#link https://wiki.openstack.org/wiki/Meetings/Neutron-DVR
#action haleyb will get something specific done this week
...
#endmeeting