
Swift/Fixing-rebalance-and-golang

symptoms:

  1. rebalance is slow, especially for dense servers
  2. uncertain latency for end-user requests
  3. hard to monitor, and requires a lot of intervention to get out of bad situations (e.g. cluster full)

problems:

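  • swift is not in the data path when rsync moves the data
  • too much walking of the disk
  • poor job scheduling and poor discovery of the work to be done
  • eventlet hub can't touch disk without blocking (a disk call stalls every other greenthread in the process)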
    • mitigation: use lots of processes -- "easy" in python but hard to coordinate work
    • solution: use nonblocking io -- "hard rewrite" but efficiently solves the problem

things in-progress to fix these problems:

  • tsync protocol for data moving
    • puts swift in the data path (more efficient than rsync for the actual transport and for writing to disk)
    • use an external, supported data transport and wire protocol instead of something we invent (http2+grpc vs repconn or ssync)
    • see also https://etherpad.openstack.org/p/swift-rebalance
  • better scheduling of work in reconstructor and replicator
    • threads not eventlet
    • more concurrency == faster (up to HW limits)
    • identifying the work to be done (rebuilds vs rebalance; includes backpressure from tsync)
  • fix proxy<->storage protocol (can't depend on bespoke features in our current framework)
  • a golang object server that more efficiently takes network data and writes it to disk
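
The scheduling ideas above (threads rather than eventlet, concurrency as a tunable, rebuilds distinguished from rebalance, and backpressure) can be sketched in Go as a small worker pool. This is a sketch under assumptions, not the planned design: `Job`, `JobKind`, `Prioritize`, and `Run` are hypothetical names, the rule "rebuilds before rebalance" is one plausible policy, and the bounded queue only stands in for whatever backpressure tsync would actually provide.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// JobKind separates urgent rebuilds from routine rebalance moves.
type JobKind int

const (
	Rebuild   JobKind = iota // data is missing or degraded
	Rebalance                // data is healthy but on the wrong device
)

// Job is a hypothetical unit of consistency-engine work.
type Job struct {
	Kind      JobKind
	Partition int
}

// Prioritize returns a copy of jobs with rebuilds ahead of rebalance moves,
// keeping the original order within each kind.
func Prioritize(jobs []Job) []Job {
	out := make([]Job, len(jobs))
	copy(out, jobs)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Kind < out[j].Kind })
	return out
}

// Run feeds prioritized jobs to `workers` goroutines through a queue of
// `queueDepth` slots and returns how many jobs were handled. When the queue
// is full the producer blocks, which is the backpressure in this sketch.
func Run(jobs []Job, workers, queueDepth int, handle func(Job)) int {
	if workers < 1 {
		workers = 1
	}
	if queueDepth < 0 {
		queueDepth = 0
	}
	queue := make(chan Job, queueDepth)
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done int
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range queue {
				handle(job)
				mu.Lock()
				done++
				mu.Unlock()
			}
		}()
	}
	for _, job := range Prioritize(jobs) {
		queue <- job // blocks while all workers are busy and the queue is full
	}
	close(queue)
	wg.Wait()
	return done
}

func main() {
	jobs := []Job{{Rebalance, 1}, {Rebuild, 2}, {Rebalance, 3}, {Rebuild, 4}}
	n := Run(jobs, 2, 1, func(j Job) {})
	fmt.Println("processed", n)
}
```

In this shape, `workers` and `queueDepth` are the kind of values that could be exposed as config, so throughput can be raised toward hardware limits without changing code.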

how do we get there (subject to change):

   0. hummingbird branch is an interesting R&D reference but not going to be merged (done)
   1. make replication/reconstruction tolerable to the point that we can make it fast by changing a config value (more workers, more connections, etc) (nearly done)
   2. build a better scheduler for consistency engine work
   2. build the tsync protocol
   now: build a feature-complete golang object server (might or might not borrow from hummingbird)
   now: infra/devstack CI work (i.e. swift consumable in the gate)
   now: ask other deployment projects what needs to be done to make them happy with swift as a golang thing (e.g. kolla, ansible, tripleo, etc)