High-latency media / Tape support for Swift
What is it?
A design and exemplary implementation for integrating Swift, in a generic way, with high-latency storage backends (tape, optical disc, MAID, ...).
Overview presentation (Tokyo summit): http://www.slideshare.net/hseipp/adapting-swift-for-tape-storage-or-other-highlatency-media and session replay: https://youtu.be/6nWMRCKBs-o Updated overview presentation: http://www.slideshare.net/SlavisaSarafijanovic/swift-extensions-for-tape-storage-or-other-highlatency-media
Code Repository: https://github.com/ibm-research/SwiftHLM
What problem does it solve?
The problem being solved is making a high-latency storage backend work well and be practically usable when performing bulk data tiering operations within a Swift data ring. The function can be seen as orthogonal and complementary to ring-to-ring data tiering as described in the Swift data tiering specification, as further discussed in later sections of this document.
The problem with high-latency media is that it does not work well when serving many independent requests, which is exactly the workload that the Swift proxy layer sends to Swift storage nodes. A similar problem is faced and addressed in file systems, and the typical solution is to:
- use a low-capacity low-latency storage tier (e.g. disk) on top of a large-capacity and high-latency but typically cheap storage tier (e.g. tape or optical disk), and
- provide a function for explicit bulk operations for moving data between the tiers. Bulk operation allows for optimizing the order of requests and the use of resources (e.g. tape mounting).
The second aspect, i.e. the bulk function, has proven to be crucial for a practically usable system. It is worth mentioning that Amazon Glacier also provides a bulk operation interface for pre-fetching objects before they can be accessed.
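To illustrate why handing the backend a whole batch matters, the following minimal sketch compares the number of tape mounts when requests are served in arrival order with the number after a simple bulk plan. The `RecallRequest` fields (`tape_id`, `position`), the single-drive model, and the planning strategy are assumptions made for illustration; they are not taken from any particular backend.

```python
from collections import defaultdict
from typing import NamedTuple


class RecallRequest(NamedTuple):
    object_name: str
    tape_id: str   # hypothetical: cartridge that holds the object
    position: int  # hypothetical: block offset on that cartridge


def count_mounts(requests):
    """Count tape mounts when one drive serves requests in the given order."""
    mounts = 0
    mounted = None
    for req in requests:
        if req.tape_id != mounted:
            mounts += 1
            mounted = req.tape_id
    return mounts


def plan_bulk_recall(requests):
    """Group requests by tape and sort each group by on-tape position, so that
    each tape is mounted once and read in a single forward pass."""
    by_tape = defaultdict(list)
    for req in requests:
        by_tape[req.tape_id].append(req)
    plan = []
    for tape_id in sorted(by_tape):
        plan.extend(sorted(by_tape[tape_id], key=lambda r: r.position))
    return plan


requests = [
    RecallRequest("obj-a", "TAPE01", 900),
    RecallRequest("obj-b", "TAPE02", 100),
    RecallRequest("obj-c", "TAPE01", 200),
    RecallRequest("obj-d", "TAPE02", 700),
    RecallRequest("obj-e", "TAPE01", 500),
]
print(count_mounts(requests))                    # 5 mounts in arrival order
print(count_mounts(plan_bulk_recall(requests)))  # 2 mounts after bulk planning
```

A real backend would also account for multiple drives and for tapes that are already mounted; the point here is only that such planning is possible once the requests arrive as a batch.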
In this document, the term Swift Local Data Tiering (SLDT or LDT) refers to tiering within a Swift data ring, and Swift Data Tiering (SDT) refers to ring-to-ring data tiering. More appropriate terminology might be introduced if good suggestions are received.
How could the problem be solved?
A function for explicit bulk operations for moving data between tiers can and should be implemented in Swift in a generic way that could support different LDT-capable backends based on tape, optical disc or other high-latency media. Examples of existing backends that can be integrated for LDT are BDT's Tape Library Connector and IBM's Spectrum Archive. In summary, it is proposed to introduce a Swift interface extension for Swift Local Data Tiering, implemented within a Swift middleware. The same Swift middleware also implements a generic function and interface for controlling the local tiering operations within a data ring that is built on top of an LDT capable HLM backend. The concept is depicted in Figure 1.
Figure 1. SwiftHLM middleware provides a Swift interface extension for within-a-ring data tiering support and allows plugging in LDT capable HLM backends. This makes it possible to control data tiering within a data ring built on top of the backend through the extended Swift interface. The data path to store objects is unchanged, apart from passing through the middleware. One of two control paths can be configured: 1) LDT operations are controlled by writing/reading file system EAs (extended attributes); 2) LDT middleware controls the backend through a standardized interface, e.g. a command line based one.
The basic set of supported LDT operations is:
- MIGRATE migrate object data (or a complete container) from the backend's low latency storage (e.g. disk) to high latency storage (e.g. tape or optical disk)
- RECALL prefetch object data (or a container) from backend's high latency storage to low latency storage
- STATUS get object status
The Swift API extension for LDT operations could be realized by adding new request types or by adding modifiers to existing URL based requests. E.g. to migrate an object or a container, the "?MIGRATE" modifier could be added to the URL of the POST request.
The same can be done for recall, e.g. by adding the "?RECALL" modifier to the URL of the POST request, and for status, e.g. by adding the "?STATUS" modifier to the URL of the GET request.
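The modifier-based variant can be sketched as a small WSGI middleware that recognizes the three modifiers and lets every other request pass through untouched. This is a minimal sketch, not the SwiftHLM code: the names `LDT_OPERATIONS`, `parse_ldt_request`, `LdtMiddleware` and the `submit` hook are hypothetical, and the `/v1/<account>/<container>[/<object>]` path layout is assumed from the standard Swift API.

```python
# Hypothetical mapping of LDT operation -> HTTP method, following the text above.
LDT_OPERATIONS = {"MIGRATE": "POST", "RECALL": "POST", "STATUS": "GET"}


def parse_ldt_request(method, path, query_string):
    """Return (operation, account, container, object) for an LDT request,
    or None when the request is an ordinary Swift request."""
    operation = query_string.split("&", 1)[0].upper()
    if LDT_OPERATIONS.get(operation) != method.upper():
        return None
    # Assumed path layout: /v1/<account>/<container>[/<object>]
    parts = path.lstrip("/").split("/", 3)
    if len(parts) < 3 or not parts[2]:
        return None
    account, container = parts[1], parts[2]
    obj = parts[3] if len(parts) == 4 and parts[3] else None
    return operation, account, container, obj


class LdtMiddleware:
    """WSGI middleware that intercepts LDT requests and forwards the rest."""

    def __init__(self, app, submit):
        self.app = app
        # Hypothetical hook: hands the request to the backend and returns
        # a text reply for the client.
        self.submit = submit

    def __call__(self, environ, start_response):
        parsed = parse_ldt_request(
            environ.get("REQUEST_METHOD", ""),
            environ.get("PATH_INFO", ""),
            environ.get("QUERY_STRING", ""),
        )
        if parsed is None:
            # Ordinary request: the Swift data path stays unchanged.
            return self.app(environ, start_response)
        body = self.submit(*parsed).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]
```

With this sketch, a POST to `/v1/AUTH_test/cont/obj?MIGRATE` would reach `submit("MIGRATE", "AUTH_test", "cont", "obj")`, while a plain GET of the same path would be served by the wrapped application (the account and container names are placeholders).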
Extending the Swift interface to support such explicit bulk LDT operations within a data ring should result in the same benefits as those seen in file systems that integrate high latency media. Further extensions such as setting LDT policies, e.g. to migrate data to a cheap high latency tier based on usage, age, size, etc., could also be of interest.
Regarding the LDT operations control path, since object EAs are normally stored by OpenStack Swift as part of the backend file that stores object data, this is one way of communicating LDT requests to the backend. Integrating other backends might be easier if the Swift middleware can be configured to communicate bulk LDT requests directly to the backend, e.g. by invoking a backend control program (CP) that is provided as part of the backend software package but uses a standardized interface. Supporting both options and making them configurable is relatively simple but could allow more parties to integrate their storage solutions that support high latency media more easily, with BDT's Tape Library Connector (open source solution) and IBM's Spectrum Archive (proprietary enterprise grade solution) being examples.
For the second configuration option, only the interface between the SwiftLDT middleware and the backend LDT control program will be defined; the internals of the backend and its CP (if used) may be proprietary. As an example, the CP may be accompanied by a server running on proxy nodes and controlling LDT operations on storage nodes.
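A configurable control path could look roughly like the sketch below. Everything named here is an assumption for illustration: the attribute key `user.swift.ldt.request`, the control program path `/opt/backend/bin/ldt-cp`, its argument convention, and the `control_path` values `"ea"` and `"cp"` are hypothetical, since the interface is not yet defined. The `set_ea` and `run` parameters exist only so the function can be exercised without a real backend.

```python
import os
import subprocess

# Hypothetical names; the actual interface is still to be defined.
EA_REQUEST_KEY = "user.swift.ldt.request"
BACKEND_CP = "/opt/backend/bin/ldt-cp"


def submit_ldt_request(operation, file_paths, control_path,
                       set_ea=None, run=None):
    """Pass a bulk LDT request to the backend over the configured control path.

    control_path "ea": write the request into an extended attribute of each
                       backend file; the backend picks it up from there.
    control_path "cp": invoke the backend control program once for the batch.
    """
    if control_path == "ea":
        set_ea = set_ea or os.setxattr  # os.setxattr is available on Linux only
        for path in file_paths:
            set_ea(path, EA_REQUEST_KEY, operation.encode("ascii"))
    elif control_path == "cp":
        run = run or subprocess.run
        # A single invocation carries the whole batch, so the backend is free
        # to reorder the work.
        run([BACKEND_CP, operation.lower(), *file_paths], check=True)
    else:
        raise ValueError("unknown control path: %r" % (control_path,))
```

In both cases the middleware only states what should happen to which files; how the data is actually moved remains internal to the backend.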
To better understand the potential impact of not integrating bulk operations into Swift, let us briefly look at the tape use case, as tape is one of the most prominent media for cheap long-term storage. Independent file accesses such as those originating from Swift data access will result in lengthy tape seeks averaging tens of seconds, and the actual response time may be much longer (on the order of minutes) due to queueing and sequential serving of requests. Even for objects as large as 5 GB, tape drive utilization will be very low because the drive is likely to spend more time seeking than serving data, which significantly increases the number and cost of drives needed to achieve a specified effective data bandwidth. The latency is further increased by tape mounts and unmounts, of which there will be many because of the independent requests originating from Swift. Without integrating and using the backend's capability for bulk operations, the system would be hardly usable. Tuning Swift timeouts for object access without prefetch also does not seem to be an option, see VancouverDiscussion.
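The utilization argument can be made concrete with a back-of-the-envelope calculation. The figures used below (a drive streaming at 300 MB/s, an average seek of 50 s, a target effective bandwidth of 1 GB/s) are assumptions chosen only for illustration and are not measurements from the source; mount and unmount times are ignored, which makes the estimate optimistic.

```python
import math


def drive_utilization(object_bytes, drive_bytes_per_s, seek_s):
    """Fraction of time the drive spends transferring data when every
    object access is preceded by one seek."""
    transfer_s = object_bytes / drive_bytes_per_s
    return transfer_s / (transfer_s + seek_s)


def drives_needed(target_bytes_per_s, drive_bytes_per_s, utilization):
    """Number of drives required to sustain the target effective bandwidth."""
    return math.ceil(target_bytes_per_s / (drive_bytes_per_s * utilization))


# Assumed figures, for illustration only.
OBJECT_BYTES = 5e9    # 5 GB object
DRIVE_RATE = 300e6    # 300 MB/s streaming rate
SEEK_S = 50.0         # average seek of tens of seconds
TARGET_RATE = 1e9     # 1 GB/s effective bandwidth

util = drive_utilization(OBJECT_BYTES, DRIVE_RATE, SEEK_S)
print(f"utilization: {util:.0%}")                     # utilization: 25%
print(drives_needed(TARGET_RATE, DRIVE_RATE, util))   # 14 drives with seeks
print(drives_needed(TARGET_RATE, DRIVE_RATE, 1.0))    # 4 drives when streaming
```

Under these assumptions the drive spends about three quarters of its time seeking even for 5 GB objects, and smaller objects make the ratio worse.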
We see two high-level options to integrate bulk operations into Swift.
Option 1: LDT / Local Data Tiering
One option is using the LDT explained above. In that way a low latency tier can be used standalone, or it can be used as a target for ring-to-ring Swift data tiering (SDT) as specified in RingToRingTiering. Figure 2 shows the co-existence and complementary use of Swift data tiering (between data rings) and Swift local data tiering (inside a data ring).
Figure 2. Swift ring-to-ring data tiering (SDT) and Swift archiving/local data tiering (LDT) functions can be implemented and used as orthogonal and complementary. SDT moves data from one data ring to another, and that might not work well when accessing or tiering back data currently stored on high latency media of the second data ring. LDT allows moving data between low latency storage and high latency storage within the same data ring. Once LDT has moved the data to low latency media, the object can be accessed from the data ring it was originally stored in.
One advantage of this option is its modular design. An LDT based data ring can be used standalone or as a target for ring-to-ring tiering. The function can be invoked by the ring-to-ring tiering function or externally by the user.
Option 2: SDT / Ring-to-Ring Tiering
The second option for integrating bulk operations for data tiering within a data ring is to further enhance Swift data tiering as proposed in RingToRingTiering, so that it can control the backend and its bulk operations. In that case only the interface for ring-to-ring tiering would be externally available, and the high latency media data ring could not be used standalone. Whether that is a drawback or an advantage may depend on the use case. Also, if the first option is implemented, its function can be transparently changed to ring-to-ring based tiering if the user only wants to manage data in that way.
It is also important to understand that the bulk LDT function cannot be added transparently, e.g. by providing just a different DiskFile implementation for a high latency media backend, because a bulk LDT request needs to be communicated in some way to the backend.
Swift middleware backend integration
Figures 3 and 4 show the exemplary backend integration, configured to use the EA or the CP control path, respectively.
Figure 3. Exemplary integration of Swift and an LDT capable backend, configured to use dedicated Swift extended attributes for controlling LDT. SwiftHLM is the new middleware component added to Swift that extends Swift's interface for LDT, and the rest of Swift is used unmodified. The Swift data path remains unmodified, the target being the low latency storage. In this configuration the LDT control path is implemented by using dedicated EAs to store LDT requests and the object's LDT state. The backend LDT Data Mover Component (B-LDT-DMC) looks at the LDT dedicated attributes (scan or write intercept) and moves data between low latency and high latency storage, within the storage node or within the data ring.
Figure 4. Exemplary integration of Swift and an LDT capable backend, configured to use the Backend LDT Control Program (B-LDT-CP) for controlling LDT operations. SwiftHLM is the new middleware component added to Swift that extends Swift's interface for LDT, and the rest of Swift is used unmodified. The Swift data path remains unmodified, the target being the low latency storage. In this configuration the LDT control path is implemented by having SwiftLDT invoke the Backend LDT Control Program (B-LDT-CP), which then invokes the Backend LDT Data Mover Component (B-LDT-DMC) to move data between low latency and high latency storage, within the storage node or within the data ring.
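The scan variant of the EA control path in Figure 3 might work along the lines of the sketch below. It is a simulation only: a plain dictionary stands in for file system extended attributes, and the attribute keys, the state names `migrated` and `resident`, and the `move` callback are hypothetical placeholders for whatever a real B-LDT-DMC would use.

```python
# Hypothetical attribute keys; a dict of dicts stands in for file system EAs.
REQUEST_KEY = "user.ldt.request"
STATE_KEY = "user.ldt.state"


def scan_and_move(xattrs, move):
    """One scan pass of a simulated data mover.

    xattrs: dict mapping file path -> dict of that file's attributes.
    move:   callback move(operation, paths) that performs the bulk transfer.
    Returns the requests that were found, grouped by operation.
    """
    pending = {"MIGRATE": [], "RECALL": []}
    for path, attrs in xattrs.items():
        operation = attrs.get(REQUEST_KEY)
        if operation in pending:
            pending[operation].append(path)

    for operation, paths in pending.items():
        if not paths:
            continue
        paths.sort()
        move(operation, paths)  # one bulk call per operation, not per file
        new_state = "migrated" if operation == "MIGRATE" else "resident"
        for path in paths:
            xattrs[path][STATE_KEY] = new_state   # record the object's LDT state
            del xattrs[path][REQUEST_KEY]         # request has been served
    return pending
```

A STATUS request could then be answered by reading the state attribute, without touching the high latency storage.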