Title: Towards Hybrid Storage Devices with Block and DAX Interface
Authors: Daniel Habicht, Yussuf Khalil, Lukas Werling, Frank Bellosa
Affiliation: Karlsruhe Institute of Technology

Abstract

For decades, operating systems used an asynchronous block interface for accessing persistent storage, as this model best fit the characteristics of traditional storage devices such as HDDs. In recent years, byte-addressable storage-class memory (SCM) solutions that exhibit performance close to DRAM, like Intel’s Optane Persistent Memory Modules, joined the storage stack, allowing for low-latency synchronous I/O. This made SCM especially interesting for performance-critical systems with high persistence requirements such as databases. SCM, however, is no direct replacement for storage technologies like Flash, as it typically has a significantly higher cost and lower density. The combination of high-density Flash with fast and byte-addressable SCM in one hybrid device might seem like a logical next step in the development of new storage devices. However, the currently available interconnect for attaching high-end storage, namely PCIe, does not allow caching of device-attached memory. This limitation is compounded by PCIe not being designed for low-latency load/store semantics. Consequently, the deployment of such devices, as well as the adoption of new storage abstractions that fit this hybrid approach, has failed to materialize. With the growing industry-wide adoption of the cache-coherent Compute Express Link (CXL), this situation is about to change, and the first commercial hybrid SSD offerings, like Samsung’s CXL Memory Module Hybrid (CMM-H), are on the horizon. Hardware details of upcoming commercial offerings, however, are not publicly available, and it remains unclear how operating systems can support such devices in a way that makes the best use of both storage paradigms. In this talk, we discuss implementation details of future hybrid SSDs that feature an SCM cache.
On this basis, we outline the development of our hybrid SSD support in the Linux kernel and demonstrate why Linux’s existing direct-access (DAX) abstractions are unsuitable for upcoming hybrid CXL SSDs. Further, we propose how Linux’s user-facing DAX interface can be extended to support hybrid SSDs and to integrate seamlessly with POSIX APIs. As users of the DAX interface may require guaranteed non-blocking synchronous access for maintaining low tail latencies, we propose a mechanism for pinning pages to SCM. For file contents that exhibit heavy I/O activity, i.e., contents that are frequently synced or accessed using read/write, we propose to transparently bypass the volatile page cache and migrate them to otherwise unused parts of the on-device SCM cache. Since all modifications in SCM are guaranteed to be persisted, this eliminates the overhead of synchronous write-back for applications that make heavy use of fsync on file contents cached in SCM. Building further upon the idea of transparently bypassing the volatile page cache, we introduce the concept of Transparent DAX Mappings (TDM). TDMs are memory mappings dynamically established by the kernel that make file contents cached in SCM directly accessible to user space. We discuss the use of TDMs for implementing a zero-copy read/write kernel bypass aiming to increase the performance and energy efficiency of unmodified applications.
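
As a rough illustration of the two access paths contrasted above, the following Python sketch places a conventional write-then-fsync loop next to an in-place update through a shared memory mapping. It runs on any regular file system and does not use DAX or SCM. The function names, record size, and file name are assumptions made for illustration and are not part of the proposed kernel interface.

```python
import mmap
import os
import tempfile

RECORD = b"x" * 64  # hypothetical fixed-size log record


def append_with_fsync(path, n_records):
    """Conventional path: write() into the page cache, then fsync() per record.

    Every fsync() blocks until the dirty data has been written back to the
    device; this is the synchronous write-back cost the abstract refers to.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(n_records):
            os.write(fd, RECORD)
            os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)


def update_via_mapping(path, offset, payload):
    """Mapped path: modify file contents in place with plain memory stores.

    On a regular file system the stores land in the page cache, so flush()
    (msync) is still needed for durability. This is an ordinary shared
    mapping, not a DAX mapping and not a TDM.
    """
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:  # maps the whole (non-empty) file
            m[offset:offset + len(payload)] = payload
            m.flush()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log_path = os.path.join(d, "log.bin")  # hypothetical file name
        size = append_with_fsync(log_path, 8)
        update_via_mapping(log_path, 0, b"HEAD")
        print(size)
```

The first function is the pattern of an unmodified, fsync-heavy application; the second shows the load/store style of access that a mapping provides. How the kernel would establish such mappings transparently is the subject of the talk and is not modeled here.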