Efficient ordered in-memory key-value (KV-)maps are paramount for the scalability of modern data platforms. In managed languages like Java, KV-maps face unique challenges due to the high overhead of garbage collection (GC).
We present Oak, a scalable concurrent KV-map for environments with managed memory. Oak offloads data from the managed heap, thereby reducing GC overheads and improving memory utilization. An important consideration in this context is the programming model, since a standard object-based API entails moving data between the on- and off-heap spaces. To avoid the cost of such movement, Oak introduces a novel zero-copy (ZC) API alongside the traditional one (e.g., Java’s ConcurrentNavigableMap). Oak allows concurrency among all map operations, offering atomic get, put, and various conditional put operations such as compute (in-situ update) and put-if-absent.
We have released an open-source Java implementation of Oak. We further present a prototype Oak-based implementation of the internal multidimensional index in Apache Druid – a popular open-source in-memory real-time analytics system. Our experiments show that in many cases Oak is 2x faster than Java’s state-of-the-art concurrent skiplist.
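The difference between an object-based API and a zero-copy one can be illustrated with a minimal Java sketch. It is not Oak's API or implementation: the class `OffHeapSketch` and all of its method names are hypothetical, values are fixed-size longs held in direct (off-heap) `ByteBuffer`s, and an ordinary `ConcurrentSkipListMap` stands in for the index. A per-value lock is assumed for the in-situ update, and readers of a zero-copy view are not synchronized with writers here.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.LongUnaryOperator;

// Hypothetical illustration only; this is NOT Oak's API.
public class OffHeapSketch {
    // Keys stay on-heap in a standard skiplist; each value lives in a direct buffer.
    private final ConcurrentNavigableMap<String, ByteBuffer> index =
            new ConcurrentSkipListMap<>();

    // Serializes the value into a freshly allocated off-heap buffer.
    private static ByteBuffer toOffHeap(long value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(Long.BYTES);
        buf.putLong(0, value);
        return buf;
    }

    public void put(String key, long value) {
        index.put(key, toOffHeap(value));
    }

    // Inserts only if the key is absent; returns true if the insert happened.
    public boolean putIfAbsent(String key, long value) {
        return index.putIfAbsent(key, toOffHeap(value)) == null;
    }

    // Object-style get: copies the value back onto the managed heap (boxed Long).
    public Long get(String key) {
        ByteBuffer buf = index.get(key);
        if (buf == null) return null;
        synchronized (buf) {
            return buf.getLong(0);
        }
    }

    // Zero-copy-style get: returns a read-only view over the same off-heap bytes.
    public ByteBuffer getView(String key) {
        ByteBuffer buf = index.get(key);
        return buf == null ? null : buf.asReadOnlyBuffer();
    }

    // In-situ update: rewrites the off-heap bytes in place under a per-value lock.
    public boolean computeIfPresent(String key, LongUnaryOperator f) {
        ByteBuffer buf = index.get(key);
        if (buf == null) return false;
        synchronized (buf) {
            buf.putLong(0, f.applyAsLong(buf.getLong(0)));
        }
        return true;
    }
}
```

In this sketch, `get` materializes a new heap object on every call, whereas `getView` hands back a view that shares the stored bytes, so a later `computeIfPresent` is visible through a view obtained earlier.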
Mon 24 Feb (times are displayed in time zone: Tijuana, Baja California)
09:35 - 10:25
Oak: A Scalable Off-Heap Allocated Key-Value Map
Hagar Meir (IBM Haifa Research Lab), Edward Bortnikov (Yahoo Research), Anastasia Braginsky (Yahoo Research), Dmitry Basin (Yahoo Research), Yonatan Gottesman (Yahoo Research), Eshcar Hillel (Yahoo Research, Oath), Idit Keidar (Technion - Israel Institute of Technology), Eran Meir (Yahoo Research), Gali Sheffi (Technion - Israel)