Hybrid.Poly: A Consolidated Interactive Analytical Polystore System

Maksim Podkorytov, Michael Gubanov

April 2019

Abstract:

Anecdotal evidence suggests that the Variety of Big data is one of the most challenging problems in Computer Science research today. First, Big data arrives from a myriad of data sources, so its shape and flavor differ. Second, hundreds of different Big data management systems support different APIs and storage/indexing schemes, and each exposes data to users through the lens of its own data model. Together, these differences pose a significant impediment for Big data users who just want an easy-to-use interface to all relevant data, regardless of its shape, format, size, or the backend system used to store it. Naturally, they also complicate the development of analytical algorithms on top of large-scale, heterogeneous datasets. Here we describe HYBRID.POLY, a consolidated in-memory polystore engine designed to support heterogeneous large-scale data and interactively process complex analytical workloads. We execute and evaluate several popular analytical workloads at scale, including Data Fusion, Machine Learning, and Music search.

Full text:

Please refer to IEEE.
