Project

HTAP benchmarking with an HTAP/IoT workload

How do commonly used time series systems handle an HTAP/IoT use case?
Student: Marcus Winding Quistgaard
Supervisor: Pınar Tözün
Level: MSc, Semester: Spring 2020
Tags: HTAP, IoT, benchmarking

The need for large-scale real-time analytics applications that execute both fast concurrent transactions and real-time analytical queries in the same system is rising, and both new and existing systems are being adapted to meet this demand. These are called Hybrid Transactional/Analytical Processing (HTAP) systems. While many systems claim to support an HTAP workload, their actual performance is rarely documented, and there is no standard set of capabilities they are expected to support. Benchmarking these systems with a representative workload is therefore required to determine which system would be the best choice before a business decision is made.

Seen from a query-type perspective, HTAP systems may be a good storage solution for an IoT system, since they can execute both traditional IoT queries and queries typical of HTAP systems. However, the heavy storage and ingest pressure of an IoT workload will stress systems intended for HTAP workloads in ways their developers may not have anticipated, so benchmarking is even more important when selecting a system whose workload lies in the intersection between HTAP and IoT.

This project will benchmark and analyze a selection of database systems to quantify their performance under a combined HTAP/IoT workload. The benchmarking suite chosen for this analysis already supports some of the required functionality, and will be extended further to support whatever else this quantification requires.

The benchmark that will be used for this purpose is an existing HTAP/IoT benchmark that was developed during the precursor ITU project. This benchmark will be extended with support for the chosen database systems, and with additional functionality that enables it to better model the scenarios of interest to this project.

Method:

A selection of three database systems has been chosen for this project, and these will be compared using the chosen benchmark. If time allows, more database systems may be added to the comparison.

The benchmark will initially be extended with support for all the chosen database systems. Afterward, its workload configuration options will be extended to better support the workloads deemed interesting for cross-system comparisons.
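To make the idea of configurable workloads concrete, the sketch below shows one way a benchmark driver might express an HTAP/IoT operation mix as ratios of ingest, analytical, and point-query operations. This is a hypothetical illustration: the class and parameter names (`WorkloadMix`, `ingest_fraction`, etc.) are invented for this sketch and are not taken from the actual benchmark used in the project.

```python
import random
from dataclasses import dataclass


@dataclass
class WorkloadMix:
    """Hypothetical workload-mix configuration for an HTAP/IoT benchmark driver.

    An IoT-leaning HTAP workload is typically ingest-heavy, with a smaller
    share of analytical scans and point lookups mixed in.
    """
    ingest_fraction: float       # share of operations that are sensor inserts
    analytical_fraction: float   # share that are analytical scans
    point_query_fraction: float  # share that are point lookups (typical IoT reads)

    def __post_init__(self) -> None:
        total = self.ingest_fraction + self.analytical_fraction + self.point_query_fraction
        if abs(total - 1.0) > 1e-9:
            raise ValueError("operation fractions must sum to 1.0")

    def next_operation(self, rng: random.Random) -> str:
        """Draw the next operation type according to the configured mix."""
        r = rng.random()
        if r < self.ingest_fraction:
            return "ingest"
        if r < self.ingest_fraction + self.analytical_fraction:
            return "analytical"
        return "point_query"


# An ingest-heavy IoT-style mix with a light analytical component.
mix = WorkloadMix(ingest_fraction=0.8, analytical_fraction=0.05,
                  point_query_fraction=0.15)
rng = random.Random(42)
ops = [mix.next_operation(rng) for _ in range(10_000)]
print(ops.count("ingest") / len(ops))  # roughly 0.8 over many draws
```

Expressing the mix as fractions like this makes it easy to sweep from a pure ingest workload toward an analytics-heavy one across benchmark runs, which is the kind of cross-system comparison the project aims at.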