Seminar - Modern Query Engines


Please contact us at i3mqe@db.cit.tum.de.

Sessions take place on Thursdays (12:30 - 14:00) in room 02.09.014.

Content

In this seminar, we will study techniques for building modern query engines.

Prerequisites

This seminar is geared towards the top students in database systems.
  • Fundamentals of Databases (Grundlagen Datenbanken, GDB) or similar course.
  • Additional database implementation courses are highly recommended.

Kickoff

  • Meeting on Thursday, 13.02.2025 from 12:30 (sharp!) online at this BBB room

Slides

Date Topic
13.02.2025 Kickoff Meeting
24.04.2025 Organization Details
24.04.2025 Intro: Query Engines
08.05.2025 How to write a paper

Schedule

Date Session Notes/Deadline Topics
Thu 08.05.2025 Intro Session #2 How to write a paper
Thu 15.05.2025 Intro Session #3 How to give presentations
Thu 22.05.2025 Session 01 Read papers for Fundamentals of Query Execution
  1. Tidy Tuples and Flying Start: Fast Compilation and Fast Execution of Relational Queries in Umbra
  2. Excalibur: A Virtual Machine for Adaptive Fine-grained JIT-Compiled Query Execution based on VOILA
Thu 29.05.2025 Holiday
Thu 05.06.2025 Session 02
  1. Incremental Fusion: Unifying Compiled and Vectorized Query Execution
  2. Building Advanced SQL Analytics From Low-Level Plan Operators
Thu 12.06.2025 Session 03 Read papers for Deep Dive: Implementing Operators, Submit report draft
  1. The Complete Story of Joins (in HyPer)
  2. Simple, Efficient and Robust Hash Tables for Join Processing
Thu 19.06.2025 Holiday
Thu 26.06.2025 No session
Thu 03.07.2025 Session 04 Submit peer reviews
  1. Efficient Processing of Window Functions in Analytical SQL Queries
  2. A scalable and generic approach to range joins
Thu 10.07.2025 Session 05
  1. These Rows Are Made for Sorting and That’s Just What We’ll Do
  2. A practical approach to groupjoin and nested aggregates
Thu 17.07.2025 Session 06 Read papers for Examples of Modern Query Engines
  1. Photon: A fast query engine for lakehouse systems
  2. Query processing on tensor computation runtimes
Thu 24.07.2025 Session 07 Submit final report; alternative date in case of session/presentation cancellations

Papers

All: Read by every student before the topic block starts

Topics: Will be assigned to students

Fundamentals of Query Execution

All
MonetDB/X100: Hyper-Pipelining Query Execution.
Efficiently compiling efficient query plans for modern hardware
Everything you always wanted to know about compiled and vectorized queries but were afraid to ask

Topics
Relaxed operator fusion for in-memory databases: Making compilation, vectorization, and prefetching work together at last
Incremental Fusion: Unifying Compiled and Vectorized Query Execution
Tidy Tuples and Flying Start: fast compilation and fast execution of relational queries in Umbra
Building advanced SQL analytics from low-level plan operators
Excalibur: A Virtual Machine for Adaptive Fine-grained JIT-Compiled Query Execution based on VOILA

Deep Dive: Implementing Operators

All
Morsel-driven parallelism: a NUMA-aware query evaluation framework for the many-core age
Micro adaptivity in Vectorwise

Topics
Efficient processing of window functions in analytical SQL queries
These Rows Are Made for Sorting and That's Just What We'll Do
A practical approach to groupjoin and nested aggregates
Simple, Efficient, and Robust Hash Tables for Join Processing
A scalable and generic approach to range joins
High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance
Robust External Hash Aggregation in the Solid State Age
The Complete Story of Joins (in HyPer)

Examples of Modern Query Engines

All
DuckDB: an embeddable analytical database
The Composable Data Management System Manifesto

Topics
Photon: A fast query engine for lakehouse systems
Query processing on tensor computation runtimes
Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine
Designing an open framework for query optimization and compilation
DB2 with BLU Acceleration: So Much More than Just a Column Store