Data Compression With and Without Deep Probabilistic Models
(Summer Term 2023)

Course by Prof. Robert Bamler at the University of Tübingen.

Schedule & Course Materials

Download links for lecture notes, problem sets, solutions, and video summaries for each session will be added as the course progresses.

Overview and Symbol Codes

What is the connection between data compression, probabilistic models, and error correction? We answer this question with some concrete examples of so-called symbol codes.
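To give a flavor of what is ahead (a generic toy example, not necessarily the one used in the lecture): a prefix-free symbol code over the alphabet {a, b, c} might assign
\[
a \mapsto 0, \qquad b \mapsto 10, \qquad c \mapsto 11,
\]
so the string "abc" becomes the bit sequence 01011, and it decodes unambiguously because no codeword is a prefix of another.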
Session 1 · Lecture: 19 April · Tutorial: 19 & 26 April

Theoretical Bounds of Lossless Compression (feat. Entropy)

We prove the Source Coding Theorem. This cornerstone of information theory quantifies information content, and it states a fundamental lower bound for the bit rate of lossless compression.
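For reference, one common textbook statement of this bound for symbol codes (the exact formulation in the lecture may differ): for a source with symbol probabilities p(x), every uniquely decodable code has an expected codeword length of at least the entropy, and some code comes within one bit of it,
\[
H(X) \;\le\; \min_{\text{codes}} \mathbb{E}[\ell(X)] \;<\; H(X) + 1,
\qquad
H(X) = -\sum_x p(x) \log_2 p(x) .
\]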
Session 2 · Lecture: 26 April · Tutorial: 3 May

Proof of Optimality of Huffman Coding

We prove that the famous Huffman Coding algorithm constructs an optimal symbol code. Then we introduce an information-theoretic measure for model mismatch, the “KL divergence”.
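As a preview of the definition (in one standard notation; the lecture may write it differently): for two distributions p and q over the same alphabet,
\[
D_{\mathrm{KL}}(p \,\|\, q) \;=\; \sum_x p(x) \log_2 \frac{p(x)}{q(x)} \;\ge\; 0 ,
\]
which can be read as the extra bits per symbol paid when data distributed as p is compressed with a code that is optimal for q.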
Session 3 · Lecture: 3 May · Tutorial: 10 May

Mutual Information and Taxonomy of Probabilistic Models

How can we build powerful probabilistic models without sacrificing efficiency? We'll discuss various designs after introducing important concepts from probability and information theory.
Session 4 · Lecture: 10 May · Tutorial: 17 May

Stream Codes I: Arithmetic Coding and Range Coding

We've proved in Lecture 3 that Huffman Codes are optimal symbol codes. But it turns out that we can do better than symbol codes—by thinking in fractional bits.
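A quick back-of-the-envelope illustration (generic numbers, not necessarily the lecture's example): a symbol that occurs with probability 0.9 carries only
\[
-\log_2 0.9 \;\approx\; 0.152 \text{ bits}
\]
of information, yet any symbol code has to spend at least one whole bit on it. Stream codes amortize the encoding over many symbols and can approach this fractional cost.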
Session 5 · Lecture: 17 May · Tutorial: 24 May

Stream Codes II: Asymmetric Numeral Systems (ANS)

This recently invented stream code is as performant as range coding while being much easier to implement. But it has a caveat—or is it a feature?
(Yes, that's a clickbait teaser; sue me.)
Session 6 · Lecture: 24 May · Tutorial: 7 June

Bits-Back Coding

We generalize the so-called “bits-back trick” from the ANS algorithm to arbitrary latent variable models. This allows us to use latent variables for data compression without paying for them. Think of it as “short selling” bits.
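In anticipation of the key accounting identity (standard form; the lecture's notation may differ): for a latent variable model p(x, z) and an approximate posterior q(z | x), bits-back coding transmits x at a net rate of
\[
\mathbb{E}_{q(z\mid x)}\bigl[-\log_2 p(x, z)\bigr] \;-\; \mathbb{E}_{q(z\mid x)}\bigl[-\log_2 q(z\mid x)\bigr] ,
\]
i.e., the bits spent on encoding (x, z) minus the bits “gotten back” via the latent variable.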
Session 7 · Lecture: 7 June · Tutorial: 14 June

Variational Inference

This method for approximate Bayesian inference is a mainstay of modern probabilistic machine learning. And—curiously—its most natural derivation actually builds on the bits-back coding algorithm.
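The central object here is the evidence lower bound, or ELBO (in one common notation; the lecture may write it differently):
\[
\log p(x) \;\ge\; \mathbb{E}_{q(z\mid x)}\bigl[\log p(x, z) - \log q(z\mid x)\bigr] ,
\]
where the gap between the two sides is exactly \(D_{\mathrm{KL}}\bigl(q(z\mid x) \,\|\, p(z\mid x)\bigr)\), so maximizing the right-hand side over q tightens the approximation to the true posterior.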
Session 8 · Lecture: 14 June · Tutorial: 21 June

Variational Autoencoders

We extend variational inference to learn, from training data, both the generative model and how to do fast inference in the learned model. This results in a popular class of models for neural image and video compression.
Session 9 · Lecture: 21 June · Tutorial: 28 June

Lossy Compression: From VAEs to Rate/Distortion Theory

We implement our first lossy compression method and observe a lower bit rate than with lossless compression. We then derive a theoretical bound for the bit rate of lossy compression.
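This bound is the rate/distortion function of classical information theory (standard definition; notation in the lecture may differ): given a distortion measure d and a tolerated expected distortion D,
\[
R(D) \;=\; \min_{p(\hat{x}\mid x)\,:\; \mathbb{E}[d(X, \hat{X})] \le D} I(X; \hat{X}) ,
\]
the smallest achievable bit rate per symbol, where I denotes mutual information.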
Session 10 · Lecture: 28 June · Tutorial: 5 July

Channel Coding Theorem and Source-Channel Separation

We take a step back from compression and consider the wider problem of efficient communication. Our discussion also reveals how meaningful the theoretical bound for lossy compression is.
Session 11 · Lecture: 5 July · Tutorial: 12 July

Recent Advances in Machine-Learning Based Data Compression

Diffusion models, flows, quantization methods, probabilistic models for videos, ... — modern compression research is a vibrant field.
Session 12 · Lecture: 12 July · Tutorial: 19 July

Guest Star: Lucas Theis (Google Research)

I am excited that Lucas Theis agreed to give a guest lecture. He is a pioneer of ML-based compression, and at the forefront of novel methods like diffusion models & universal quantization.
Session 13 · Lecture: 19 July · Talk: 19 July

Exam

Tentatively scheduled for the last week of the semester; I'm open to rescheduling based on students' preferences and scheduling constraints.
Session 14 · 26 July (tentatively)