Felix Sarnthein
I am a doctoral fellow within the Max Planck ETH Center for Learning Systems (CLS) advised by Antonio Orvieto and Thomas Hofmann. As such, I am a member of both the Data Analytics Lab at ETH Zürich and the Deep Models and Optimization Group at the newly established ELLIS Institute and the MPI-IS in Tübingen.
After obtaining my bachelor’s and master’s degrees in Computer Science from ETH Zürich, I pursued research internships with Thomas Hofmann in Zürich, Nicolas Flammarion at EPFL Lausanne, and Antonio Orvieto in Tübingen.
I am generally fascinated by the learning dynamics of neural networks and the interaction of parameterization, initialization, objective, and optimization in deep learning. In particular, I’m interested in potentially self-supervised methods for long-range modeling and hierarchical feature learning of sequential data. To that end, I’m currently investigating fundamental aspects of linear recurrent neural networks.
News
| Jan 18, 2026 | After a short pitstop in Tübingen, I will attend the Machine Learning Summer School in Melbourne next month. |
|---|---|
| Oct 24, 2025 | I will present our spotlight paper Fixed-Point RNNs: Interpolating from Diagonal to Dense at NeurIPS 2025 in San Diego. Feel free to reach out if you want to chat! |
| Jul 01, 2025 | I started my first 6-month exchange in the Data Analytics Lab at ETH Zürich. |
| Apr 24, 2025 | I will present my blogpost Linear Recurrences Accessible to Everyone at the ICLR Blogpost Track 2025 in Singapore. |
| Jul 01, 2024 | I started my PhD in the Deep Models and Optimization Group at the ELLIS Institute Tübingen. |
Blogposts
| Dec 17, 2024 | Linear Recurrences Accessible to Everyone: Investigating linear RNNs such as Mamba can be challenging because they are currently not efficiently expressible in PyTorch. We propose the abstraction of linear recurrences to gain intuition for the computational structure of these emerging deep learning architectures. After deriving their parallel algorithm, we gradually build towards a simple template CUDA extension for PyTorch. We hope that making linear recurrences accessible to a wider audience inspires further research on linear-time sequence mixing. |
|---|---|
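The core idea behind the blogpost, a linear recurrence $h_t = a_t h_{t-1} + b_t$ whose steps compose associatively and can therefore be computed with a parallel scan, can be sketched in plain Python. This is an illustrative toy, not the blogpost's CUDA extension; the function names are made up for this example.

```python
def linear_recurrence_sequential(a, b):
    """h_t = a_t * h_{t-1} + b_t, with h_0 = b_0, computed step by step."""
    h = [b[0]]
    for t in range(1, len(b)):
        h.append(a[t] * h[-1] + b[t])
    return h

def combine(left, right):
    """Associative combine of two recurrence segments (a, b):
    applying `left` then `right` equals the returned single segment,
    since a_r * (a_l * h + b_l) + b_r = (a_r * a_l) * h + (a_r * b_l + b_r)."""
    a_l, b_l = left
    a_r, b_r = right
    return (a_r * a_l, a_r * b_l + b_r)

def linear_recurrence_scan(a, b):
    """The same recurrence as a prefix scan over `combine`.
    Because `combine` is associative, this scan can be parallelized,
    which is what fast linear-RNN kernels exploit."""
    state = (a[0], b[0])
    h = [state[1]]
    for t in range(1, len(b)):
        state = combine(state, (a[t], b[t]))
        h.append(state[1])
    return h

a = [0.0, 0.5, 0.5, 0.5]
b = [1.0, 1.0, 1.0, 1.0]
print(linear_recurrence_sequential(a, b))  # [1.0, 1.5, 1.75, 1.875]
```

Both functions produce identical outputs; the point is that the second formulation exposes the associative structure that makes a work-efficient parallel implementation possible.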