Invitation to the First Talk of the Vienna Circuits Series: "Understanding Transformer Generalization by Design?"

The NLP Group of the Data Mining and Machine Learning (DM) subunit warmly invites you to a talk by Yupei Du (Saarland University) on November 21 at 1:00 PM. The talk is part of the new Vienna Circuits series, a forum for research at the intersection of NLP, Computational Linguistics, and Cognitive Science.

When? Friday, November 21, 2025, 1:00 PM
Where? Kolingasse 14–16, Room No. 2.38 (live stream viewing)
Zoom Link: https://univienna.zoom.us/j/62781955052?pwd=BzvFVbt1fsjcKXVQ5fndbihrBINucK.1

Title: Understanding Transformer Generalization by Design?

Abstract
This talk presents two recent studies of how transformers acquire and express generalizable reasoning ability. The first investigates implicit multi-hop reasoning and shows that models can perform multi-step inference without explicit reasoning chains, but only when the training data grows exponentially with reasoning depth and the number of layers grows linearly with it. The second examines memorization under label noise and finds that transformers can build on generalization to memorize noisy labels. Together, these results illustrate how architectural depth, data scale, training dynamics, and inductive bias jointly shape the boundary between generalization and memorization in language models.

Speaker
Yupei Du is a postdoctoral researcher at Saarland University, advised by Prof. Alexander Koller. He works on NLP and ML, with a current focus on understanding and improving LLM reasoning. He completed his PhD at Utrecht University, advised by Dr. Dong Nguyen and Prof. Albert Gatt. Before that, he received both his master's degree in Computer Science (advised by Dr. Yuanbin Wu) and his bachelor's degree in Psychology from East China Normal University.

If you would like to be kept informed about future talks in the Vienna Circuits series, you can subscribe to the series' mailing list.

Portrait of Yupei Du holding a dog

© Yupei Du