Invitation to the first talk in the Vienna Circuits series: "Understanding Transformer Generalization by Design?"

The NLP Group of the Data Mining and Machine Learning (DM) subunit cordially invites you to a talk by Yupei Du (Saarland University) on 21 November at 13:00. The talk takes place as part of the new Vienna Circuits series, a forum for research at the intersection of NLP, computational linguistics, and cognitive science.

When? Friday, 21 November 2025, 13:00
Where? Kolingasse 14–16, Room 2.38 (live stream viewing)
Zoom link: https://univienna.zoom.us/j/62781955052?pwd=BzvFVbt1fsjcKXVQ5fndbihrBINucK.1


Title: Understanding Transformer Generalization by Design?

Abstract

This talk presents two recent studies on how transformers acquire and express generalizable reasoning ability. The first investigates implicit multi-hop reasoning, showing that models can perform multi-step inference without explicit chains, but only when the training data grows exponentially with reasoning depth and the number of layers scales linearly with it. The second examines memorization under label noise and finds that transformers can build on generalization to memorize noisy labels. Together, these results illustrate how architectural depth, data scale, training dynamics, and inductive bias jointly shape the boundary between generalization and memorization in language models.

Speaker

Yupei Du is a postdoctoral researcher at Saarland University, advised by Prof. Alexander Koller. He works on NLP and ML, with a current focus on understanding and improving LLM reasoning. He completed his PhD at Utrecht University, advised by Dr. Dong Nguyen and Prof. Albert Gatt. Before that, he received both his master's degree (Computer Science, advised by Dr. Yuanbin Wu) and his bachelor's degree (Psychology) from East China Normal University.


If you would like to be informed about future talks in the Vienna Circuits series, you can subscribe to the » series mailing list.

Portrait of Yupei Du with a small dog

© Yupei Du