BEGIN:VCALENDAR
VERSION:2.0
PRODID:-// - ECPv6.16.2//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://tilos.ai
X-WR-CALDESC:Events for 
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20260308T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20261101T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20270314T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20271107T090000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20260313T100000
DTEND;TZID=America/Los_Angeles:20260313T110000
DTSTAMP:20260525T031833
CREATED:20251014T200527Z
LAST-MODIFIED:20260313T183553Z
UID:7665-1773396000-1773399600@tilos.ai
SUMMARY:Optimization for ML and AI Seminar: Transformers Learn Generalizable Chain-of-Thought Reasoning via Gradient Descent
DESCRIPTION:Yuejie Chi\, Yale\nAbstract: Transformers have demonstrated remarkable chain-of-thought reasoning capabilities\, yet our understanding of the underlying mechanisms by which they acquire and extrapolate these capabilities remains limited. This talk presents a theoretical analysis of transformers trained via gradient descent for symbolic reasoning and state tracking tasks with increasing problem complexity. Our analysis reveals the coordination of multi-head attention to solve multiple subtasks in a single autoregressive path\, and the bootstrapping of inherently sequential reasoning through a recursive self-training curriculum. Our optimization-based guarantees demonstrate that even shallow multi-head transformers\, with chain-of-thought\, can be trained to effectively solve problems that would otherwise require deeper architectures.\n\nYuejie Chi is the Charles C. and Dorothea S. Dilley Professor of Statistics and Data Science at Yale University\, with a secondary appointment in Computer Science\, and a member of the Yale Institute for Foundations of Data Science. Before joining Yale\, Dr. Chi was the Sense of Wonder Group Endowed Professor of Electrical and Computer Engineering in AI Systems at Carnegie Mellon University\, with affiliations in MLD and CyLab. She also spent some time as a visiting researcher at Meta’s Fundamental AI Research (FAIR). Dr. Chi’s research interests lie in the theoretical and algorithmic foundations of data science\, generative AI\, reinforcement learning\, and signal processing\, motivated by applications in scientific and engineering domains. Her current focus is on improving the performance\, efficiency\, and reliability of generative AI and decision making\, driven by data-intensive but resource-constrained scenarios.
URL:https://tilos.ai/event/optimization-for-ml-and-ai-seminar-transformers-learn-generalizable-chain-of-thought-reasoning-via-gradient-descent/
LOCATION:HDSI 123 and Virtual\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:TILOS Seminar Series,TILOS Sponsored Event
ATTACH;FMTTYPE=image/jpeg:https://tilos.ai/wp-content/uploads/2025/10/chi-yuejie-e1760472307997.jpeg
END:VEVENT
END:VCALENDAR