BEGIN:VCALENDAR
VERSION:2.0
PRODID:-// - ECPv6.15.18//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://tilos.ai
X-WR-CALDESC:Events for 
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20260308T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20261101T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20251024T110000
DTEND;TZID=America/Los_Angeles:20251024T120000
DTSTAMP:20260404T033334Z
CREATED:20250925T175700Z
LAST-MODIFIED:20260304T210610Z
UID:7611-1761303600-1761307200@tilos.ai
SUMMARY:Optimization for ML and AI Seminar: High-dimensional Optimization with Applications to Compute-Optimal Neural Scaling Laws
DESCRIPTION:Courtney Paquette\, McGill University\nAbstract: Given the massive scale of modern ML models\, we now get only a single shot to train them effectively. This restricts our ability to test multiple architectures and hyper-parameter configurations. Instead\, we need to understand how these models scale\, allowing us to experiment with smaller problems and then apply those insights to larger-scale models. In this talk\, I will present a framework for analyzing scaling laws in stochastic learning algorithms using a power-law random features (PLRF) model\, leveraging high-dimensional probability and random matrix theory. I will then use this scaling law to address the compute-optimal question: How should we choose model size and hyper-parameters to achieve the best possible performance in the most compute-efficient manner? Then\, using this PLRF model\, I will devise a new momentum-based algorithm that (provably) improves the scaling law exponent. Finally\, I will present some numerical experiments on LSTMs that show how this new stochastic algorithm can be applied to real data to improve the compute-optimal exponent.\n\nCourtney Paquette is an assistant professor at McGill University in the Mathematics and Statistics department\, a CIFAR AI Chair (MILA)\, and an active member of the Montreal Machine Learning Optimization Group (MTL MLOpt) at MILA. Her research broadly focuses on designing and analyzing algorithms for large-scale optimization problems\, motivated by applications in data science\, and using techniques that draw from a variety of fields\, including probability\, complexity theory\, and convex and nonsmooth analysis. Dr. Paquette has been a lead organizer of the OPT-ML Workshop at NeurIPS since 2020 and is a lead organizer (and original creator) of the High-dimensional Learning Dynamics (HiLD) Workshop at ICML.
URL:https://tilos.ai/event/optimization-for-ml-and-ai-seminar-with-courtney-paquette-mcgill-university/
LOCATION:CSE 1242 and Virtual\, 3235 Voigt Dr\, La Jolla\, CA\, 92093\, United States
CATEGORIES:TILOS Seminar Series,TILOS Sponsored Event
ATTACH;FMTTYPE=image/jpeg:https://tilos.ai/wp-content/uploads/2025/09/paquette-courtney-scaled-e1758822988381.jpg
END:VEVENT
END:VCALENDAR