
Optimization for ML and AI Seminar with Courtney Paquette (McGill University): High-dimensional Optimization with Applications to Compute-Optimal Neural Scaling Laws
CSE 1242 and Virtual, 3235 Voigt Dr, La Jolla, CA, United States
Courtney Paquette, McGill University
Abstract: Given the massive scale of modern ML models, we now only get a single shot to train them effectively. This restricts our ability to test multiple architectures and hyper-parameter configurations. Instead, we need to understand how these models scale, allowing us to experiment with smaller problems and then apply those […]