Nonconvex Optimization in Deep Learning

Sharpness-Aware Minimization (SAM) is a recently proposed gradient-based optimizer (Foret et al., ICLR 2021) that greatly improves the prediction performance of deep neural networks. Consequently, there has been a surge of interest in explaining its empirical success. In their work on the crucial role of normalization in sharpness-aware minimization, Suvrit […]
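To make the mechanism concrete, here is a minimal NumPy sketch of one SAM update on a toy quadratic loss: the weights are perturbed along the normalized gradient toward an approximate worst-case point in a ρ-ball, and the descent step uses the gradient evaluated there. The step size, ρ, and the quadratic loss are illustrative choices, not the settings analyzed in the work above.

```python
import numpy as np

def sam_step(w, loss_grad, lr=0.1, rho=0.05, eps=1e-12):
    """One SAM update: ascend to an approximate worst-case point in a
    rho-ball around w (note the gradient normalization, whose role the
    work above analyzes), then descend using the gradient at that point."""
    g = loss_grad(w)
    e = rho * g / (np.linalg.norm(g) + eps)   # normalized ascent direction
    g_sharp = loss_grad(w + e)                # gradient at the perturbed point
    return w - lr * g_sharp

# toy usage on a quadratic loss L(w) = 0.5 * w^T A w (illustrative only)
A = np.diag([1.0, 10.0])
w = np.array([1.0, 1.0])
for _ in range(100):
    w = sam_step(w, lambda v: A @ v)
```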


Dynamic Decisions Under Uncertainty

The Effect of Delayed Feedback for Reinforcement Learning with Function Approximation. Recent studies in reinforcement learning (RL) have made significant progress by leveraging function approximation to alleviate the sample complexity hurdle for better performance. Despite the success, existing provably efficient algorithms typically rely on the accessibility of immediate feedback upon taking […]
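For a concrete picture of the delayed-feedback setting, the following sketch wraps a Gym-style environment so that each reward is revealed only a fixed number of steps after the action that produced it. The interface names and the fixed-delay model are assumptions made for illustration, not the algorithmic contribution of the work.

```python
from collections import deque

class DelayedFeedbackEnv:
    """Wraps a Gym-style environment so that each reward is revealed only
    `delay` steps after the action that produced it (zero reward is reported
    in the meantime). The fixed-delay model and interface are illustrative."""

    def __init__(self, env, delay=5):
        self.env = env
        self.delay = delay
        self.pending = deque()

    def reset(self):
        self.pending.clear()
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.pending.append(reward)
        # release a reward only once `delay` later steps have been taken
        delayed = self.pending.popleft() if len(self.pending) > self.delay else 0.0
        return obs, delayed, done, info
```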


Learning Ultrametric Trees for Optimal Transport Regression

Optimal transport provides a metric which quantifies the dissimilarity between probability measures. For measures supported on discrete metric spaces, computing the optimal transport distance has cubic time complexity in the size of the space. However, measures supported on trees admit a closed-form optimal transport which can be computed in […]
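The closed form in question sums, over the edges of the tree, the edge length times the absolute difference in probability mass in the subtree below that edge. A minimal NumPy sketch, assuming nodes are indexed so that every node's parent has a smaller index and node 0 is the root (an indexing convention chosen here for simplicity):

```python
import numpy as np

def tree_wasserstein(parent, edge_len, mu, nu):
    """Closed-form 1-Wasserstein distance between two distributions mu, nu on
    the nodes of a rooted tree: sum over edges of edge length times the
    absolute net mass difference in the subtree below that edge. Assumes node
    0 is the root and parent[i] < i for every i > 0; edge_len[i] is the length
    of the edge from node i to its parent. Runs in linear time."""
    diff = np.asarray(mu, dtype=float) - np.asarray(nu, dtype=float)
    total = 0.0
    for i in range(len(parent) - 1, 0, -1):   # children before parents
        total += edge_len[i] * abs(diff[i])
        diff[parent[i]] += diff[i]            # push subtree mass to the parent
    return total

# toy usage: a path 0 - 1 - 2 with unit edge lengths
parent, edge_len = [-1, 0, 1], [0.0, 1.0, 1.0]
print(tree_wasserstein(parent, edge_len, [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))  # 2.0
```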


Optimization for Overparametrized Models

Recently, there has been a surge of interest in developing optimization algorithms for overparameterized models, as achieving generalization is believed to require algorithms with suitable biases. This interest centers on minimizing the sharpness of the original loss function; the Sharpness-Aware Minimization (SAM) algorithm has proven effective. However, existing literature focuses on only […]


Nonconvex Optimization and Transformer Architectures

Deciding whether saddle points exist or are approximable for nonconvex-nonconcave problems is usually intractable. Zhang, Zhang & Sra [SIAM Journal on Optimization 2023] take a step toward understanding a broad class of nonconvex-nonconcave minimax problems that do remain tractable. Specifically, they study minimax problems in geodesic metric spaces. The first […]
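For intuition only, here is the flat-space (Euclidean) analogue of the basic saddle-point dynamics: simultaneous gradient descent-ascent on a toy convex-concave quadratic whose saddle point is at the origin. The paper's setting and guarantees concern geodesic metric spaces, which this sketch does not capture; the step size and objective are placeholders.

```python
def gda(grad_x, grad_y, x, y, lr=0.05, steps=500):
    """Simultaneous gradient descent-ascent: descend in x, ascend in y."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - lr * gx, y + lr * gy
    return x, y

# f(x, y) = x^2 - y^2 + x*y is convex in x, concave in y, saddle at (0, 0)
grad_x = lambda x, y: 2 * x + y
grad_y = lambda x, y: x - 2 * y
x_star, y_star = gda(grad_x, grad_y, 1.0, 1.0)   # both approach 0
```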


Symplectic Optimization

Geometric numerical integration has recently been exploited to design symplectic accelerated optimization algorithms by simulating the Bregman Lagrangian and Hamiltonian systems from the variational framework introduced by Wibisono et al. In Duruisseaux & Leok [OMS 2023], the authors discuss practical considerations that can significantly boost the computational performance of these optimization algorithms, and […]
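The simplest instance of the idea is semi-implicit (symplectic) Euler applied to the Hamiltonian H(q, p) = ½‖p‖² + f(q); the sketch below integrates this system for a quadratic f. The Bregman Lagrangian/Hamiltonian integrators in the work above are time-dependent and considerably more elaborate; the step size and test function here are placeholders.

```python
import numpy as np

def symplectic_euler(grad_f, q, p, h=0.05, steps=200):
    """Semi-implicit (symplectic) Euler integration of the Hamiltonian
    system with H(q, p) = 0.5*||p||^2 + f(q): update the momentum with the
    current gradient, then the position with the new momentum."""
    for _ in range(steps):
        p = p - h * grad_f(q)   # kick
        q = q + h * p           # drift
    return q, p

# toy usage on f(q) = 0.5 * q^T A q; the trajectory oscillates around the
# minimizer while nearly conserving the Hamiltonian
A = np.diag([1.0, 25.0])
q, p = symplectic_euler(lambda q: A @ q, np.array([1.0, 1.0]), np.zeros(2))
```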


Sampling for Constrained Distributions with Applications

Mangoubi and Vishnoi [COLT 2023] consider the problem of approximating a d×d covariance matrix M with a rank-k matrix under a differential privacy constraint. The authors present and analyze a complex variant of the Gaussian mechanism and give the optimal bound on the Frobenius norm of the difference between the […]
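A simplified real-valued version of the underlying mechanism, for intuition: perturb the covariance matrix with symmetric Gaussian noise, then keep the top-k eigenpairs of the noisy matrix. The complex-valued variant and the noise calibration that yield the paper's optimal Frobenius-norm bound are not reproduced here; `noise_scale` is a placeholder parameter.

```python
import numpy as np

def dp_rank_k_covariance(M, k, noise_scale, seed=0):
    """Gaussian-mechanism sketch: perturb the covariance matrix with symmetric
    Gaussian noise, then keep the top-k eigenpairs of the noisy matrix.
    `noise_scale` stands in for a proper calibration to the sensitivity and
    the (eps, delta) privacy budget, which is omitted here."""
    rng = np.random.default_rng(seed)
    d = M.shape[0]
    noise = rng.normal(scale=noise_scale, size=(d, d))
    M_noisy = M + (noise + noise.T) / 2.0         # symmetrize the noise
    vals, vecs = np.linalg.eigh(M_noisy)          # eigenvalues in ascending order
    top = np.argsort(vals)[-k:]
    return vecs[:, top] @ np.diag(vals[top]) @ vecs[:, top].T

# toy usage on an empirical covariance matrix
X = np.random.default_rng(1).normal(size=(500, 10))
M_k = dp_rank_k_covariance(X.T @ X / 500, k=3, noise_scale=0.1)
```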


FedCE: Federated Certainty Equivalence Control for Linear Gaussian Systems

Decentralized multi-agent systems are ubiquitous across applications such as decentralized control of robots and drones, decentralized autonomous vehicles, and non-cooperative games. Extensive research in the literature has focused on decentralized multi-agent systems with known system dynamics, exploring various frameworks, such as decentralized optimal […]
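As background for certainty equivalence control, here is a single-agent sketch: estimate the linear dynamics by least squares from transition data, then compute the LQR gain by treating the estimates as the true system. The federated aspect of FedCE, i.e. how multiple agents pool their estimates, is not modeled here; the data shapes, cost matrices, and toy system are illustrative.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def certainty_equivalence_lqr(X, U, X_next, Q, R):
    """Estimate (A, B) of x_{t+1} = A x_t + B u_t + w_t by least squares from
    transition data, then compute the LQR gain as if the estimates were the
    true dynamics (the control law is u_t = -K x_t)."""
    Z = np.hstack([X, U])                              # regressors [x_t, u_t]
    theta, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    n = X.shape[1]
    A_hat, B_hat = theta[:n].T, theta[n:].T
    P = solve_discrete_are(A_hat, B_hat, Q, R)         # estimated Riccati solution
    K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
    return A_hat, B_hat, K

# toy usage with synthetic transition data from a 2-state, 1-input system
rng = np.random.default_rng(0)
A_true, B_true = np.array([[0.9, 0.2], [0.0, 0.8]]), np.array([[0.0], [1.0]])
X, U = rng.normal(size=(200, 2)), rng.normal(size=(200, 1))
X_next = X @ A_true.T + U @ B_true.T + 0.01 * rng.normal(size=(200, 2))
A_hat, B_hat, K = certainty_equivalence_lqr(X, U, X_next, np.eye(2), np.eye(1))
```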


Personalized Federated Learning via Data-centric Regularization

Federated learning is a large-scale machine learning training paradigm where data is distributed across clients and can be highly heterogeneous from one client to another. To ensure personalization in client models, and at the same time to ensure that the local models have enough commonality (i.e., to prevent “client-drift”), it […]
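A common data-independent baseline for this kind of regularization is a proximal term that pulls each client's personalized model toward the shared global model; the sketch below uses it with a least-squares local loss. The data-centric regularizer proposed in the work above differs from this plain proximal penalty; the hyperparameters and toy data are illustrative.

```python
import numpy as np

def local_update(w_init, w_global, X, y, mu=0.1, lr=0.1, steps=100):
    """One client's personalized update: least-squares loss on the client's
    own data plus a proximal penalty (mu/2)*||w - w_global||^2 that keeps the
    local model close to the shared global model."""
    w = w_init.copy()
    n = len(y)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n + mu * (w - w_global)
        w -= lr * grad
    return w

# toy usage: one client with its own heterogeneous data distribution
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)
w_personalized = local_update(np.zeros(5), np.zeros(5), X, y)
```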


Extrapolation

An important question in learning for optimization, and in deep learning more generally, is extrapolation: the behavior of a model under distribution shifts. We analyzed conditions under which graph neural networks for sparse graphs extrapolate to larger graphs, and we drew connections between in-context learning and adaptation to different environments. Can […]
