Constant Regret in Online Decision-Making

Abstract: I will present a class of finite-horizon control problems, where we see a random stream of arrivals, need to select actions in each step, and where the final objective depends only on the aggregate type-action counts; this includes many widely-studied control problems including online resource-allocation, dynamic pricing, generalized assignment, online bin packing, and bandits with knapsacks. For such settings, I will introduce a unified algorithmic paradigm, and provide a simple yet general condition under which these algorithms achieve constant regret, i.e., additive loss compared to the hindsight optimal solution which is independent of the horizon and state-space. These results stem from an elementary coupling argument, which may prove useful for many other questions in online decision-making. Time permitting, I will illustrate this by showing how we can use this technique to incorporate side information and historical data in these settings, and achieve constant regret with as little as a single data trace.

The event is finished.

Date

May 11 2022
Expired!

Time

8:00 am - 6:00 pm

Leave A Reply

Your email address will not be published. Required fields are marked *