Planning & Learning in Robotics
This course covers optimal control fundamentals and their application to motion planning and decision making in robotics. Topics include Markov decision processes (MDPs), dynamic programming, search-based and sampling-based motion planning, value and policy iteration, linear quadratic regulation (LQR), and model-free reinforcement learning.
- Introduction
- Topic 1: Markov Chains
  - Absorbing Markov chains
  - Ergodic Markov chains
- Topic 2: Markov Decision Processes
  - Open-loop vs. closed-loop control
  - Partially observable models
- Topic 3: Dynamic Programming
  - Dynamic programming algorithm
  - Example: chess
  - Example: nonlinear system control
- Topic 4: Deterministic Shortest Path
  - The deterministic shortest path (DSP) problem
  - Label correcting methods for the DSP problem
- Topic 5: Configuration Space
  - Motion planning
  - Configuration space
  - Graph construction for motion planning
- Topic 6: Search-based Motion Planning
  - Label correcting algorithm
  - Dijkstra's algorithm
  - A* algorithm
  - Jump point search
- Topic 7: Anytime, Incremental, and Agent-centered Search
  - Agent-centered search
  - Anytime search
  - Incremental search
- Topic 8: Sampling-based Motion Planning
  - Search-based vs. sampling-based planning
  - Probabilistic roadmaps
  - Rapidly exploring random tree (RRT)
  - RRT*
- Topic 9: Infinite-Horizon Optimal Control
  - Bellman equations
  - Policy evaluation
  - Value iteration
  - Policy iteration
  - Linear programming
- Topic 10: Model-Free Prediction
  - Model-free policy evaluation
  - Monte Carlo policy evaluation
  - Temporal difference policy evaluation
- Topic 11: Model-Free Control
  - Model-free policy iteration
  - Monte Carlo policy iteration
  - Temporal difference policy iteration
  - Batch Q-value iteration
- Topic 12: Value Function Approximation
  - Incremental methods
  - Batch methods
- Topic 13: Linear Quadratic Control
  - Pontryagin's minimum principle (PMP)
  - Linear quadratic regulator (LQR)
  - Linear quadratic Gaussian (LQG)
  - LQR methods for deterministic control
- Topic 14: Continuous-Time Optimal Control
  - Continuous-time PMP
  - Continuous-time LQR