How can a computer make decisions? This course will explore the theory and algorithms behind automated decision making. We will address a number of different settings of the problem: planning sequences of actions when the world behaves in a known and deterministic way, constructing a policy that specifies reactions to different situations when the world behaves probabilistically, and learning a policy or model when the way the world behaves is unknown in advance. We will consider applications in areas such as robot motion planning, transportation scheduling, and computer games.
| Date | Subject | Reading | Notes | Assignment |
|---|---|---|---|---|
| 9/9 | Introduction; discrete spaces, atomic representation | AIMA3E 2, 4.1 (AIMA2E 2, 4.3); Simulated Annealing | L1 | Proj 0 |
| 9/14 | Discrete spaces, factored representation | AIMA3E 6 (AIMA2E 5) | L2 | |
| 9/16 | Continuous spaces | AIMA3E 4.2 (AIMA2E 4.4); Conjugate Gradient | L3 | Proj 1 |
| 9/21 | Path search discrete atomic spaces: review of dynamic programming, A* | AIMA3E 3; PA 2.2, 2.3 | L4 | |
| 9/23 | Path search continuous spaces: configuration space and exact methods | AIMA3E 25.4; PA 4.2, 4.3, 6 | L56 | |
| 9/28 | Path search continuous spaces: sampling-based approaches | PA 5 | ||
| 9/30 | Logic: factored and relational representations of big discrete spaces | AIMA3E 7, 8 | PS1 solutions | |
| 10/5 | Planning: situation calculus and PDDL | AIMA3E 10.1, 10.2 (AIMA2E 11.1, 11.2); PA 2.4, 2.5 | | Proj 2 |
| 10/7 | Planning: progression, regression, Graphplan | AIMA3E 10.3 - 10.6 (AIMA2E 11.3 - 11.7) | PS2 solutions | |
| 10/12 | Planning: Other topics | AIMA3E 11 (AIMA2E 12) | ||
| 10/14 | Exam | |||
| 10/19 | Uncertainty about outcomes: lotteries, decision trees | AIMA3E 16.1 - 16.5 | ||
| 10/21 | Markov decision processes | AIMA3E 17.1 - 17.2 | ||
| 10/26 | More MDPs, plus factored discrete representation | AIMA3E 17.1 - 17.2; Papers | ||
| 10/28 | Reinforcement learning | AIMA3E 21, Sutton and Barto | | Proj 3 |
| 11/2 | Function and policy approximation in RL | Non-linear TD, Linear Q, Policy gradient, PSDP | ||
| 11/4 | Uncertainty about state: value of information | AIMA3E 16.6, 4.4, 17.4 (AIMA2E 16.6, 3.6, 17.4) | ||
| 11/9 | POMDPs | LPK paper, Smith thesis, AIMA3E 15 | ||
| 11/11 | Holiday | |||
| 11/16 | POMDP algorithms | |||
| 11/18 | More POMDP algorithms | Papers | ||
| 11/23 | Multi-agent decision making: Frans Oliehoek | | slides | PS3 (revised) |
| 11/25 | Holiday | |||
| 11/30 | Search and learning in games: David Silver | | PS3 solutions | |
| 12/2 | Exam | |||
| 12/7 | Final project presentations | |||
| 12/9 | Final project presentations | | | |