6.882: Planning and Decision Making


How can a computer make decisions? This course explores the theory and algorithms behind automated decision making. We will address several settings of the problem: planning sequences of actions when the world behaves in a known and deterministic way, constructing a policy that specifies reactions to different situations when the world behaves probabilistically, and learning a policy or model when the way the world behaves is unknown in advance. We will consider applications in areas such as robot motion planning, transportation scheduling, and computer games.


Grading


Calendar

Exam dates are firm; topics are subject to change.

Date | Subject | Reading | Notes | Assignment
9/9 | Introduction; discrete spaces, atomic representation | AIMA3E 2, 4.1 (AIMA2E 2, 4.3); Simulated Annealing | L1 | Proj 0
9/14 | Discrete spaces, factored representation | AIMA3E 6 (AIMA2E 5) | L2 |
9/16 | Continuous spaces | AIMA3E 4.2 (AIMA2E 4.4); Conjugate Gradient | L3 | Proj 1
9/21 | Path search discrete atomic spaces: review of dynamic programming, A* | AIMA3E 3; PA 2.2, 2.3 | L4 |
9/23 | Path search continuous spaces: configuration space and exact methods | AIMA3E 25.4; PA 4.2, 4.3, 6 | L56 |
9/28 | Path search continuous spaces: sampling-based approaches | PA 5 | |
9/30 | Logic: factored and relational representations of big discrete spaces | AIMA3E 7, 8 | | PS1 solutions
10/5 | Planning: situation calculus and PDDL | AIMA3E 10.1, 10.2 (AIMA2E 11.1, 11.2); PA 2.4, 2.5 | | Proj 2
10/7 | Planning: progression, regression, graphplan | AIMA3E 10.3 - 10.6 (AIMA2E 11.3 - 11.7) | | PS2 solutions
10/12 | Planning: Other topics | AIMA3E 11 (AIMA2E 12) | |
10/14 | Exam | | |
10/19 | Uncertainty about outcomes: lotteries, decision trees | AIMA3E 16.1 - 16.5 | |
10/21 | Markov decision processes | AIMA3E 17.1 - 17.2 | |
10/26 | More MDPs, plus factored discrete representation | AIMA3E 17.1 - 17.2; Papers | |
10/28 | Reinforcement learning | AIMA3E 21, Sutton and Barto | | Proj 3
11/2 | Function and policy approximation in RL | Non-linear TD, Linear Q, Policy gradient, PSDP | |
11/4 | Uncertainty about state: value of information | AIMA3E 16.6, 4.4, 17.4 (AIMA2E 16.6, 3.6, 17.4) | |
11/9 | POMDPs | LPK paper, Smith thesis, AIMA3E 15 | |
11/11 | Holiday | | |
11/16 | POMDP algorithms | | |
11/18 | More POMDP algorithms | Papers | |
11/23 | Multi-agent decision making: Frans Oliehoek | | slides | PS3 (revised)
11/25 | Holiday | | |
11/30 | Search and learning in games: David Silver | | | PS3 solutions
12/2 | Exam | | |
12/7 | Final project presentations | | |
12/9 | Final project presentations | | |