6.882: Planning and Decision Making

How can a computer make decisions? This course will explore the theory and algorithms behind automated decision making. We will address a number of different settings of the problem: planning sequences of actions when the world behaves in a known and deterministic way, constructing a policy that specifies reactions to different situations when the world behaves probabilistically, and learning a policy or model when the way the world behaves is unknown in advance. We will consider applications in areas such as robot motion planning, transportation scheduling, and computer games.

Time: T, Th: 1 - 2:30
Location: 36-156
Prereqs: 6.041 and 6.006 (or permission of the instructor)
Credit: 12 Units, Grad H
Instructor: Leslie Pack Kaelbling (lpk@mit.edu)
Textbook:
- Artificial Intelligence: A Modern Approach, Russell and Norvig. Third edition is desirable, but second is okay.
- Planning Algorithms, LaValle. Available online.

Grading

3 small projects: 10% each
1 large project: 30%
2 exams: 20% each

Calendar

Exam dates are firm; topics subject to change

Date	Subject	Reading	Notes	Assignment
9/9	Introduction; discrete spaces, atomic representation	AIMA3E 2, 4.1 (AIMA2E 2, 4.3); Simulated Annealing	L1	Proj 0
9/14	Discrete spaces, factored representation	AIMA3E 6 (AIMA2E 5)	L2
9/16	Continuous spaces	AIMA3E 4.2 (AIMA2E 4.4); Conjugate Gradient	L3	Proj 1
9/21	Path search discrete atomic spaces: review of dynamic programming, A*	AIMA3E 3; PA 2.2, 2.3	L4
9/23	Path search continuous spaces: configuration space and exact methods	AIMA3E 25.4; PA 4.2, 4.3, 6	L56
9/28	Path search continuous spaces: sampling-based approaches	PA 5
9/30	Logic: factored and relational representations of big discrete spaces	AIMA3E 7, 8		PS1 solutions
10/5	Planning: situation calculus and PDDL	AIMA3E 10.1, 10.2 (AIMA2E 11.1, 11.2); PA 2.4, 2.5		Proj 2
10/7	Planning: progression, regression, graphplan	AIMA3E 10.3 - 10.6 (AIMA2E 11.3 - 11.7)		PS2 solutions
10/12	Planning: Other topics	AIMA3E 11 (AIMA2E 12)
10/14	Exam
10/19	Uncertainty about outcomes: lotteries, decision trees	AIMA3E 16.1 - 16.5
10/21	Markov decision processes	AIMA3E 17.1 - 17.2
10/26	More MDPs, plus factored discrete representation	AIMA3E 17.1 - 17.2; Papers
10/28	Reinforcement learning	AIMA3E 21, Sutton and Barto		Proj 3
11/2	Function and policy approximation in RL	Non-linear TD, Linear Q, Policy gradient, PSDP
11/4	Uncertainty about state: value of information	AIMA3E 16.6, 4.4, 17.4 (AIMA2E 16.6, 3.6, 17.4)
11/9	POMDPs	LPK paper, Smith thesis, AIMA3E 15
11/11	Holiday
11/16	POMDP algorithms
11/18	More POMDP algorithms	Papers
11/23	Multi-agent decision making: Frans Oliehoek		slides	PS3 (revised)
11/25	Holiday
11/30	Search and learning in games: David Silver			PS3 solutions
12/2	Exam
12/7	Final project presentations
12/9	Final project presentations