Due: April 24, 2008 23:59:59 EST
Please submit via tsquare.
The assignment is worth 8% of your final grade.
In some sense, we have spent the semester thinking about machine learning techniques for various forms of function approximation. It's now time to think about using what we've learned in order to allow an agent of some kind to act in the world more directly. This assignment asks you to consider the application of some of the techniques we've learned from reinforcement learning to making decisions.
The same ground rules apply for programming languages as with the previous assignments.
Read everything below carefully!
You are being asked to explore Markov Decision Processes (MDPs) in the following way:
Come up with two interesting MDPs. Explain why they are interesting. They don't need to be overly complicated or directly grounded in a real situation, but it will be worthwhile if your MDPs are inspired by some process you are interested in or are familiar with. It's ok to keep it somewhat simple. For the purposes of this assignment, though, make sure one has a "small" number of states, and the other has a "large" number of states. Read below for more on how you should design the MDPs.
Solve each MDP using value iteration as well as policy iteration. How many iterations does it take to converge? Which one converges faster? Why? Do they converge to the same answer? How did the number of states affect things, if at all?
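As a rough illustration of the two steps above, here is a minimal Python sketch, not a reference solution. Everything in it is an assumption made for the example: the toy "chain" MDP, the function names (make_chain_mdp, value_iteration, policy_iteration), the discount factor of 0.9, the slip probability of 0.1, and the stopping tolerance. The 50-state chain merely stands in for a bigger problem; your own "large" MDP should be considerably larger and more interesting.

```python
def make_chain_mdp(n_states, slip=0.1):
    """Hypothetical chain MDP: action 0 moves left, action 1 moves right.

    P[s][a] is a list of (probability, next_state, reward) triples.
    The last state is an absorbing goal; entering it pays reward 1.
    """
    goal = n_states - 1
    P = {}
    for s in range(n_states):
        P[s] = {}
        for a, step in ((0, -1), (1, 1)):
            if s == goal:
                P[s][a] = [(1.0, s, 0.0)]  # absorbing, no further reward
                continue
            intended = min(max(s + step, 0), goal)
            slipped = min(max(s - step, 0), goal)  # move goes the wrong way
            P[s][a] = [
                (1.0 - slip, intended, 1.0 if intended == goal else 0.0),
                (slip, slipped, 1.0 if slipped == goal else 0.0),
            ]
    return P


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def greedy_policy(P, V, gamma):
    """Pick the best action per state; ties go to the lowest action id."""
    return {s: max(sorted(P[s]), key=lambda a: q_value(P, V, s, a, gamma))
            for s in P}


def value_iteration(P, gamma=0.9, tol=1e-10):
    V = {s: 0.0 for s in P}
    sweeps = 0
    while True:
        sweeps += 1
        # Synchronous Bellman optimality backup over all states.
        new_V = {s: max(q_value(P, V, s, a, gamma) for a in P[s]) for s in P}
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < tol:
            return V, greedy_policy(P, V, gamma), sweeps


def policy_iteration(P, gamma=0.9, tol=1e-10):
    policy = {s: sorted(P[s])[0] for s in P}  # arbitrary starting policy
    V = {s: 0.0 for s in P}
    rounds = 0
    while True:
        rounds += 1
        # Policy evaluation: in-place sweeps until the values settle.
        while True:
            delta = 0.0
            for s in P:
                v = q_value(P, V, s, policy[s], gamma)
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: stop once the greedy policy is unchanged.
        new_policy = greedy_policy(P, V, gamma)
        if new_policy == policy:
            return V, policy, rounds
        policy = new_policy


results = {}
for name, n in (("small", 5), ("larger", 50)):
    mdp = make_chain_mdp(n)
    results[name] = (value_iteration(mdp), policy_iteration(mdp))
    (_, pi_vi, sweeps), (_, pi_pi, rounds) = results[name]
    print(name, "VI sweeps:", sweeps, "PI rounds:", rounds,
          "same policy:", pi_vi == pi_pi)
```

Note that the two counts printed here are not directly comparable: each policy iteration round hides an inner evaluation loop with its own sweeps, so a fair comparison in your analysis should account for total work (or wall-clock time), not just the outer iteration count.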
You must submit a tar or zip file named yourgtaccount.{zip,tar,tar.gz} that contains a single folder or directory named yourgtaccount that in turn contains:
The file analysis.pdf should contain:
As always, you are being graded on your analysis more than anything else.