This document defines the mathematical framework of Markov decision processes (MDPs) and their solution via value iteration. It introduces MDPs as tuples specifying states, actions, transition probabilities, and rewards. It then defines the value function and the optimal value function V*, and shows how value iteration converges to V* from any starting value function by repeatedly applying the Bellman optimality operator B*. The optimal policy is then derived greedily from V*.
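
As a rough illustration of the procedure summarized above, the sketch below implements value iteration in Python/NumPy under stated assumptions: the function name `value_iteration`, the tensor layout `P[s, a, s']` and `R[s, a, s']`, the discount factor `gamma`, the stopping tolerance, and the toy two-state MDP are all hypothetical choices for illustration, not details taken from the document.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Minimal value iteration sketch (illustrative assumptions, not the document's code).

    P: transition probabilities, shape (S, A, S'), P[s, a, s'] = Pr(s' | s, a)
    R: rewards, shape (S, A, S'), R[s, a, s'] = reward for that transition
    Returns the (approximate) optimal value function V* and a greedy policy.
    """
    n_states = P.shape[0]
    V = np.zeros(n_states)  # any starting value function works; zeros is a common choice
    while True:
        # Apply the Bellman optimality operator B*:
        # (B*V)(s) = max_a sum_{s'} P(s'|s,a) * (R(s,a,s') + gamma * V(s'))
        Q = np.einsum("san,san->sa", P, R + gamma * V[None, None, :])
        V_new = Q.max(axis=1)
        # Stop once successive iterates are (numerically) indistinguishable
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    policy = Q.argmax(axis=1)  # greedy policy derived from V*
    return V_new, policy

if __name__ == "__main__":
    # Toy 2-state, 2-action MDP with made-up numbers, purely for illustration
    P = np.array([[[0.8, 0.2], [0.1, 0.9]],
                  [[0.5, 0.5], [0.0, 1.0]]])
    R = np.ones((2, 2, 2))  # reward of 1 on every transition
    V_star, pi_star = value_iteration(P, R)
    print("V*:", V_star, "policy:", pi_star)
```

Because B* is a contraction in the max norm (with modulus gamma), the loop above converges to the same fixed point V* regardless of the initial V, which is why the code can start from zeros.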