Abstract
Reinforcement learning solves optimal control and sequential decisionproblems widely found in control systems engineering, robotics, and artificialintelligence. This work investigates optimal control over a sequence of naturalimages. The problem is formalized, and general conditions are derived for animage to be sufficient for implementing an optimal policy. Reinforcementlearning is shown to be efficient only for certain types of imagerepresentations. This is demonstrated by developing a reinforcement learningbenchmark that scales easily with number of states and length of horizon, andhas optimal policies that are easily distinguished from suboptimal policies.Image representations given by overcomplete sparse codes are found to becomputationally efficient for optimal control, using fewer computationalresources to learn and evaluate optimal policies. For natural images of fixedsize, representing each image as an overcomplete sparse code in a linearnetwork is shown to increase network storage capacity by orders of magnitudebeyond that possible for any complete code, allowing larger tasks with manymore states to be solved. Sparse codes can be generated by devices with lowenergy requirements and low computational overhead.