Representations and Control in Atari Games Using Reinforcement Learning
The Arcade Learning Environment (ALE) is a challenging framework composed of dozens of qualitatively different Atari 2600 games, and thus an ideal testbed for general AI competency. As in many other complex reinforcement learning domains, finding a good representation for predicting expected cumulative rewards has proven to be key to success in the ALE. Many recent approaches rely on non-linear function approximation with neural networks, which incurs a high computational cost. This thesis presents a simple, computationally practical, linear feature representation, Blob-PROST (blob pairwise offsets in space and time), whose performance is competitive with current state-of-the-art results generated by Deep Q-Networks (DQN). In addition, we provide a simple and reproducible benchmark for comparison with future work in the ALE. Moreover, this thesis tries to address two major drawbacks inherent in linear function approximation: (1) finding the "right" set of features is itself challenging, and (2) often a small subset of features captures most of the representational power while the rest are effectively useless. To address these two issues, a new framework called A-BPROS (adaptive blob pairwise offsets in space), inspired by and built upon Blob-PROST, is developed as a method for feature expansion in the ALE. Initial results suggest that, in terms of representational power, more work is needed to make A-BPROS as competitive as Blob-PROST, while in terms of memory savings A-BPROS is promising. Finally, directions for future work are discussed.
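To make the linear setting concrete, the sketch below illustrates in Python how an agent with Blob-PROST-style sparse binary features might score and update actions. It is a minimal illustration under stated assumptions, not the thesis's implementation: the names (NUM_FEATURES, active_features, sarsa_update) are hypothetical, the blob detection and pairwise-offset feature extraction are omitted, and a one-step SARSA update stands in for the eligibility-trace learner such methods typically use.

```python
# Minimal sketch of linear value estimation over sparse binary features.
# `active_features` is assumed to be the list of indices of features that
# are "on" for the current screen; the real Blob-PROST feature space is
# much larger than the illustrative size used here.
import numpy as np

NUM_FEATURES = 100_000  # illustrative; actual feature spaces run to millions
NUM_ACTIONS = 18        # full Atari 2600 action set

# One weight vector per action, learned by a linear TD method.
weights = np.zeros((NUM_ACTIONS, NUM_FEATURES))

def q_value(active_features, action):
    """Q(s, a) is simply the sum of weights over the active binary features."""
    return weights[action, active_features].sum()

def sarsa_update(active, action, reward, next_active, next_action,
                 alpha=0.01, gamma=0.99):
    """One-step linear SARSA update (eligibility traces omitted for brevity)."""
    td_error = (reward
                + gamma * q_value(next_active, next_action)
                - q_value(active, action))
    # With binary features the gradient is 1 on active indices, 0 elsewhere.
    weights[action, active] += alpha * td_error

# Example: features 3, 17, and 42 are active in the current frame.
print(q_value([3, 17, 42], action=0))
```

The appeal of this representation, as the abstract notes, is computational: each value estimate and update touches only the handful of active features, rather than propagating through a deep network.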
Franklin and Marshall College Archives, Undergraduate Honors Thesis 2016