Phase 3: Model Experimentation
Introduction
Our experimentation with RL models led to a solution that can finish the first level of Sonic using a specifically tuned configuration of a Deep Q-Learning agent. We then decided to tune our hyperparameters toward better generalization. We implemented stochastic frame skipping to introduce randomness to the model and created several reward structures that generalized better to unseen environments. Further, we preprocessed our images using the OpenCV library to reduce the input dimensions, which facilitated longer training across multiple levels. While these changes reduced performance on level 1, they are closer to real RL applications, since the model now generalizes better to unseen environments.
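To make the idea of stochastic frame skipping concrete, the following is a minimal sketch, not our exact implementation. It assumes one common variant: each chosen action is repeated for a number of frames drawn uniformly at random, and the rewards over those frames are summed. The names `StochasticFrameSkip`, `min_skip`, `max_skip`, and the stand-in `CountingEnv` are hypothetical, and the environment is assumed to expose the older Gym-style `step(action)` returning `(obs, reward, done, info)`.

```python
import random


class StochasticFrameSkip:
    """Repeat each chosen action for a randomly drawn number of frames.

    Assumption: the wrapped env follows the older Gym-style API, where
    step(action) returns (obs, reward, done, info).
    """

    def __init__(self, env, min_skip=2, max_skip=5, seed=None):
        if not 1 <= min_skip <= max_skip:
            raise ValueError("need 1 <= min_skip <= max_skip")
        self.env = env
        self.min_skip = min_skip
        self.max_skip = max_skip
        self.rng = random.Random(seed)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # Draw how many frames this action is held for (inclusive bounds).
        n_frames = self.rng.randint(self.min_skip, self.max_skip)
        total_reward = 0.0
        for _ in range(n_frames):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                # Stop early so we never step a finished episode.
                break
        return obs, total_reward, done, info


class CountingEnv:
    """Hypothetical stand-in env: reward 1 per frame, ends after `length` frames."""

    def __init__(self, length=100):
        self.length = length
        self.frames = 0

    def reset(self):
        self.frames = 0
        return self.frames

    def step(self, action):
        self.frames += 1
        done = self.frames >= self.length
        return self.frames, 1.0, done, {}
```

Because the number of repeated frames varies from step to step, the agent cannot rely on a fixed timing between its decisions, which is one plausible way such a wrapper discourages memorizing a single level. In practice the wrapper would be placed around the Sonic environment instead of `CountingEnv`.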