Abstract
Faced with the ever-increasing complexity of their application domains, artificial learning agents can now scale up to process an overwhelming amount of information coming from their interaction with an environment. However, this scaling comes at the cost of encoding and processing a growing amount of redundant information that is not necessarily beneficial to the learning process itself. This work exploits the properties of learning systems defined over partially observable domains by selectively focusing on the specific type of information that is more likely to express the causal interaction among the transitioning states of the environment. Adaptive masking of the observation space based on the \textit{temporal difference displacement} criterion enabled a significant improvement in the convergence of temporal difference algorithms defined over a partially observable Markov process.