Interest in reinforcement learning (RL) has recently surged due to the application of deep learning techniques, but these connectionist approaches are opaque compared with symbolic systems. Learning Classifier Systems (LCSs) are evolutionary machine learning systems that can be categorised as eXplainable AI (XAI) due to their rule-based nature. Michigan LCSs are commonly used in RL domains because the alternative Pittsburgh systems (e.g. SAMUEL) suffer from complex algorithmic design and high computational requirements; however, Pittsburgh systems can produce more compact and interpretable solutions than Michigan systems. We aim to develop two novel Pittsburgh LCSs to address RL domains: PPL-DL and PPL-ST. The former acts as a "zeroth-level" system, and the latter revisits SAMUEL's core Monte Carlo learning mechanism for estimating rule strength. We compare our two Pittsburgh systems to the Michigan system XCS across deterministic and stochastic FrozenLake environments. Results show that PPL-ST performs on par with or better than PPL-DL and outperforms XCS in the presence of high levels of environmental uncertainty. Rulesets evolved by PPL-ST can achieve higher performance than those evolved by XCS while being more parsimonious and therefore more interpretable, albeit at higher computational cost. This indicates that PPL-ST is an LCS well-suited to producing explainable policies in RL domains.