Recently, deep reinforcement learning (RL) has been proposed to learn the tractography procedure and train agents to reconstruct the structure of the white matter without manually curated reference streamlines. While the reported performance was competitive, the proposed framework is complex, and little is known about the role and impact of its many parts. In this work, we thoroughly explore the different components of the proposed framework, such as the choice of RL algorithm, the seeding strategy, the input signal, and the reward function, and shed light on their impact. Approximately 7,400 models were trained for this work, totalling nearly 41,000 hours of GPU time. Our goal is to guide researchers eager to explore the possibilities of deep RL for tractography by exposing what works and what does not work with this category of approaches. As such, we ultimately propose a series of recommendations concerning the choice of RL algorithm, the input to the agents, the reward function, and more, to help future work using reinforcement learning for tractography. We also release the open-source codebase, trained models, and datasets for users and researchers wanting to explore reinforcement learning for tractography.