Traditional first-order logic (FOL) reasoning systems usually rely on manualheuristics for proof guidance. We propose TRAIL: a system that learns toperform proof guidance using reinforcement learning. A key design principle ofour system is that it is general enough to allow transfer to problems indifferent domains that do not share the same vocabulary of the training set. Todo so, we developed a novel representation of the internal state of a prover interms of clauses and inference actions, and a novel neural-based attentionmechanism to learn interactions between clauses. We demonstrate that thisapproach enables the system to generalize from training to test data acrossdomains with different vocabularies, suggesting that the neural architecture inTRAIL is well suited for representing and processing of logical formalisms.