We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid contexts. Unlike existing pre-training methods that only harvest learning signals from local contexts of naturally occurring texts, we propose a generalized notion of distant supervision to automatically connect multiple pieces of text and tables to create pre-training examples that require long-range reasoning. Different types of reasoning are simulated, including intersecting multiple pieces of evidence, bridging from one piece of evidence to another, and detecting unanswerable cases. We conduct a comprehensive evaluation on a variety of extractive question answering datasets, ranging from single-hop to multi-hop and from text-only to table-only to hybrid, that require various reasoning capabilities, and show that ReasonBert achieves remarkable improvement over an array of strong baselines. Few-shot experiments further demonstrate that our pre-training method substantially improves sample efficiency.