We propose a novel Siamese Natural Language Tracker (SNLT), which brings the advancements in visual tracking to the tracking by natural language (NL) descriptions task. The proposed SNLT is applicable to a wide range of Siamese trackers, providing a new class of baselines for the tracking by NL task and promising future improvements from the advancements of Siamese trackers. The carefully designed architecture of the Siamese Natural Language Region Proposal Network (SNL-RPN), together with the Dynamic Aggregation of vision and language modalities, is introduced to perform the tracking by NL task. Empirical results over tracking benchmarks with NL annotations show that the proposed SNLT improves Siamese trackers by 3 to 7 percentage points with a slight tradeoff in speed. The proposed SNLT outperforms all NL trackers to date and is competitive among state-of-the-art real-time trackers on LaSOT benchmarks while running at 50 frames per second on a single GPU.