Natural language inference (NLI) is one of the central tasks in natural language processing (NLP) and encapsulates many fundamental aspects of language understanding. With the considerable achievements of data-hungry deep learning methods in NLP tasks, a great amount of effort has been devoted to developing more diverse datasets for different languages. In this paper, we present a new dataset for the NLI task in the Persian language, also known as Farsi, which is one of the dominant languages in the Middle East. This dataset, named FarsTail, includes 10,367 samples, which are provided both in the Persian language and in an indexed format to be useful for non-Persian researchers. The samples are generated from 3,539 multiple-choice questions with minimal annotator intervention, in a way similar to the SciTail dataset. A carefully designed multi-step process is adopted to ensure the quality of the dataset. We also present the results of traditional and state-of-the-art methods on FarsTail, including different embedding methods such as word2vec, fastText, ELMo, BERT, and LASER, as well as different modeling approaches such as DecompAtt, ESIM, HBMP, and ULMFiT, to provide a solid baseline for future research. The best test accuracy obtained is 83.38%, which shows that there is considerable room for improving current methods before they are useful for real-world NLP applications in different languages. We also investigate the extent to which the models exploit superficial clues, also known as dataset biases, in FarsTail, and partition the test set into easy and hard subsets according to the success of biased models. The dataset is available at https://github.com/dml-qom/FarsTail.