Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. However, a public benchmark with appropriate performance metrics for such 'likelihood-free' algorithms has been lacking. This has made it difficult to compare algorithms and identify their strengths and weaknesses. We set out to fill this gap: We provide a benchmark with inference tasks and suitable performance metrics, with an initial selection of algorithms including recent approaches employing neural networks and classical Approximate Bayesian Computation methods. We found that the choice of performance metric is critical, that even state-of-the-art algorithms have substantial room for improvement, and that sequential estimation improves sample efficiency. Neural network-based approaches generally exhibit better performance, but there is no uniformly best algorithm. We provide practical advice and highlight the potential of the benchmark to diagnose problems and improve algorithms. The results can be explored interactively on a companion website. All code is open source, making it possible to contribute further benchmark tasks and inference algorithms.