Abstract
Machine learning interatomic potentials (MLIPs) have become increasingly effective at approximating quantum mechanical calculations at a fraction of the computational cost. However, lower errors on held-out test sets do not always translate to improved results on downstream physical property prediction tasks. In this paper, we propose testing MLIPs on their practical ability to conserve energy during molecular dynamics simulations. For models that pass this test, test errors correlate more strongly with performance on physical property prediction tasks. We identify choices that may cause models to fail this test, and use these observations to improve upon highly expressive models. The resulting model, eSEN, provides state-of-the-art results on a range of physical property prediction tasks, including materials stability prediction, thermal conductivity prediction, and phonon calculations.