Information theory unifies atomistic machine learning, uncertainty quantification, and materials thermodynamics

  • 2024-04-18 18:50:15
  • Daniel Schwalbe-Koda, Sebastien Hamel, Babak Sadigh, Fei Zhou, Vincenzo Lordi

Abstract

An accurate description of information is relevant for a range of problems in atomistic modeling, such as sampling methods, detecting rare events, analyzing datasets, or performing uncertainty quantification (UQ) in machine learning (ML)-driven simulations. Although individual methods have been proposed for each of these tasks, they lack a common theoretical background integrating their solutions. Here, we introduce an information theoretical framework that unifies predictions of phase transformations, kinetic events, dataset optimality, and model-free UQ from atomistic simulations, thus bridging materials modeling, ML, and statistical mechanics. We first demonstrate that, for a proposed representation, the information entropy of a distribution of atom-centered environments is a surrogate value for thermodynamic entropy. Using molecular dynamics (MD) simulations, we show that information entropy differences from trajectories can be used to build phase diagrams, identify rare events, and recover classical theories of nucleation. Building on these results, we use this general concept of entropy to quantify information in datasets for ML interatomic potentials (IPs), informing compression, explaining trends in testing errors, and evaluating the efficiency of active learning strategies. Finally, we propose a model-free UQ method for MLIPs using information entropy, showing it reliably detects extrapolation regimes, scales to millions of atoms, and goes beyond model errors. This method is made available as the package QUESTS: Quick Uncertainty and Entropy via STructural Similarity, providing a new unifying theory for data-driven atomistic modeling and combining efforts in ML, first-principles thermodynamics, and simulations.

 
