EGraFFBench: Evaluation of Equivariant Graph Neural Network Force Fields for Atomistic Simulations

2023-10-03

Vaibhav Bihani, Utkarsh Pratiush, Sajid Mannan, Tao Du, Zhimin Chen, Santiago Miret, Matthieu Micoulaut, Morten M Smedskjaer, Sayan Ranu, N M Anoop Krishnan

Abstract

Equivariant graph neural network force fields (EGraFFs) have shown great promise in modelling complex interactions in atomic systems by exploiting the graphs' inherent symmetries. Recent works have led to a surge in the development of novel architectures that incorporate equivariance-based inductive biases alongside architectural innovations like graph transformers and message passing to model atomic interactions. However, a thorough evaluation of deploying these EGraFFs for the downstream task of real-world atomistic simulations is lacking. To this end, we perform a systematic benchmarking of six EGraFF algorithms (NequIP, Allegro, BOTNet, MACE, Equiformer, TorchMDNet), with the aim of understanding their capabilities and limitations for realistic atomistic simulations. In addition to our thorough evaluation and analysis on eight existing datasets based on the benchmarking literature, we release two new benchmark datasets and propose four new metrics and three challenging tasks. The new datasets and tasks evaluate the performance of EGraFFs on out-of-distribution data, in terms of different crystal structures, temperatures, and new molecules. Interestingly, evaluation of the EGraFF models based on dynamic simulations reveals that a lower error on energy or force does not guarantee a stable or reliable simulation or faithful replication of the atomic structures. Moreover, we find that no model clearly outperforms the others on all datasets and tasks. Importantly, we show that the performance of all the models on out-of-distribution datasets is unreliable, pointing to the need for the development of a foundation model for force fields that can be used in real-world simulations. In summary, this work establishes a rigorous framework for evaluating machine learning force fields in the context of atomic simulations and points to open research challenges within this domain.

Benchmark Results

Dataset         Model    Metric  Claimed   Verified  Status
3BPA            NequIP   MAE     3.15      -         Unverified
3BPA            MACE     MAE     4         -         Unverified
3BPA            Allegro  MAE     4.13      -         Unverified
3BPA            BOTNet   MAE     5         -         Unverified
Acetylacetone   Allegro  MAE     0.92      -         Unverified
Acetylacetone   NequIP   MAE     1.38      -         Unverified
Acetylacetone   MACE     MAE     2         -         Unverified
Acetylacetone   BOTNet   MAE     2         -         Unverified
Aspirin         Allegro  MAE     14.36     -         Unverified
Aspirin         MACE     MAE     13.79     -         Unverified
Aspirin         BOTNet   MAE     12.63     -         Unverified
Aspirin         NequIP   MAE     9.27      -         Unverified
Ethanol         MACE     MAE     209.96    -         Unverified
Ethanol         BOTNet   MAE     203.83    -         Unverified
Ethanol         Allegro  MAE     6.94      -         Unverified
Ethanol         NequIP   MAE     4.99      -         Unverified
GeTe            BOTNet   MAE     3,034     -         Unverified
GeTe            MACE     MAE     2,670     -         Unverified
GeTe            NequIP   MAE     1,780.95  -         Unverified
GeTe            Allegro  MAE     1,009.4   -         Unverified
LiPS            NequIP   MAE     165.43    -         Unverified
LiPS            Allegro  MAE     31.75     -         Unverified
LiPS            MACE     MAE     30        -         Unverified
LiPS            BOTNet   MAE     28        -         Unverified
LiPS20          BOTNet   MAE     24.59     -         Unverified
LiPS20          NequIP   MAE     26.8      -         Unverified
LiPS20          Allegro  MAE     33.17     -         Unverified
LiPS20          MACE     MAE     14.05     -         Unverified
Naphthalene     BOTNet   MAE     182.55    -         Unverified
Naphthalene     NequIP   MAE     2.66      -         Unverified
Naphthalene     Allegro  MAE     5.82      -         Unverified
Naphthalene     MACE     MAE     161.74    -         Unverified
Salicylic Acid  BOTNet   MAE     153.06    -         Unverified
Salicylic Acid  MACE     MAE     165.29    -         Unverified
Salicylic Acid  Allegro  MAE     8.59      -         Unverified
Salicylic Acid  NequIP   MAE     6.29      -         Unverified
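
The metric reported above is MAE (mean absolute error). The listing does not state whether the values refer to energies or forces, or in which units, so the following is only a minimal sketch of the generic definition. The function name and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute error between two equal-length sequences of values."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("inputs must be non-empty")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical force components (arbitrary units), flattened over atoms and x/y/z.
predicted_forces = [0.5, -1.0, 2.0, 0.0]
reference_forces = [1.0, -1.0, 1.0, 0.5]

mae = mean_absolute_error(predicted_forces, reference_forces)
print(mae)  # 0.5
```

As the abstract notes, a low value of this kind of error does not by itself guarantee a stable or reliable simulation, so it is best read alongside simulation-based metrics.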
