EGraFFBench: Evaluation of Equivariant Graph Neural Network Force Fields for Atomistic Simulations

2023-10-03

Vaibhav Bihani, Utkarsh Pratiush, Sajid Mannan, Tao Du, Zhimin Chen, Santiago Miret, Matthieu Micoulaut, Morten M Smedskjaer, Sayan Ranu, N M Anoop Krishnan

Abstract

Equivariant graph neural network force fields (EGraFFs) have shown great promise in modelling complex interactions in atomic systems by exploiting the graphs' inherent symmetries. Recent works have led to a surge in the development of novel architectures that incorporate equivariance-based inductive biases alongside architectural innovations like graph transformers and message passing to model atomic interactions. However, a thorough evaluation of deploying these EGraFFs for the downstream task of real-world atomistic simulations is lacking. To this end, here we perform a systematic benchmarking of 6 EGraFF algorithms (NequIP, Allegro, BOTNet, MACE, Equiformer, TorchMDNet), with the aim of understanding their capabilities and limitations for realistic atomistic simulations. In addition to our thorough evaluation and analysis on eight existing datasets based on the benchmarking literature, we release two new benchmark datasets and propose four new metrics and three challenging tasks. The new datasets and tasks evaluate the performance of EGraFFs on out-of-distribution data, in terms of different crystal structures, temperatures, and new molecules. Interestingly, evaluation of the EGraFF models based on dynamic simulations reveals that having a lower error on energy or force does not guarantee stable or reliable simulation or faithful replication of the atomic structures. Moreover, we find that no model clearly outperforms other models on all datasets and tasks. Importantly, we show that the performance of all the models on out-of-distribution datasets is unreliable, pointing to the need for the development of a foundation model for force fields that can be used in real-world simulations. In summary, this work establishes a rigorous framework for evaluating machine learning force fields in the context of atomic simulations and points to open research challenges within this domain.

Tasks

Atomic Forces, Benchmarking, Formation Energy, Graph Neural Network

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
3BPA	NequIP	MAE	3.15	—	Unverified
3BPA	MACE	MAE	4	—	Unverified
3BPA	Allegro	MAE	4.13	—	Unverified
3BPA	BOTNet	MAE	5	—	Unverified
Acetylacetone	Allegro	MAE	0.92	—	Unverified
Acetylacetone	NequIP	MAE	1.38	—	Unverified
Acetylacetone	MACE	MAE	2	—	Unverified
Acetylacetone	BOTNet	MAE	2	—	Unverified
Aspirin	Allegro	MAE	14.36	—	Unverified
Aspirin	MACE	MAE	13.79	—	Unverified
Aspirin	BOTNet	MAE	12.63	—	Unverified
Aspirin	NequIP	MAE	9.27	—	Unverified
Ethanol	MACE	MAE	209.96	—	Unverified
Ethanol	BOTNet	MAE	203.83	—	Unverified
Ethanol	Allegro	MAE	6.94	—	Unverified
Ethanol	NequIP	MAE	4.99	—	Unverified
GeTe	BOTNet	MAE	3,034	—	Unverified
GeTe	MACE	MAE	2,670	—	Unverified
GeTe	NequIP	MAE	1,780.95	—	Unverified
GeTe	Allegro	MAE	1,009.4	—	Unverified
LiPS	NequIP	MAE	165.43	—	Unverified
LiPS	Allegro	MAE	31.75	—	Unverified
LiPS	MACE	MAE	30	—	Unverified
LiPS	BOTNet	MAE	28	—	Unverified
LiPS20	BOTNet	MAE	24.59	—	Unverified
LiPS20	NequIP	MAE	26.8	—	Unverified
LiPS20	Allegro	MAE	33.17	—	Unverified
LiPS20	MACE	MAE	14.05	—	Unverified
Naphthalene	BOTNet	MAE	182.55	—	Unverified
Naphthalene	NequIP	MAE	2.66	—	Unverified
Naphthalene	Allegro	MAE	5.82	—	Unverified
Naphthalene	MACE	MAE	161.74	—	Unverified
Salicylic Acid	BOTNet	MAE	153.06	—	Unverified
Salicylic Acid	MACE	MAE	165.29	—	Unverified
Salicylic Acid	Allegro	MAE	8.59	—	Unverified
Salicylic Acid	NequIP	MAE	6.29	—	Unverified
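
The abstract's remark that no model clearly outperforms the others can be checked against the claimed values in the table with a short script. The following is a minimal sketch, not part of the paper: the helper names `best_per_dataset` and `win_counts` are hypothetical, the rows are copied from the "Claimed" column, and it assumes that a lower MAE is better. The table does not state units or whether the MAE refers to energy or force, so the numbers are only compared within a dataset.

```python
from collections import Counter

# (dataset, model, claimed MAE) copied from the table above.
# Units are not given in the source, so values are only compared per dataset.
ROWS = [
    ("3BPA", "NequIP", 3.15), ("3BPA", "MACE", 4.0),
    ("3BPA", "Allegro", 4.13), ("3BPA", "BOTNet", 5.0),
    ("Acetylacetone", "Allegro", 0.92), ("Acetylacetone", "NequIP", 1.38),
    ("Acetylacetone", "MACE", 2.0), ("Acetylacetone", "BOTNet", 2.0),
    ("Aspirin", "Allegro", 14.36), ("Aspirin", "MACE", 13.79),
    ("Aspirin", "BOTNet", 12.63), ("Aspirin", "NequIP", 9.27),
    ("Ethanol", "MACE", 209.96), ("Ethanol", "BOTNet", 203.83),
    ("Ethanol", "Allegro", 6.94), ("Ethanol", "NequIP", 4.99),
    ("GeTe", "BOTNet", 3034.0), ("GeTe", "MACE", 2670.0),
    ("GeTe", "NequIP", 1780.95), ("GeTe", "Allegro", 1009.4),
    ("LiPS", "NequIP", 165.43), ("LiPS", "Allegro", 31.75),
    ("LiPS", "MACE", 30.0), ("LiPS", "BOTNet", 28.0),
    ("LiPS20", "BOTNet", 24.59), ("LiPS20", "NequIP", 26.8),
    ("LiPS20", "Allegro", 33.17), ("LiPS20", "MACE", 14.05),
    ("Naphthalene", "BOTNet", 182.55), ("Naphthalene", "NequIP", 2.66),
    ("Naphthalene", "Allegro", 5.82), ("Naphthalene", "MACE", 161.74),
    ("Salicylic Acid", "BOTNet", 153.06), ("Salicylic Acid", "MACE", 165.29),
    ("Salicylic Acid", "Allegro", 8.59), ("Salicylic Acid", "NequIP", 6.29),
]


def best_per_dataset(rows):
    """Return {dataset: (model, mae)} with the lowest claimed MAE per dataset."""
    best = {}
    for dataset, model, mae in rows:
        # Assumption: lower MAE is better.
        if dataset not in best or mae < best[dataset][1]:
            best[dataset] = (model, mae)
    return best


def win_counts(best):
    """Count how many datasets each model wins."""
    return dict(Counter(model for model, _ in best.values()))


best = best_per_dataset(ROWS)
wins = win_counts(best)

for dataset, (model, mae) in sorted(best.items()):
    print(f"{dataset}: {model} ({mae})")
print(wins)
```

On the claimed values, each of the four listed models has the lowest MAE on at least one dataset, which is consistent with the abstract's statement. The table covers only four of the six benchmarked models, and a low MAE alone says nothing about simulation stability.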
