SOTAVerified

ProofWala: Multilingual Proof Data Synthesis and Theorem-Proving

2025-02-07 · Code Available

Amitayush Thakur, George Tsoukalas, Greg Durrett, Swarat Chaudhuri


Abstract

Neural networks have shown substantial promise at automatic theorem-proving in interactive proof assistants (ITPs) like Lean and Coq. However, most neural theorem-proving models are restricted to a specific ITP, missing out on opportunities for cross-lingual transfer between ITPs. We address this weakness with a multilingual proof framework, ProofWala, that allows a standardized form of interaction between neural theorem-provers and two established ITPs (Coq and Lean). It enables the collection of multilingual proof step data (data recording the result of proof actions on ITP states) for training neural provers. ProofWala allows the systematic evaluation of a model's performance across different ITPs and problem domains via efficient parallel proof search algorithms. We show that the multilingual training enabled by ProofWala can lead to successful transfer across ITPs. Specifically, a model trained on a mix of ProofWala-generated Coq and Lean data outperforms Lean-only and Coq-only models on the standard prove-at-k metric. We open-source all code, including the ProofWala framework (https://github.com/trishullab/proof-wala) and the multilingual ITP interaction framework (https://github.com/trishullab/itp-interface).
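The prove-at-k metric mentioned in the abstract can be illustrated with a minimal sketch. This is not ProofWala's actual API: the function name, the `attempt_results` structure, and the example theorem names are all illustrative assumptions. The idea is standard: a theorem counts as proved if any of its first k independent proof attempts succeeds, and prove-at-k is the fraction of theorems proved under that rule.

```python
# Hedged sketch of a prove-at-k aggregation (illustrative names,
# not ProofWala's real interface).

def prove_at_k(attempt_results: dict[str, list[bool]], k: int) -> float:
    """attempt_results maps a theorem name to its per-attempt outcomes
    (True = proof found). A theorem is counted as proved if any of its
    first k attempts succeeds."""
    if not attempt_results:
        return 0.0
    proved = sum(any(outcomes[:k]) for outcomes in attempt_results.values())
    return proved / len(attempt_results)

# Hypothetical results for two theorems, three attempts each:
results = {
    "nat.add_comm": [False, True, False],   # proved on the 2nd attempt
    "list.rev_rev": [False, False, False],  # never proved
}
print(prove_at_k(results, k=1))  # → 0.0 (no theorem proved on attempt 1)
print(prove_at_k(results, k=3))  # → 0.5 (one of two theorems proved)
```

Under this reading, comparing models at the same k isolates the effect of training data (e.g. mixed Coq+Lean vs. single-ITP) from the search budget.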
