Ranking LLM-Generated Loop Invariants for Program Verification

2023-10-13

Saikat Chakraborty, Shuvendu K. Lahiri, Sarah Fakhoury, Madanlal Musuvathi, Akash Lal, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma, Nikhil Swamy

Code

github.com/microsoft/NeuralInvariantRanker
Official, in paper, PyTorch, ★ 12

Abstract

Synthesizing inductive loop invariants is fundamental to automating program verification. In this work, we observe that Large Language Models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a 0-shot setting, yet require several samples to generate the correct invariants. This can lead to a large number of calls to a program verifier to establish an invariant. To address this issue, we propose a re-ranking approach for the generated results of LLMs. We have designed a ranker that can distinguish between correct inductive invariants and incorrect attempts based on the problem definition. The ranker is optimized as a contrastive ranker. Experimental results demonstrate that this re-ranking mechanism significantly improves the ranking of correct invariants among the generated candidates, leading to a notable reduction in the number of calls to a verifier. The source code and the experimental data for this paper are available at https://github.com/microsoft/NeuralInvariantRanker.
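The effect of re-ranking on verifier calls can be illustrated with a minimal sketch. It is not the paper's implementation: the embeddings are hand-written toy vectors, cosine similarity stands in for the learned ranker score, and `toy_verify` is a hypothetical stub in place of a real program verifier. All function and variable names are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rerank(problem_vec, candidates):
    """Sort (invariant_text, embedding) pairs by similarity to the problem,
    highest first. The similarity is a stand-in for a learned ranker score."""
    return sorted(candidates, key=lambda c: cosine(problem_vec, c[1]), reverse=True)

def calls_until_verified(ordered_invariants, verify):
    """Try candidates in order; return how many verifier calls were needed,
    or None if no candidate verifies."""
    for calls, invariant in enumerate(ordered_invariants, start=1):
        if verify(invariant):
            return calls
    return None

# Toy data (assumed): three LLM samples in generation order, with made-up embeddings.
problem_vec = [1.0, 0.0]
candidates = [
    ("x >= 0", [0.0, 1.0]),
    ("x <= n + 1", [0.5, 0.5]),
    ("x <= n", [0.9, 0.1]),
]

def toy_verify(invariant):
    """Hypothetical verifier stub: only one candidate is accepted."""
    return invariant == "x <= n"

original_order = [text for text, _ in candidates]
reranked_order = [text for text, _ in rerank(problem_vec, candidates)]

calls_before = calls_until_verified(original_order, toy_verify)
calls_after = calls_until_verified(reranked_order, toy_verify)
print(calls_before, calls_after)
```

In this toy setup the correct candidate moves from last to first position, so the number of verifier calls drops accordingly. With a real ranker the scores would come from a trained model rather than fixed vectors.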

Tasks

Re-Ranking
