SOTAVerified

Auto Debugging

Papers

Showing 1-3 of 3 papers

Title | Status | Hype
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Code | 2
PaLM: Scaling Language Modeling with Pathways | Code | 2
M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PaLM 62B (few-shot, k=5) | Exact string match | 38.2 | | Unverified
2 | PaLM 540B (few-shot, k=5) | Exact string match | 38.2 | | Unverified
3 | PaLM 8B (few-shot, k=5) | Exact string match | 14.7 | | Unverified