SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 10576-10600 of 177340 papers

Title | Status | Hype
PEDANTS: Cheap but Effective and Interpretable Answer Equivalence | Code | 2
SchNetPack 2.0: A neural network toolbox for atomistic machine learning | Code | 2
Closed-Form Factorization of Latent Semantics in GANs | Code | 2
Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization | Code | 2
OptiChat: Bridging Optimization Models and Practitioners with Large Language Models | Code | 2
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion | Code | 2
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning | Code | 2
ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling | Code | 2
Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model | Code | 2
TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning | Code | 2
Controllable 3D Outdoor Scene Generation via Scene Graphs | Code | 2
DDSP: Differentiable Digital Signal Processing | Code | 2
Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms | Code | 2
Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion | Code | 2
RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets | Code | 2
Reevaluating Adversarial Examples in Natural Language | Code | 2
CTR-Driven Advertising Image Generation with Multimodal Large Language Models | Code | 2
Learning Few-Step Diffusion Models by Trajectory Distribution Matching | Code | 2
T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models | Code | 2
RM-R1: Reward Modeling as Reasoning | Code | 2
OBELiX: A Curated Dataset of Crystal Structures and Experimentally Measured Ionic Conductivities for Lithium Solid-State Electrolytes | Code | 2
pyKT: A Python Library to Benchmark Deep Learning based Knowledge Tracing Models | Code | 2
Lemur: Harmonizing Natural Language and Code for Language Agents | Code | 2
FewJoint: A Few-shot Learning Benchmark for Joint Language Understanding | Code | 2
ForesightNav: Learning Scene Imagination for Efficient Exploration | Code | 2