SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3251-3275 of 177,340 papers

Title | Status | Hype
AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications | Code | 3
On-Demand Earth System Data Cubes | Code | 3
Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation | Code | 3
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild | Code | 3
DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents | Code | 3
RVT-2: Learning Precise Manipulation from Few Demonstrations | Code | 3
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation | Code | 3
Adam-mini: Use Fewer Learning Rates To Gain More | Code | 3
Point-SAM: Promptable 3D Segmentation Model for Point Clouds | Code | 3
Retrieval-augmented generation in multilingual settings | Code | 3
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective | Code | 3
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights | Code | 3
LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models | Code | 3
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases | Code | 3
Learning Dynamics of LLM Finetuning | Code | 3
Reinforcement Learning Meets Visual Odometry | Code | 3
Comgra: A Tool for Analyzing and Debugging Neural Networks | Code | 3
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models | Code | 3
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents | Code | 3
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners | Code | 3
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters | Code | 3
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching | Code | 3
SpatialBot: Precise Spatial Understanding with Vision Language Models | Code | 3
Colorful Diffuse Intrinsic Image Decomposition in the Wild | Code | 3
Generative Modeling of Molecular Dynamics Trajectories | Code | 3
Page 131 of 7094