SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,984 papers · 248,105 code links · 4,818 tasks

Papers

Showing 1726-1750 of 659,984 papers

Title | Status | Hype
Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders | Code | 4
Autonomous LLM-driven research from data to human-verifiable research papers | Code | 4
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step | Code | 4
One-Shot Diffusion Mimicker for Handwritten Text Generation | Code | 4
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning | Code | 4
Mean Flows for One-step Generative Modeling | Code | 4
Tag2Text: Guiding Vision-Language Model via Image Tagging | Code | 4
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss | Code | 4
ImgEdit: A Unified Image Editing Dataset and Benchmark | Code | 4
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling | Code | 4
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Code | 4
Image Fusion via Vision-Language Model | Code | 4
Looking Backward: Streaming Video-to-Video Translation with Feature Banks | Code | 4
Restructuring Vector Quantization with the Rotation Trick | Code | 4
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents | Code | 4
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | Code | 4
JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation | Code | 4
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models | Code | 4
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey | Code | 4
OpenAgents: An Open Platform for Language Agents in the Wild | Code | 4
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis | Code | 4
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data | Code | 4
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | Code | 4
Factorio Learning Environment | Code | 4
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation | Code | 4