SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 2301–2350 of 659,983 papers

Title | Status | Hype
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | Code | 4
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding | Code | 4
Nomic Embed: Training a Reproducible Long Context Text Embedder | Code | 4
AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora | Code | 4
A Survey on Large Language Model based Autonomous Agents | Code | 4
Discovering faster matrix multiplication algorithms with reinforcement learning | Code | 4
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Code | 4
TotalSegmentator: robust segmentation of 104 anatomical structures in CT images | Code | 4
Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks | Code | 4
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model | Code | 4
Benchmarking Neural Network Training Algorithms | Code | 4
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis | Code | 4
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications | Code | 4
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Code | 4
Baichuan 2: Open Large-scale Language Models | Code | 4
SEED-Story: Multimodal Long Story Generation with Large Language Model | Code | 4
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents | Code | 4
Otter: A Multi-Modal Model with In-Context Instruction Tuning | Code | 4
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation | Code | 4
Safurai 001: New Qualitative Approach for Code LLM Evaluation | Code | 4
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models | Code | 4
RePaint: Inpainting using Denoising Diffusion Probabilistic Models | Code | 4
A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL | Code | 4
MTEB: Massive Text Embedding Benchmark | Code | 4
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning | Code | 4
Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective | Code | 4
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data | Code | 4
FinBen: A Holistic Financial Benchmark for Large Language Models | Code | 4
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Code | 4
Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering | - | 3
Self-Distillation Enables Continual Learning | - | 3
Yunjue Agent Tech Report: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks | - | 3
LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans | - | 3
CL-bench: A Benchmark for Context Learning | - | 3
LLM-in-Sandbox Elicits General Agentic Intelligence | - | 3
tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction | - | 3
SceneSmith: Agentic Generation of Simulation-Ready Indoor Scenes | - | 3
GEM: A Gym for Agentic LLMs | - | 3
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing | - | 3
A Survey of Token Compression for Efficient Multimodal Large Language Models | - | 3
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory | - | 3
HY3D-Bench: Generation of 3D Assets | - | 3
Deep Delta Learning | - | 3
HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing | - | 3
AI Can Learn Scientific Taste | - | 3
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction | - | 3
ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion | - | 3
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion | - | 3
A Survey of Data Agents: Emerging Paradigm or Overstated Hype? | - | 3
Generative Refocusing: Flexible Defocus Control from a Single Image | - | 3
Page 47 of 13200