SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 2576-2600 of 661570 papers

Title | Status | Hype
Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction | Code | 3
PixelFlow: Pixel-Space Generative Models with Flow | Code | 3
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Code | 3
Perception-R1: Pioneering Perception Policy with Reinforcement Learning | Code | 3
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning | Code | 3
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution | Code | 3
PromptHMR: Promptable Human Mesh Recovery | Code | 3
DDT: Decoupled Diffusion Transformer | Code | 3
GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Code | 3
SEA-LION: Southeast Asian Languages in One Network | Code | 3
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation | Code | 3
Playing Non-Embedded Card-Based Games with Reinforcement Learning | Code | 3
Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization | Code | 3
TrafficLLM: Enhancing Large Language Models for Network Traffic Analysis with Generic Traffic Representation | Code | 3
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving | Code | 3
Affordable AI Assistants with Knowledge Graph of Thoughts | Code | 3
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation | Code | 3
Scaling Analysis of Interleaved Speech-Text Language Models | Code | 3
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation | Code | 3
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning | Code | 3
End-to-End Driving with Online Trajectory Evaluation via BEV World Model | Code | 3
YourBench: Easy Custom Evaluation Sets for Everyone | Code | 3
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs | Code | 3
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction | Code | 3
Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB | Code | 3
Page 104 of 26463