SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2401–2450 of 659,983 papers

| Title | Status | Hype |
| --- | --- | --- |
| ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking | | 3 |
| Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders | | 3 |
| ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion | | 3 |
| FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale | Code | 3 |
| PhysX: Physical-Grounded 3D Asset Generation | Code | 3 |
| Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI | Code | 3 |
| A Survey on Latent Reasoning | Code | 3 |
| Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving | Code | 3 |
| DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge | Code | 3 |
| RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents | Code | 3 |
| No time to train! Training-Free Reference-Based Instance Segmentation | Code | 3 |
| Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Code | 3 |
| L0: Reinforcement Learning to Become General Agents | Code | 3 |
| Epona: Autoregressive Diffusion World Model for Autonomous Driving | Code | 3 |
| Ovis-U1 Technical Report | Code | 3 |
| FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language | Code | 3 |
| The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas | Code | 3 |
| MMSearch-R1: Incentivizing LMMs to Search | Code | 3 |
| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Code | 3 |
| ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | Code | 3 |
| Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Code | 3 |
| Camera Calibration via Circular Patterns: A Comprehensive Framework with Measurement Uncertainty and Unbiased Projection Model | Code | 3 |
| TabArena: A Living Benchmark for Machine Learning on Tabular Data | Code | 3 |
| Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details | Code | 3 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Code | 3 |
| Vine Copulas as Differentiable Computational Graphs | Code | 3 |
| Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences | Code | 3 |
| Discrete Diffusion in Large Language and Multimodal Models: A Survey | Code | 3 |
| ANIRA: An Architecture for Neural Network Inference in Real-Time Audio Applications | Code | 3 |
| FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation | Code | 3 |
| A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications | Code | 3 |
| The Diffusion Duality | Code | 3 |
| Spurious Rewards: Rethinking Training Signals in RLVR | Code | 3 |
| AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation | Code | 3 |
| TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree | Code | 3 |
| JAFAR: Jack up Any Feature at Any Resolution | Code | 3 |
| Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Code | 3 |
| MagCache: Fast Video Generation with Magnitude-Aware Cache | Code | 3 |
| Highly Compressed Tokenizer Can Generate Without Training | Code | 3 |
| G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems | Code | 3 |
| Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval | Code | 3 |
| Real-Time Execution of Action Chunking Flow Policies | Code | 3 |
| Generalized Trajectory Scoring for End-to-end Multimodal Planning | Code | 3 |
| When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation | Code | 3 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3 |
| SupeRANSAC: One RANSAC to Rule Them All | Code | 3 |
| INP-Former++: Advancing Universal Anomaly Detection via Intrinsic Normal Prototypes and Residual Learning | Code | 3 |
| HtFLlib: A Comprehensive Heterogeneous Federated Learning Library and Benchmark | Code | 3 |
| A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning | Code | 3 |
| Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation | Code | 3 |