SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3376-3400 of 661570 papers

Title | Status | Hype
A Practical Probabilistic Benchmark for AI Weather Models | Code | 3
emg2pose: A Large and Diverse Benchmark for Surface Electromyographic Hand Pose Estimation | Code | 3
Tina: Tiny Reasoning Models via LoRA | Code | 3
UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility | Code | 3
Improved Denoising Diffusion Probabilistic Models | Code | 3
Pareto Front Approximation for Multi-Objective Session-Based Recommender Systems | Code | 3
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving | Code | 3
Stonefish: Supporting Machine Learning Research in Marine Robotics | Code | 3
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up | Code | 3
Slamming: Training a Speech Language Model on One GPU in a Day | Code | 3
AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay | Code | 3
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs | Code | 3
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction | Code | 3
CrossOver: 3D Scene Cross-Modal Alignment | Code | 3
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble | Code | 3
BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction | Code | 3
GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving | Code | 3
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering | Code | 3
Falcon: A Remote Sensing Vision-Language Foundation Model | Code | 3
A Survey on Latent Reasoning | Code | 3
Vision-Speech Models: Teaching Speech Models to Converse about Images | Code | 3
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency | Code | 3
Vision-to-Music Generation: A Survey | Code | 3
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond | Code | 3
AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents | Code | 3