SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 701-750 of 659,983 papers

| Title | Status | Hype |
| --- | --- | --- |
| FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration | Code | 5 |
| TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools | Code | 5 |
| Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models | Code | 5 |
| OS-Copilot: Towards Generalist Computer Agents with Self-Improvement | Code | 5 |
| Time-series attribution maps with regularized contrastive learning | Code | 5 |
| Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives | Code | 5 |
| Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs | Code | 5 |
| GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control | Code | 5 |
| MobileSAMv2: Faster Segment Anything to Everything | Code | 5 |
| Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer | Code | 5 |
| BlackJAX: Composable Bayesian inference in JAX | Code | 5 |
| CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | Code | 5 |
| Multimodal Autoregressive Pre-training of Large Vision Encoders | Code | 5 |
| Active Learning for Neural PDE Solvers | Code | 5 |
| Cosmos World Foundation Model Platform for Physical AI | Code | 5 |
| PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery | Code | 5 |
| Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding | Code | 5 |
| LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks | Code | 5 |
| Information Flow Routes: Automatically Interpreting Language Models at Scale | Code | 5 |
| Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Code | 5 |
| UniDepth: Universal Monocular Metric Depth Estimation | Code | 5 |
| Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Code | 5 |
| AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | Code | 5 |
| DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ | Code | 5 |
| Noisereduce: Domain General Noise Reduction for Time Series Signals | Code | 5 |
| Evaluating Real-World Robot Manipulation Policies in Simulation | Code | 5 |
| LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Code | 5 |
| Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | Code | 5 |
| ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models | Code | 5 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Code | 5 |
| Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | Code | 5 |
| Allegro: Open the Black Box of Commercial-Level Video Generation Model | Code | 5 |
| Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | Code | 5 |
| VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild | Code | 5 |
| XFeat: Accelerated Features for Lightweight Image Matching | Code | 5 |
| Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Code | 5 |
| ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment | Code | 5 |
| ShareGPT4Video: Improving Video Understanding and Generation with Better Captions | Code | 5 |
| Video Depth Anything: Consistent Depth Estimation for Super-Long Videos | Code | 5 |
| Fast Inference from Transformers via Speculative Decoding | Code | 5 |
| TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty | Code | 5 |
| Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine | Code | 5 |
| NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms | Code | 5 |
| OmniRe: Omni Urban Scene Reconstruction | Code | 5 |
| CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion | Code | 5 |
| QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models | Code | 5 |
| GenCast: Diffusion-based ensemble forecasting for medium-range weather | Code | 5 |
| Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes | Code | 5 |
| Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts | Code | 5 |
| How to Design Translation Prompts for ChatGPT: An Empirical Study | Code | 5 |