SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6526–6550 of 474278 papers

Title | Status | Hype
Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning | Code | 0
A Large Scale Benchmark for Test Time Adaptation Methods in Medical Image Segmentation | Code | 0
Towards Unification of Hallucination Detection and Fact Verification for Large Language Models | Code | 0
Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning: A Cognitive Science Perspective | Code | 0
SeeNav-Agent: Enhancing Vision-Language Navigation with Visual Prompt and Step-Level Policy Optimization | - | 0
HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild | - | 0
Multilingual Pretraining for Pixel Language Models | - | 0
Evaluating LLMs on Sequential API Call Through Automated Test Generation | - | 0
Hyperdimensional Probe: Decoding LLM Representations via Vector Symbolic Architectures | - | 0
AutoSurvey2: Empowering Researchers with Next Level Automated Literature Surveys | Code | 0
TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models | - | 0
OpenREAD: Reinforced Open-Ended Reasoning for End-to-End Autonomous Driving with LLM-as-Critic | - | 0
From Atomic to Composite: Reinforcement Learning Enables Generalization in Complementary Reasoning | - | 0
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning | - | 0
YingVideo-MV: Music-Driven Multi-Stage Video Generation | - | 0
IACT: A Self-Organizing Recursive Model for General AI Agents: A Technical White Paper on the Architecture Behind kragent.ai | - | 0
Hear What Matters! Text-conditioned Selective Video-to-Audio Generation | - | 0
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning | - | 0
Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules | - | 0
In-Context Sync-LoRA for Portrait Video Editing | - | 0
ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation | - | 0
MultiShotMaster: A Controllable Multi-Shot Video Generation Framework | - | 0
MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues | - | 0
Astra: A Multi-Agent System for GPU Kernel Performance Optimization | Code | 0
Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos | - | 0
Page 262 of 18972