SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3301-3325 of 661,570 papers

Title | Status | Hype
Automatically Interpreting Millions of Features in Large Language Models | Code | 3
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks | Code | 3
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache | Code | 3
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents | Code | 3
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents | Code | 3
HAC++: Towards 100X Compression of 3D Gaussian Splatting | Code | 3
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | Code | 3
Deep Reasoning Translation via Reinforcement Learning | Code | 3
Segment Anything in 3D with Radiance Fields | Code | 3
Consistency Flow Matching: Defining Straight Flows with Velocity Consistency | Code | 3
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data | Code | 3
Deep Learning-Based Object Pose Estimation: A Comprehensive Survey | Code | 3
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | Code | 3
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | Code | 3
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction | Code | 3
PE3R: Perception-Efficient 3D Reconstruction | Code | 3
The Mighty ToRR: A Benchmark for Table Reasoning and Robustness | Code | 3
Baichuan-Omni Technical Report | Code | 3
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments | Code | 3
RLVR-World: Training World Models with Reinforcement Learning | Code | 3
Tool Learning with Large Language Models: A Survey | Code | 3
DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing | Code | 3
Step-level Value Preference Optimization for Mathematical Reasoning | Code | 3
Middle Architecture Criteria | Code | 3
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones | Code | 3
Page 133 of 26463