SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9101-9125 of 474,278 papers

Title | Status | Hype
GlotEval: A Test Suite for Massively Multilingual Evaluation of Large Language Models | | 0
Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting | | 0
Uncertainty-Aware Remaining Lifespan Prediction from Images | | 0
Sotopia-RL: Reward Design for Social Intelligence | | 0
Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction | | 0
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer | | 0
Aligning Large Language Models via Fully Self-Synthetic Data | Code | 0
DreamOmni2: Multimodal Instruction-based Editing and Generation | | 0
Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management | | 0
MeXtract: Light-Weight Metadata Extraction from Scientific Papers | | 0
EDUMATH: Generating Standards-aligned Educational Math Word Problems | | 0
Validation of Various Normalization Methods for Brain Tumor Segmentation: Can Federated Learning Overcome This Heterogeneity? | Code | 0
WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation | | 0
MATRIX: Mask Track Alignment for Interaction-aware Video Generation | | 0
Can Vision Language Models Infer Human Gaze Direction? A Controlled Study | | 0
D2RA: Dual Domain Regeneration Attack | Code | 0
PickStyle: Video-to-Video Style Transfer with Context-Style Adapters | | 0
TGM: a Modular and Efficient Library for Machine Learning on Temporal Graphs | Code | 0
TalkPlayData 2: An Agentic Synthetic Data Pipeline for Multimodal Conversational Music Recommendation | | 0
AISysRev -- LLM-based Tool for Title-abstract Screening | Code | 0
BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions | | 0
HBSplat: Robust Sparse-View Gaussian Reconstruction with Hybrid-Loss Guided Depth and Bidirectional Warping | Code | 0
VGGT-X: When VGGT Meets Dense Novel View Synthesis | | 0
Scalable In-context Ranking with Generative Models | | 0
X-Teaming Evolutionary M2S: Automated Discovery of Multi-turn to Single-turn Jailbreak Templates | Code | 0