SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8551–8575 of 474,278 papers

Title | Status | Hype
EMRRG: Efficient Fine-Tuning Pre-trained X-ray Mamba Networks for Radiology Report Generation | | 0
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning | Code | 0
ShiZhi: A Chinese Lightweight Large Language Model for Court View Generation | Code | 0
Connecting Domains and Contrasting Samples: A Ladder for Domain Generalization | Code | 0
Forgetting to Forget: Attention Sink as A Gateway for Backdooring LLM Unlearning | Code | 0
Region in Context: Text-condition Image editing with Human-like semantic reasoning | Code | 0
Efficient Large Language Model Inference with Neural Block Linearization | Code | 0
Black-box Optimization of LLM Outputs by Asking for Directions | Code | 0
Promptable Fire Segmentation: Unleashing SAM2's Potential for Real-Time Mobile Deployment with Strategic Bounding Box Guidance | Code | 0
MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes | | 0
Humanoid-inspired Causal Representation Learning for Domain Generalization | Code | 0
MIRAD - A comprehensive real-world robust anomaly detection dataset for Mass Individualization | Code | 0
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation | | 0
SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models | | 0
What Questions Should Robots Be Able to Answer? A Dataset of User Questions for Explainable Robotics | | 0
MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models | | 0
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers | | 0
Beyond One World: Benchmarking Super Heros in Role-Playing Across Multiversal Contexts | | 0
JND-Guided Light-Weight Neural Pre-Filter for Perceptual Image Coding | Code | 0
Geometric-Mean Policy Optimization | Code | 0
Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search | Code | 0
VIPAMIN: Visual Prompt Initialization via Embedding Selection and Subspace Expansion | Code | 0
LightGlueStick: a Fast and Robust Glue for Joint Point-Line Matching | Code | 0
RefAtomNet++: Advancing Referring Atomic Video Action Recognition using Semantic Retrieval based Multi-Trajectory Mamba | Code | 0
VisionSelector: End-to-End Learnable Visual Token Compression for Efficient Multimodal LLMs | Code | 0