SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6876-6900 of 474,278 papers

Title | Status | Hype
V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion | Code | 2
Retrieval Augmented Time Series Forecasting | Code | 2
GTA: Global Tracklet Association for Multi-Object Tracking in Sports | Code | 2
Tucano: Advancing Neural Text Generation for Portuguese | Code | 2
DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection | Code | 2
Large Language Models Can Self-Improve in Long-context Reasoning | Code | 2
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings | Code | 2
RedCode: Risky Code Execution and Generation Benchmark for Code Agents | Code | 2
TIPO: Text to Image with Text Presampling for Prompt Optimization | Code | 2
ScaleKD: Strong Vision Transformers Could Be Excellent Teachers | Code | 2
StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification | Code | 2
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis | Code | 2
The Super Weight in Large Language Models | Code | 2
AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and state space models | Code | 2
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents | Code | 2
InvisMark: Invisible and Robust Watermarking for AI-generated Image Provenance | Code | 2
Graph Neural Network Surrogates to leverage Mechanistic Expert Knowledge towards Reliable and Immediate Pandemic Response | Code | 2
Reaction-conditioned De Novo Enzyme Design with GENzyme | Code | 2
GFT: Graph Foundation Model with Transferable Tree Vocabulary | Code | 2
Community Research Earth Digital Intelligence Twin (CREDIT) | Code | 2
Concept Bottleneck Language Models For protein design | Code | 2
Reliable-loc: Robust sequential LiDAR global localization in large-scale street scenes based on verifiable cues | Code | 2
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models | Code | 2
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks | Code | 2
DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions | Code | 2