SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6926-6950 of 474,278 papers

Title | Status | Hype
Adaptive Length Image Tokenization via Recurrent Allocation | Code | 2
Combining Induction and Transduction for Abstract Reasoning | Code | 2
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector | Code | 2
Real-Time Polygonal Semantic Mapping for Humanoid Robot Stair Climbing | Code | 2
CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments | Code | 2
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Code | 2
Attacking Vision-Language Computer Agents via Pop-ups | Code | 2
Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation | Code | 2
Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis | Code | 2
Training on test proteins improves fitness, structure, and function prediction | Code | 2
INQUIRE: A Natural World Text-to-Image Retrieval Benchmark | Code | 2
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Code | 2
RAGViz: Diagnose and Visualize Retrieval-Augmented Generation | Code | 2
Mapping Global Floods with 10 Years of Satellite Radar Data | Code | 2
GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation | Code | 2
Unlocking the Archives: Using Large Language Models to Transcribe Handwritten Historical Documents | Code | 2
X-Drive: Cross-modality consistent multi-sensor data synthesis for driving scenarios | Code | 2
On Deep Learning for Geometric and Semantic Scene Understanding Using On-Vehicle 3D LiDAR | Code | 2
A Survey of Financial AI: Architectures, Advances and Open Challenges | Code | 2
Communication Learning in Multi-Agent Systems from Graph Modeling Perspective | Code | 2
Toward Automated Algorithm Design: A Survey and Practical Guide to Meta-Black-Box-Optimization | Code | 2
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models | Code | 2
Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs | Code | 2
LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators | Code | 2
ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images | Code | 2
Page 278 of 18,972