SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 401-425 of 659,983 papers

Title | Status | Hype
PowerPM: Foundation Model for Power Systems | Code | 7
Segment Anything in Medical Images and Videos: Benchmark and Deployment | Code | 7
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | Code | 7
Global Structure-from-Motion Revisited | Code | 7
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer | Code | 7
Stable Audio Open | Code | 7
ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? | Code | 7
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains | Code | 7
Qwen2-Audio Technical Report | Code | 7
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions | Code | 7
MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Code | 7
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Code | 7
PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods | Code | 7
Agentless: Demystifying LLM-based Software Engineering Agents | Code | 7
ColPali: Efficient Document Retrieval with Vision Language Models | Code | 7
RouteLLM: Learning to Route LLMs with Preference Data | Code | 7
BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO | Code | 7
EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees | Code | 7
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation | Code | 7
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving | Code | 7
Grants4Companies: Applying Declarative Methods for Recommending and Reasoning About Business Grants in the Austrian Public Administration (System Description) | Code | 7
NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking | Code | 7
DataComp-LM: In search of the next generation of training sets for language models | Code | 7
Grounding Image Matching in 3D with MASt3R | Code | 7
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers | Code | 7