SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2801-2850 of 177339 papers

Title | Status | Hype
DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detector | Code | 3
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents | Code | 3
Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents | Code | 3
MEMORYLLM: Towards Self-Updatable Large Language Models | Code | 3
BatchTopK Sparse Autoencoders | Code | 3
On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning | Code | 3
Large Language Models Are Human-Level Prompt Engineers | Code | 3
Zero-Shot Text-to-Image Generation | Code | 3
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | Code | 3
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning | Code | 3
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy | Code | 3
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric | Code | 3
Cross-Modal Causal Intervention for Medical Report Generation | Code | 3
Evaluating Large Language Models for Radiology Natural Language Processing | Code | 3
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement | Code | 3
Neuron-Level Sequential Editing for Large Language Models | Code | 3
The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas | Code | 3
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Code | 3
SALMONN: Towards Generic Hearing Abilities for Large Language Models | Code | 3
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model | Code | 3
PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization | Code | 3
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Code | 3
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Code | 3
TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving | Code | 3
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba | Code | 3
ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities | Code | 3
Accelerating Diffusion Transformers with Dual Feature Caching | Code | 3
Keypoint Promptable Re-Identification | Code | 3
Proteus: A Self-Designing Range Filter | Code | 3
SARATR-X: Toward Building A Foundation Model for SAR Target Recognition | Code | 3
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models | Code | 3
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models | Code | 3
Matbench Discovery -- A framework to evaluate machine learning crystal stability predictions | Code | 3
Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models | Code | 3
SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | Code | 3
Multimodal Foundation Models: From Specialists to General-Purpose Assistants | Code | 3
Aria-UI: Visual Grounding for GUI Instructions | Code | 3
Karatsuba Matrix Multiplication and its Efficient Custom Hardware Implementations | Code | 3
VRT: A Video Restoration Transformer | Code | 3
A Demonstration of Adaptive Collaboration of Large Language Models for Medical Decision-Making | Code | 3
TinyAgent: Function Calling at the Edge | Code | 3
Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models | Code | 3
Graph-Augmented Normalizing Flows for Anomaly Detection of Multiple Time Series | Code | 3
Towards An End-to-End Framework for Flow-Guided Video Inpainting | Code | 3
Sintel: A Machine Learning Framework to Extract Insights from Signals | Code | 3
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Code | 3
TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement | Code | 3
Playing Non-Embedded Card-Based Games with Reinforcement Learning | Code | 3
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation | Code | 3
DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models | Code | 3
Page 57 of 3547