SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 1-50 of 474,278 papers

| Title | Status | Hype |
|---|---|---|
| Explaining CLIP Zero-shot Predictions Through Concepts | | 0 |
| RAWIC: Bit-Depth Adaptive Lossless Raw Image Compression | | 0 |
| MedLoc-R1: Performance-Aware Curriculum Reward Scheduling for GRPO-Based Medical Visual Grounding | | 0 |
| A Closer Look at Cross-Domain Few-Shot Object Detection: Fine-Tuning Matters and Parallel Decoder Helps | | 0 |
| CPUBone: Efficient Vision Backbone Design for Devices with Low Parallelization Capabilities | | 0 |
| FlashSign: Pose-Free Guidance for Efficient Sign Language Video Generation | | 0 |
| ForestSim: A Synthetic Benchmark for Intelligent Vehicle Perception in Unstructured Forest Environments | | 0 |
| EnsemJudge: Enhancing Reliability in Chinese LLM-Generated Text Detection through Diverse Model Ensembles | | 0 |
| Hg-I2P: Bridging Modalities for Generalizable Image-to-Point-Cloud Registration via Heterogeneous Graphs | | 0 |
| Drift-AR: Single-Step Visual Autoregressive Generation via Anti-Symmetric Drifting | | 0 |
| InkDrop: Invisible Backdoor Attacks Against Dataset Condensation | | 0 |
| AutoDrive-P^3: Unified Chain of Perception-Prediction-Planning Thought via Reinforcement Fine-Tuning | | 0 |
| MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios | | 0 |
| Robust Remote Sensing Image-Text Retrieval with Noisy Correspondence | | 0 |
| TwinMixing: A Shuffle-Aware Feature Interaction Model for Multi-Task Segmentation | | 0 |
| Reasoning as Energy Minimization over Structured Latent Trajectories | | 0 |
| Prototype-Enhanced Multi-View Learning for Thyroid Nodule Ultrasound Classification | | 0 |
| FairGC: Fairness-aware Graph Condensation | | 0 |
| INSID3: Training-Free In-Context Segmentation with DINOv3 | | 0 |
| GEditBench v2: A Human-Aligned Benchmark for General Image Editing | | 0 |
| ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning | | 0 |
| TGIF2: Extended Text-Guided Inpainting Forgery Dataset & Benchmark | | 0 |
| AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding | | 0 |
| DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing | | 0 |
| ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining | | 0 |
| Rethinking Language Model Scaling under Transferable Hypersphere Optimization | | 0 |
| Adaptive Block-Scaled Data Types | | 0 |
| HandX: Scaling Bimanual Motion and Interaction Generation | | 0 |
| Gen-Searcher: Reinforcing Agentic Search for Image Generation | | 0 |
| NeiGAD: Augmenting Graph Anomaly Detection via Spectral Neighbor Information | | 0 |
| LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models | | 0 |
| Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization | | 0 |
| Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification | | 0 |
| GraphWalker: Agentic Knowledge Graph Question Answering via Synthetic Trajectory Curriculum | | 0 |
| ORSIFlow: Saliency-Guided Rectified Flow for Optical Remote Sensing Salient Object Detection | | 0 |
| ELViS: Efficient Visual Similarity from Local Descriptors that Generalizes Across Domains | | 0 |
| Industrial3D: A Terrestrial LiDAR Point Cloud Dataset and CrossParadigm Benchmark for Industrial Infrastructure | | 0 |
| WAFT-Stereo: Warping-Alone Field Transforms for Stereo Matching | | 0 |
| Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models | | 0 |
| CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence | | 0 |
| ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks | | 0 |
| RSR-core: A High-Performance Engine for Low-Bit Matrix-Vector Multiplication | | 0 |
| KV Cache Quantization for Self-Forcing Video Generation: A 33-Method Empirical Study | | 0 |
| Learning to Focus and Precise Cropping: A Reinforcement Learning Framework with Information Gaps and Grounding Loss for MLLMs | | 0 |
| Streamlined Open-Vocabulary Human-Object Interaction Detection | | 0 |
| Q-BIOLAT: Binary Latent Protein Fitness Landscapes for QUBO-Based Optimization | | 0 |
| OpenDPR: Open-Vocabulary Change Detection via Vision-Centric Diffusion-Guided Prototype Retrieval for Remote Sensing Imagery | | 0 |
| PRBench: End-to-end Paper Reproduction in Physics Research | | 0 |
| RHO: Robust Holistic OSM-Based Metric Cross-View Geo-Localization | | 0 |
| GS3LAM: Gaussian Semantic Splatting SLAM | | 0 |
Page 1 of 9486