SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 3401-3450 of 659983 papers

Title | Status | Hype
A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles | | 0
MemX: A Local-First Long-Term Memory System for AI Assistants | | 0
Execution-Grounded Credit Assignment for GRPO in Code Generation | | 0
AI4EOSC: a Federated Cloud Platform for Artificial Intelligence in Scientific Research | | 0
FlowMotion: Training-Free Flow Guidance for Video Motion Transfer | | 0
CounterRefine: Answer-Conditioned Counterevidence Retrieval for Inference-Time Knowledge Repair in Factual Question Answering | | 0
AI-Generated Figures in Academic Publishing: Policies, Tools, and Practical Guidelines | | 0
Agile Interception of a Flying Target using Competitive Reinforcement Learning | | 0
Generative AI for Quantum Circuits and Quantum Code: A Technical Review and Taxonomy | | 0
CoMAI: A Collaborative Multi-Agent Framework for Robust and Equitable Interview Evaluation | | 0
USIS-PGM: Photometric Gaussian Mixtures for Underwater Salient Instance Segmentation | | 0
Large Language Models for Wireless Communications: From Adaptation to Autonomy | | 0
Relationship-Aware Safety Unlearning for Multimodal LLMs | | 0
Improved Iterative Refinement for Chart-to-Code Generation via Structured Instruction | | 0
Omni Survey for Multimodality Analysis in Visual Object Tracking | | 0
PhysGM: Large Physical Gaussian Model for Feed-Forward 4D Synthesis | | 0
Backpropagation-Free Test-Time Adaptation via Probabilistic Gaussian Alignment | | 0
Generalizable End-to-End Tool-Use RL with Synthetic CodeGym | | 0
SiniticMTError: A Machine Translation Dataset with Error Annotations for Sinitic Languages | | 0
Attribution-Guided Decoding | | 0
LTGS: Long-Term Gaussian Scene Chronology From Sparse View Updates | | 0
MARIS: Marine Open-Vocabulary Instance Segmentation with Geometric Enhancement and Semantic Alignment | | 0
Unbiased Object Detection Beyond Frequency with Visually Prompted Image Synthesis | | 0
PREFINE: Personalized Story Generation via Simulated User Critics and User-Specific Rubric Generation | | 0
Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs | | 0
TAUE: Training-free Noise Transplant and Cultivation Diffusion Model | | 0
DKDS: A Benchmark Dataset of Degraded Kuzushiji Documents with Seals for Detection and Binarization | | 0
SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition | | 0
AIA: Rethinking Architecture Decoupling Strategy In Unified Multimodal Model | | 1
Long-LRM++: Preserving Fine Details in Feed-Forward Wide-Coverage Reconstruction | | 0
COREA: Coupled Relightable 3D Gaussians and SDFs for Efficient Normal Alignment | | 1
Vision-Language Models for Infrared Industrial Sensing in Additive Manufacturing Scene Description | | 0
DefVINS: Visual-Inertial Odometry for Deformable Scenes | | 0
Can Multimodal LLMs See Science Instruction? Benchmarking Pedagogical Reasoning in K-12 Classroom Videos | | 0
CARE: A Molecular-Guided Foundation Model with Adaptive Region Modeling for Whole Slide Image Analysis | | 0
Tau-BNO: Brain Neural Operator for Tau Transport Model | | 0
A Survey of Reinforcement Learning For Economics | | 0
EmoStory: Emotion-Aware Story Generation | | 0
Relaxed Efficient Acquisition of Context and Temporal Features | | 0
Attention Sinks Are Provably Necessary in Softmax Transformers: Evidence from Trigger-Conditional Tasks | | 0
APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution | | 0
DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization | | 0
Make it SING: Analyzing Semantic Invariants in Classifiers | | 0
Topology-Preserving Data Augmentation for Ring-Type Polygon Annotations | | 0
Informative Perturbation Selection for Uncertainty-Aware Post-hoc Explanations | | 0
Masked BRep Autoencoder via Hierarchical Graph Transformer | | 0
Analyzing Error Sources in Global Feature Effect Estimation | | 0
Physics-Informed Neural Systems for the Simulation of EUV Electromagnetic Wave Diffraction from a Lithography Mask | | 0
Tracking the Discriminative Axis: Dual Prototypes for Test-Time OOD Detection Under Covariate Shift | | 0
SAGE: Multi-Agent Self-Evolution for LLM Reasoning | | 0
Page 69 of 13200