SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 16451-16500 of 474,278 papers

Title | Status | Hype
GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models | - | 0
Disclosure Audits for LLM Agents | - | 0
GRAIL: A Benchmark for GRaph ActIve Learning in Dynamic Sensing Environments | - | 0
DynaSubVAE: Adaptive Subgrouping for Scalable and Robust OOD Detection | - | 0
AtmosMJ: Revisiting Gating Mechanism for AI Weather Forecasting Beyond the Year Scale | Code | 0
TaskCraft: Automated Generation of Agentic Tasks | Code | 2
Learning to Collaborate Over Graphs: A Selective Federated Multi-Task Learning Approach | Code | 0
The 2025 PNPL Competition: Speech Detection and Phoneme Classification in the LibriBrain Dataset | - | 0
Data-Driven Modeling of IRCU Patient Flow in the COVID-19 Pandemic | Code | 0
TransXSSM: A Hybrid Transformer State Space Model with Unified Rotary Position Embedding | - | 0
NnD: Diffusion-based Generation of Physically-Nonnegative Objects | - | 0
Textual Bayes: Quantifying Uncertainty in LLM-Based Systems | - | 0
What is the Cost of Differential Privacy for Deep Learning-Based Trajectory Generation? | Code | 0
Chat-of-Thought: Collaborative Multi-Agent System for Generating Domain Specific Information | - | 0
3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation | Code | 1
Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs | Code | 1
Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval | - | 0
Efficient kernelized bandit algorithms via exploration distributions | - | 0
Interpreting learned search: finding a transition model and value function in an RNN that plays Sokoban | Code | 1
A Call for Collaborative Intelligence: Why Human-Agent Systems Should Precede AI Autonomy | Code | 2
A quantum semantic framework for natural language processing | Code | 5
Omni-DPO: A Dual-Perspective Paradigm for Dynamic Preference Learning of LLMs | Code | 0
Exposure-slot: Exposure-centric representations learning with Slot-in-Slot Attention for Region-aware Exposure Correction | Code | 1
VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks | - | 0
Colors See Colors Ignore: Clothes Changing ReID with Color Disentanglement (ICCV-25 🥳) | - | 0
Improving LLM Agent Planning with In-Context Learning via Atomic Fact Augmentation and Lookahead Search | - | 0
GUIRoboTron-Speech: Towards Automated GUI Agents Based on Speech Instructions | Code | 1
ContextLoss: Context Information for Topology-Preserving Segmentation | - | 0
Sparse Autoencoders Bridge The Deep Learning Model and The Brain | - | 0
Grids Often Outperform Implicit Neural Representations | Code | 0
GPU-accelerated Modeling of Biological Regulatory Networks | - | 0
JAFAR: Jack up Any Feature at Any Resolution | Code | 3
Technical Report for Argoverse2 Scenario Mining Challenges on Iterative Error Correction and Spatially-Aware Prompting | - | 0
Optimal Operating Strategy for PV-BESS Households: Balancing Self-Consumption and Self-Sufficiency | - | 0
Navigating High-Dimensional Backstage: A Guide for Exploring Literature for the Reliable Use of Dimensionality Reduction | - | 0
Cross-Frame Representation Alignment for Fine-Tuning Video Diffusion Models | - | 0
A Multi-Modal Spatial Risk Framework for EV Charging Infrastructure Using Remote Sensing | - | 0
An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models | - | 0
Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity | - | 0
FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation | Code | 2
Exploring the Capabilities of the Frontier Large Language Models for Nuclear Energy Research | - | 0
DualEquiNet: A Dual-Space Hierarchical Equivariant Network for Large Biomolecules | - | 0
Scalable and Cost-Efficient de Novo Template-Based Molecular Generation | Code | 1
SDMPrune: Self-Distillation MLP Pruning for Efficient Large Language Models | Code | 1
Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation | Code | 2
Solving the Job Shop Scheduling Problem with Graph Neural Networks: A Customizable Reinforcement Learning Environment | Code | 2
Monocular 3D Hand Pose Estimation with Implicit Camera Alignment | Code | 1
XGraphRAG: Interactive Visual Analysis for Graph-based Retrieval-Augmented Generation | Code | 0
SUTA-LM: Bridging Test-Time Adaptation and Language Model Rescoring for Robust ASR | - | 0
A Self-Refining Framework for Enhancing ASR Using TTS-Synthesized Data | - | 0