SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2701-2750 of 659,983 papers

Title | Status | Hype
A Contextual Help Browser Extension to Assist Digital Illiterate Internet Users | - | 0
FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair | - | 0
Trust the Unreliability: Inward Backward Dynamic Unreliability Driven Coreset Selection for Medical Image Classification | - | 0
End-to-end data-driven prediction of urban airflow and pollutant dispersion | - | 0
VeriAgent: A Tool-Integrated Multi-Agent System with Evolving Memory for PPA-Aware RTL Code Generation | - | 0
Temporal Narrative Monitoring in Dynamic Information Environments | - | 0
Do Language Models Encode Semantic Relations? Probing and Sparse Feature Analysis | - | 0
A Multi-Agent System for Building-Age Cohort Mapping to Support Urban Energy Planning | - | 0
Atomic Trajectory Modeling with State Space Models for Biomolecular Dynamics | - | 0
DSS-GAN: Directional State Space GAN with Mamba backbone for Class-Conditional Image Synthesis | - | 0
Towards Infinitely Long Neural Simulations: Self-Refining Neural Surrogate Models for Dynamical Systems | - | 0
VeriGrey: Greybox Agent Validation | - | 0
Interpretable Cross-Domain Few-Shot Learning with Rectified Target-Domain Local Alignment | - | 0
Few-Step Diffusion Sampling Through Instance-Aware Discretizations | - | 0
Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards | - | 0
Illumination-Aware Contactless Fingerprint Spoof Detection via Paired Flash-Non-Flash Imaging | - | 0
WeatherReasonSeg: A Benchmark for Weather-Aware Reasoning Segmentation in Visual Language Models | - | 0
Sensi: Learn One Thing at a Time -- Curriculum-Based Test-Time Learning for LLM Game Agents | - | 0
Does YOLO Really Need to See Every Training Image in Every Epoch? | - | 0
Objective Mispricing Detection for Shortlisting Undervalued Football Players via Market Dynamics and News Signals | - | 0
Stochastic set-valued optimization and its application to robust learning | - | 0
Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos | - | 0
Exploring parameter-efficient fine-tuning (PEFT) of billion-parameter vision models with QLoRA and DoRA: insights into generalization for limited-data image classification under a 98:1 test-to-train regime | - | 0
AERR-Nav: Adaptive Exploration-Recovery-Reminiscing Strategy for Zero-Shot Object Navigation | - | 0
PC-CrossDiff: Point-Cluster Dual-Level Cross-Modal Differential Attention for Unified 3D Referring and Segmentation | - | 0
Evidence Packing for Cross-Domain Image Deepfake Detection with LVLMs | - | 0
ResNet-50 with Class Reweighting and Anatomy-Guided Temporal Decoding for Gastrointestinal Video Analysis | - | 0
Facial Movement Dynamics Reveal Workload During Complex Multitasking | - | 0
CoVerRL: Breaking the Consensus Trap in Label-Free Reasoning via Generator-Verifier Co-Evolution | - | 0
CrowdGaussian: Reconstructing High-Fidelity 3D Gaussians for Human Crowd from a Single Image | - | 0
Facts as First Class Objects: Knowledge Objects for Persistent LLM Memory | - | 0
EVA: Aligning Video World Models with Executable Robot Actions via Inverse Dynamics Rewards | - | 0
Dropout Robustness and Cognitive Profiling of Transformer Models via Stochastic Inference | - | 0
ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation | - | 0
Discovering Decoupled Functional Modules in Large Language Models | - | 0
RPMS: Enhancing LLM-Based Embodied Planning through Rule-Augmented Memory Synergy | - | 0
Symmetry-Reduced Physics-Informed Learning of Tensegrity Dynamics | - | 0
Steering Video Diffusion Transformers with Massive Activations | - | 0
TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models | - | 0
CodeScout: An Effective Recipe for Reinforcement Learning of Code Search Agents | - | 0
Generative Control as Optimization: Time Unconditional Flow Matching for Adaptive and Robust Robotic Control | - | 0
Verification and Validation of Physics-Informed Surrogate Component Models for Dynamic Power-System Simulation | - | 0
The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning | - | 0
How do LLMs Compute Verbal Confidence | - | 0
Operator-Theoretic Foundations and Policy Gradient Methods for General MDPs with Unbounded Costs | - | 0
Edit Spillover as a Probe: Do Image Editing Models Implicitly Understand World Relations? | - | 0
AI-Assisted Goal Setting Improves Goal Progress Through Social Accountability | - | 0
Identity as Presence: Towards Appearance and Voice Personalized Joint Audio-Video Generation | - | 0
RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference | - | 0
scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns | - | 0
Page 55 of 13,200