SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2,201-2,250 of 659,983 papers

Title | Status | Hype
HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering | | 0
UEPS: Robust and Efficient MRI Reconstruction | | 0
Interplay: Training Independent Simulators for Reference-Free Conversational Recommendation | | 0
Cross-Modal Rationale Transfer for Explainable Humanitarian Classification on Social Media | | 0
ZEBRAARENA: A Diagnostic Simulation Environment for Studying Reasoning-Action Coupling in Tool-Augmented LLMs | | 0
GEAR: Geography-knowledge Enhanced Analog Recognition Framework in Extreme Environments | | 0
Training-Free Sparse Attention for Fast Video Generation via Offline Layer-Wise Sparsity Profiling and Online Bidirectional Co-Clustering | | 0
Multimodal Model for Computational Pathology: Representation Learning and Image Compression | | 0
Thinking with Constructions: A Benchmark and Policy Optimization for Visual-Text Interleaved Geometric Reasoning | | 0
Agent Control Protocol: Admission Control for Agent Actions | Code | 0
STEP: Scientific Time-Series Encoder Pretraining via Cross-Domain Distillation | | 0
From ex(p) to poly: Gaussian Splatting with Polynomial Kernels | | 0
Seasoning Generative Models for a Generalization Aftertaste | | 0
Towards Interpretable Foundation Models for Retinal Fundus Images | | 0
BeamAgent: LLM-Aided MIMO Beamforming with Decoupled Intent Parsing and Alternating Optimization for Joint Site Selection and Precoding | | 0
Why Better Cross-Lingual Alignment Fails for Better Cross-Lingual Transfer: Case of Encoders | | 0
A Human-in/on-the-Loop Framework for Accessible Text Generation | | 0
I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems | | 0
Kernel Single-Index Bandits: Estimation, Inference, and Learning | | 0
VGGT-360: Geometry-Consistent Zero-Shot Panoramic Depth Estimation | | 0
Best-of-Both-Worlds Multi-Dueling Bandits: Unified Algorithms for Stochastic and Adversarial Preferences under Condorcet and Borda Objectives | | 0
Book your room in the Turing Hotel! A symmetric and distributed Turing Test with multiple AIs and humans | | 0
Unleashing the Power of Simplicity: A Minimalist Strategy for State-of-the-Art Fingerprint Enhancement | | 0
FUMO: Prior-Modulated Diffusion for Single Image Reflection Removal | Code | 0
Em-Garde: A Propose-Match Framework for Proactive Streaming Video Understanding | | 0
FedTrident: Resilient Road Condition Classification Against Poisoning Attacks in Federated Learning | | 0
Numerical Considerations for the Construction of Karhunen-Loève Expansions | | 0
From Inference Efficiency to Embodied Efficiency: Revisiting Efficiency Metrics for Vision-Language-Action Models | | 0
Adaptive Regime-Aware Stock Price Prediction Using Autoencoder-Gated Dual Node Transformers with Reinforcement Learning Control | | 0
Hierarchical Latent Structure Learning through Online Inference | | 0
Few-shot Acoustic Synthesis with Multimodal Flow Matching | | 0
Improving RCT-Based Treatment Effect Estimation Under Covariate Mismatch via Calibrated Alignment | | 0
Tinted Frames: Question Framing Blinds Vision-Language Models | | 0
FinTradeBench: A Financial Reasoning Benchmark for LLMs | | 0
Under One Sun: Multi-Object Generative Perception of Materials and Illumination | | 0
Learning-to-Defer with Expert-Conditioned Advice | | 0
iSatCR: Graph-Empowered Joint Onboard Computing and Routing for LEO Data Delivery | | 0
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model | | 1
Attack by Unlearning: Unlearning-Induced Adversarial Attacks on Graph Neural Networks | | 0
Inst4DGS: Instance-Decomposed 4D Gaussian Splatting with Multi-Video Label Permutation Learning | | 0
TopoChunker: Topology-Aware Agentic Document Chunking Framework | | 0
Nonparametric Variational Differential Privacy via Embedding Parameter Clipping | | 0
AdaSwitch: Balancing Exploration and Guidance in Knowledge Distillation via Adaptive Switching | | 0
Affect Decoding in Phonated and Silent Speech Production from Surface EMG | | 0
Social Simulacra in the Wild: AI Agent Communities on Moltbook | | 0
SAVeS: Steering Safety Judgments in Vision-Language Models via Semantic Cues | | 0
Hardness of High-Dimensional Linear Classification | | 0
Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution | | 0
From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making | | 0
AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science | | 0