SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9651-9700 of 661,570 papers

Title | Status | Hype
AMB-DSGDN: Adaptive Modality-Balanced Dynamic Semantic Graph Differential Network for Multimodal Emotion Recognition | | 0
Targeted Bit-Flip Attacks on LLM-Based Agents | | 0
Permutation-Equivariant 2D State Space Models: Theory and Canonical Architecture for Multivariate Time Series | | 0
Hindsight Credit Assignment for Long-Horizon LLM Agents | | 0
Turn: A Language for Agentic Computation | Code | 0
DevBench: A Realistic, Developer-Informed Benchmark for Code Generation Models | | 0
CoTJudger: A Graph-Driven Framework for Automatic Evaluation of Chain-of-Thought Efficiency and Redundancy in LRMs | | 0
Tokenizing Semantic Segmentation with RLE | | 0
Reallocating Attention Across Layers to Reduce Multimodal Hallucination | | 0
N-Tree Diffusion for Long-Horizon Wildfire Risk Forecasting | | 0
Data-Driven Hints in Intelligent Tutoring Systems | | 0
Rethinking Deep Research from the Perspective of Web Content Distribution Matching | | 0
MoE-GS: Mixture of Experts for Dynamic Gaussian Splatting | | 0
The Partition Principle Revisited: Non-Equal Volume Designs Achieve Minimal Expected Star Discrepancy | | 0
Explainable and Hardware-Efficient Jamming Detection for 5G Networks Using the Convolutional Tsetlin Machine | | 0
INDUCTION: Finite-Structure Concept Synthesis in First-Order Logic | | 0
Foundational World Models Accurately Detect Bimanual Manipulator Failures | | 0
ACD-U: Asymmetric co-teaching with machine unlearning for robust learning with noisy labels | Code | 0
Mathematicians in the age of AI | | 0
Task learning increases information redundancy of neural responses in macaque visual cortex | | 0
Towards Objective Gastrointestinal Auscultation: Automated Segmentation and Annotation of Bowel Sound Patterns | | 0
A Distributed Gaussian Process Model for Multi-Robot Mapping | | 0
ShakyPrepend: A Multi-Group Learner with Improved Sample Complexity | | 0
A Systematic Investigation of Document Chunking Strategies and Embedding Sensitivity | | 0
Shutdown Safety Valves for Advanced AI | | 0
Efficient Personalized Reranking with Semi-Autoregressive Generation and Online Knowledge Distillation | | 0
Enhancing low energy reconstruction and classification in KM3NeT/ORCA with transformers | | 0
Margin in Abstract Spaces | | 0
Unlocking Data Value in Finance: A Study on Distillation and Difficulty-Aware Training | | 0
The Talking Robot: Distortion-Robust Acoustic Models for Robot-Robot Communication | | 0
DINOv3 Visual Representations for Blueberry Perception Toward Robotic Harvesting | | 0
The Third Ambition: Artificial Intelligence and the Science of Human Behavior | | 0
A Single Model Ensemble Framework for Neural Machine Translation using Pivot Translation | | 0
Prototype Perturbation for Relaxing Alignment Constraints in Backward-Compatible Learning | | 0
Weak-to-Strong Generalization with Failure Trajectories: A Tree-based Approach to Elicit Optimal Policy in Strong Models | | 0
GraphProp: Training the Graph Foundation Models using Graph Properties | | 0
3D Gaussian Splatting with Fisheye Images: Field of View Analysis and Depth-Based Initialization | | 0
IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding | | 0
Synthetic Homes: An Accessible Multimodal Pipeline for Producing Residential Building Data with Generative AI | | 0
Stealth Fine-Tuning: Efficiently Breaking Alignment in RVLMs Using Self-Generated CoT | | 0
Generative Evolutionary Meta-Solver (GEMS): Scalable Surrogate-Free Multi-Agent Reinforcement Learning | | 0
Automated Pest Counting in Water Traps through Active Robotic Stirring for Occlusion Handling | | 0
SwiftTS: A Swift Selection Framework for Time Series Pre-trained Models via Multi-task Meta-Learning | | 0
HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection | | 0
Counting Through Occlusion: Framework for Open World Amodal Counting | | 0
Video2Layout: Recall and Reconstruct Metric-Grounded Cognitive Map for Spatial Reasoning | | 1
Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM | | 0
Automating Deception: Scalable Multi-Turn LLM Jailbreaks | | 0
Shortcut Invariance: Targeted Jacobian Regularization in Disentangled Latent Space | | 0
Process-Centric Analysis of Agentic Software Systems | | 0