SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 10301-10350 of 661570 papers

Title | Status | Hype
Agentified Assessment of Logical Reasoning Agents | - | 0
Whisper-CD: Accurate Long-Form Speech Recognition using Multi-Negative Contrastive Decoding | - | 0
Stress-Testing Alignment Audits With Prompt-Level Strategic Deception | - | 0
DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning | - | 0
Bilateral Trade Under Heavy-Tailed Valuations: Minimax Regret with Infinite Variance | - | 0
EgoReasoner: Learning Egocentric 4D Reasoning via Task-Adaptive Structured Thinking | - | 0
RoboPocket: Improve Robot Policies Instantly with Your Phone | - | 0
Spatial Calibration of Diffuse LiDARs | - | 0
Physics-Informed Diffusion Model for Generating Synthetic Extreme Rare Weather Events Data | - | 0
Calibrated Credit Intelligence: Shift-Robust and Fair Risk Scoring with Bayesian Uncertainty and Gradient Boosting | - | 0
Facial Expression Recognition Using Residual Masking Network | Code | 0
From Tokenizer Bias to Backbone Capability: A Controlled Study of LLMs for Time Series Forecasting | Code | 0
ExDD: Explicit Dual Distribution Learning for Surface Defect Detection via Diffusion Synthesis | Code | 0
Diffusion Fine-Tuning via Reparameterized Policy Gradient of the Soft Q-Function | Code | 0
Fast-BEV++: Fast by Algorithm, Deployable by Design | Code | 0
Restoring Exploration after Post-Training: Latent Exploration Decoding for Large Reasoning Models | Code | 0
Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery | Code | 0
Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models | Code | 0
Imagine How To Change: Explicit Procedure Modeling for Change Captioning | Code | 0
NOVA: Next-step Open-Vocabulary Autoregression for 3D Multi-Object Tracking in Autonomous Driving | Code | 0
WorldCache: Accelerating World Models for Free via Heterogeneous Token Caching | Code | 0
From Prompting to Preference Optimization: A Comparative Study of LLM-based Automated Essay Scoring | Code | 0
Modeling and Measuring Redundancy in Multisource Multimodal Data for Autonomous Driving | Code | 0
xaitimesynth: A Python Package for Evaluating Attribution Methods for Time Series with Synthetic Ground Truth | Code | 0
EarthBridge: A Solution for 4th Multi-modal Aerial View Image Challenge Translation Track | Code | 0
Contextual Counterfactual Credit Assignment for Multi-Agent Reinforcement Learning in LLM Collaboration | Code | 0
How Private Are DNA Embeddings? Inverting Foundation Model Representations of Genomic Sequences | Code | 0
Better Late Than Never: Meta-Evaluation of Latency Metrics for Simultaneous Speech-to-Text Translation | Code | 0
MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing | Code | 0
Spectral and Trajectory Regularization for Diffusion Transformer Super-Resolution | Code | 0
Validating Interpretability in siRNA Efficacy Prediction: A Perturbation-Based, Dataset-Aware Protocol | Code | 0
Can we Trust Unreliable Voxels? Exploring 3D Semantic Occupancy Prediction under Label Noise | Code | 0
VisualPrompter: Semantic-Aware Prompt Optimization with Visual Feedback for Text-to-Image Synthesis | Code | 0
SGDFuse: SAM-Guided Diffusion Model for High-Fidelity Infrared and Visible Image Fusion | Code | 0
PepEDiff: Zero-Shot Peptide Binder Design via Protein Embedding Diffusion | Code | 0
Neural Signals Generate Clinical Notes in the Wild | Code | 0
From Features to Actions: Explainability in Traditional and Agentic AI Systems | Code | 0
Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance | Code | 0
FontUse: A Data-Centric Approach to Style- and Use-Case-Conditioned In-Image Typography | Code | 0
Devil is in Narrow Policy: Unleashing Exploration in Driving VLA Models | Code | 0
LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented Generation | Code | 0
Adaptive Language-Aware Image Reflection Removal Network | Code | 0
Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events | Code | 0
REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation | Code | 0
Don't Freeze, Don't Crash: Extending the Safe Operating Range of Neural Navigation in Dense Crowds | Code | 0
SpatialMAGIC: A Hybrid Framework Integrating Graph Diffusion and Spatial Attention for Spatial Transcriptomics Imputation | Code | 0
NEST: Network- and Memory-Aware Device Placement For Distributed Deep Learning | Code | 0
Reforming the Mechanism: Editing Reasoning Patterns in LLMs with Circuit Reshaping | Code | 0
Extracting and analyzing 3D histomorphometric features related to perineural and lymphovascular invasion in prostate cancer | Code | 0
Diffusion Alignment as Variational Expectation-Maximization | Code | 0
Page 207 of 13232