SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 11601–11650 of 661570 papers

Title | Status | Hype
Segment-to-Act: Label-Noise-Robust Action-Prompted Video Segmentation Towards Embodied Intelligence | Code | 0
AMiD: Knowledge Distillation for LLMs with α-mixture Assistant Distribution | Code | 0
When and Where to Reset Matters for Long-Term Test-Time Adaptation | Code | 0
Toward Early Quality Assessment of Text-to-Image Diffusion Models | Code | 0
Glass Segmentation with Fusion of Learned and General Visual Features | Code | 0
From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning | Code | 0
Bridging Human Evaluation to Infrared and Visible Image Fusion | Code | 0
Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators | Code | 0
CzechTopic: A Benchmark for Zero-Shot Topic Localization in Historical Czech Documents | Code | 0
RIVER: A Real-Time Interaction Benchmark for Video LLMs | Code | 0
VietNormalizer: An Open-Source, Dependency-Free Python Library for Vietnamese Text Normalization in TTS and NLP Applications | Code | 0
Error as Signal: Stiffness-Aware Diffusion Sampling via Embedded Runge-Kutta Guidance | Code | 0
Topological Alignment of Shared Vision-Language Embedding Space | Code | 0
Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions | Code | 0
VidEoMT: Your ViT is Secretly Also a Video Segmentation Model |  | 2
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents |  | 2
VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning |  | 0
Crab^+: A Scalable and Unified Audio-Visual Scene Understanding Model with Explicit Cooperation |  | 0
MERIT: Memory-Enhanced Retrieval for Interpretable Knowledge Tracing |  | 0
Evaluating Prompting Strategies for Chart Question Answering with Large Language Models |  | 0
Multi-Agent Debate with Memory Masking |  | 0
Locally Coherent Parallel Decoding in Diffusion Language Models |  | 0
Expected Reward Prediction, with Applications to Model Routing |  | 0
An experimental study of KV cache reuse strategies in chunk-level caching systems |  | 0
Thinking into the Future: Latent Lookahead Training for Transformers |  | 0
GSI Agent: Domain Knowledge Enhancement for Large Language Models in Green Stormwater Infrastructure |  | 0
EEG-SeeGraph: Interpreting functional connectivity disruptions in dementias via sparse-explanatory dynamic EEG-graph learning |  | 0
EEG-Based Brain-LLM Interface for Human Preference Aligned Generation |  | 0
Tokenization Tradeoffs in Structured EHR Foundation Models |  | 0
Form Follows Function: Recursive Stem Model |  | 0
CraniMem: Cranial Inspired Gated and Bounded Memory for Agentic Systems | Code | 0
Evidence-based Distributional Alignment for Large Language Models |  | 0
Benchmarking Compact VLMs for Clip-Level Surveillance Anomaly Detection Under Weak Supervision |  | 0
Task Expansion and Cross Refinement for Open-World Conditional Modeling |  | 0
Preventing Curriculum Collapse in Self-Evolving Reasoning Systems |  | 0
Suppressing Domain-Specific Hallucination in Construction LLMs: A Knowledge Graph Foundation for GraphRAG and QLoRA on River and Sediment Control Technical Standards |  | 0
A Browser-based Open Source Assistant for Multimodal Content Verification |  | 0
Hybrid Orchestration of Edge AI and Microservices via Graph-based Self-Imitation Learning |  | 0
calibfusion: Transformer-Based Differentiable Calibration for Radar-Camera Fusion Detection in Water-Surface Environments |  | 0
Unmixing microinfrared spectroscopic images of cross-sections of historical oil paintings |  | 0
XAI and Few-shot-based Hybrid Classification Model for Plant Leaf Disease Prognosis |  | 0
Chart Deep Research in LVLMs via Parallel Relative Policy Optimization |  | 0
VB: Visibility Benchmark for Visibility and Perspective Reasoning in Images |  | 0
MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines |  | 0
ERP-RiskBench: Leakage-Safe Ensemble Learning for Financial Risk |  | 0
Does Semantic Noise Initialization Transfer from Images to Videos? A Paired Diagnostic Study |  | 0
AutoFigure-Edit: Generating Editable Scientific Illustration | Code | 0
GNN For Muon Particle Momentum estimation |  | 0
A theoretical model of dynamical grammatical gender shifting based on set-valued set function |  | 0
Baseline Performance of AI Tools in Classifying Cognitive Demand of Mathematical Tasks |  | 0
Page 233 of 13232