SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9951-10000 of 661,570 papers

Title | Status | Hype
Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory | - | 0
OD-RASE: Ontology-Driven Risk Assessment and Safety Enhancement for Autonomous Driving | - | 0
Latent Diffusion-Based 3D Molecular Recovery from Vibrational Spectra | - | 0
Prompt Group-Aware Training for Robust Text-Guided Nuclei Segmentation | - | 0
SurgFormer: Scalable Learning of Organ Deformation with Resection Support and Real-Time Inference | - | 0
XMACNet: An Explainable Lightweight Attention based CNN with Multi Modal Fusion for Chili Disease Classification | - | 0
PICS: Pairwise Image Compositing with Spatial Interactions | Code | 0
FedARKS: Federated Aggregation via Robust and Discriminative Knowledge Selection and Integration for Person Re-identification | - | 0
Measuring AI R&D Automation | - | 0
Classroom AI: Large Language Models as Grade-Specific Teachers | - | 0
Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models | - | 0
A Geometric Perspective on the Difficulties of Learning GNN-based SAT Solvers | - | 0
LiveSense: A Real-Time Wi-Fi Sensing Platform for Range-Doppler on COTS Laptop | - | 0
KCLarity at SemEval-2026 Task 6: Encoder and Zero-Shot Approaches to Political Evasion Detection | - | 0
Agri-Query: A Case Study on RAG vs. Long-Context LLMs for Cross-Lingual Technical Question Answering | - | 0
CAReDiO: Cultural Alignment via Representativeness and Distinctiveness Guided Data Optimization | - | 0
Instance Data Condensation for Image Super-Resolution | - | 0
Sysformer: Safeguarding Frozen Large Language Models with Adaptive System Prompts | - | 0
NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation | - | 0
Gaussian Set Surface Reconstruction through Per-Gaussian Optimization | - | 0
Diverse and Adaptive Behavior Curriculum for Autonomous Driving: A Student-Teacher Framework with Multi-Agent RL | - | 0
A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature | - | 0
MAP: Mitigating Hallucinations in Large Vision-Language Models with Map-Level Attention Processing | - | 0
TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback | - | 0
VLMQ: Token Saliency-Driven Post-Training Quantization for Vision-language Models | - | 0
DianJin-OCR-R1: Enhancing OCR Capabilities via a Reasoning-and-Tool Interleaved Vision-Language Model | - | 0
SSL-SLR: Self-Supervised Representation Learning for Sign Language Recognition | - | 0
RED: Robust Event-Guided Motion Deblurring with Modality-Specific Disentanglement | - | 0
VEGA: Electric Vehicle Navigation Agent via Physics-Informed Neural Operator and Proximal Policy Optimization | - | 0
Spectral/Spatial Tensor Atomic Cluster Expansion with Universal Embeddings in Cartesian Space | - | 0
Auto-Regressive U-Net for Full-Field Prediction of Shrinkage-Induced Damage in Concrete | - | 0
Decision-Driven Semantic Object Exploration for Legged Robots via Confidence-Calibrated Perception and Topological Subgoal Selection | - | 0
Taxonomy-aware Dynamic Motion Generation on Hyperbolic Manifolds | - | 0
Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits | - | 0
Self-Speculative Masked Diffusions | - | 0
Decoding Partial Differential Equations: Cross-Modal Adaptation of Decoder-only Models to PDEs | - | 0
How Reliable is Language Model Micro-Benchmarking? | - | 0
Do LLMs Really Know What They Don't Know? Internal States Mainly Reflect Knowledge Recall Rather Than Truthfulness | - | 0
Beyond Flat Unknown Labels in Open-World Object Detection | - | 0
CanvasMAR: Improving Masked Autoregressive Video Prediction With Canvas | - | 0
Shoot First, Ask Questions Later? Building Rational Agents that Explore and Act Like People | - | 0
OnlineSI: Taming Large Language Model for Online 3D Understanding and Grounding | - | 0
Co-Layout: LLM-driven Co-optimization for Interior Layout | - | 0
DETECT: Determining Ease and Textual Clarity of German Text Simplifications | - | 0
Culture in Action: Evaluating Text-to-Image Models through Social Activities | - | 0
LaxMotion: Rethinking Supervision Granularity for 3D Human Motion Generation | - | 0
MRIQT: Physics-Aware Diffusion Model for Image Quality Transfer in Neonatal Ultra-Low-Field MRI | - | 0
A method for tissue-mask supported whole-body image registration in the UK Biobank | - | 0
Whatever Remains Must Be True: Filtering Drives Reasoning in LLMs, Shaping Diversity | - | 0
DFIR-DETR: Frequency-Domain Iterative Refinement and Dynamic Feature Aggregation for Small Object Detection | - | 0
Page 200 of 13,232