SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers · 258,216 code links · 4,818 tasks

Papers

Showing 201–250 of 658,356 papers

Title | Status | Hype
Can Large Multimodal Models Inspect Buildings? A Hierarchical Benchmark for Structural Pathology Reasoning | | 0
Improving Generalization on Cybersecurity Tasks with Multi-Modal Contrastive Learning | | 0
Enhancing Hyperspace Analogue to Language (HAL) Representations via Attention-Based Pooling for Text Classification | | 0
Design-OS: A Specification-Driven Framework for Engineering System Design with a Control-Systems Design Case | | 0
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD | | 0
Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models | | 0
Evaluating Evidence Grounding Under User Pressure in Instruction-Tuned Language Models | | 0
The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning | | 0
EgoForge: Goal-Directed Egocentric World Simulator | | 0
Learning Dynamic Belief Graphs for Theory-of-mind Reasoning | | 0
TinyML Enhances CubeSat Mission Capabilities | | 0
LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis | | 0
AI Agents Can Already Autonomously Perform Experimental High Energy Physics | | 0
Adaptive Greedy Frame Selection for Long Video Understanding | | 0
VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking | | 0
Improving Image-to-Image Translation via a Rectified Flow Reformulation | | 0
MeanFlow Meets Control: Scaling Sampled-Data Control for Swarms | | 0
Deterministic Mode Proposals: An Efficient Alternative to Generative Sampling for Ambiguous Segmentation | | 0
LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation | | 0
MME-CoF-Pro: Evaluating Reasoning Coherence in Video Generative Models with Text and Visual Hints | | 0
Graph-Informed Adversarial Modeling: Infimal Subadditivity of Interpolative Divergences | | 0
Layered Quantum Architecture Search for 3D Point Cloud Classification | | 0
Enhancing Alignment for Unified Multimodal Models via Semantically-Grounded Supervision | | 0
A Super Fast K-means for Indexing Vector Embeddings | Code | 1
Dual Prompt-Driven Feature Encoding for Nighttime UAV Tracking | Code | 0
DynFlowDrive: Flow-Based Dynamic World Modeling for Autonomous Driving | Code | 0
Vision-Language Attribute Disentanglement and Reinforcement for Lifelong Person Re-Identification | Code | 0
Unbiased Dynamic Multimodal Fusion | Code | 0
Demographic-Aware Self-Supervised Anomaly Detection Pretraining for Equitable Rare Cardiac Diagnosis | Code | 0
BALM: A Model-Agnostic Framework for Balanced Multimodal Learning under Imbalanced Missing Rates | Code | 0
ReManNet: A Riemannian Manifold Network for Monocular 3D Lane Detection | Code | 0
IsoCLIP: Decomposing CLIP Projectors for Efficient Intra-modal Alignment | Code | 0
What If Consensus Lies? Selective-Complementary Reinforcement Learning at Test Time | Code | 0
Deep Autocorrelation Modeling for Time-Series Forecasting: Progress and Prospects | Code | 0
Learning Like Humans: Analogical Concept Learning for Generalized Category Discovery | Code | 0
MedSPOT: A Workflow-Aware Sequential Grounding Benchmark for Clinical GUI | Code | 0
CFCML: A Coarse-to-Fine Crossmodal Learning Framework For Disease Diagnosis Using Multimodal Images and Tabular Data | Code | 0
Kolmogorov-Arnold causal generative models | Code | 0
MuSteerNet: Human Reaction Generation from Videos via Observation-Reaction Mutual Steering | Code | 0
Wildfire Spread Scenarios: Increasing Sample Diversity of Segmentation Diffusion Models with Training-Free Methods | Code | 0
CoVR-R: Reason-Aware Composed Video Retrieval | Code | 0
From Masks to Pixels and Meaning: A New Taxonomy, Benchmark, and Metrics for VLM Image Tampering | Code | 0
EvidenceRL: Reinforcing Evidence Consistency for Trustworthy Language Models | Code | 0
PFM-VEPAR: Prompting Foundation Models for RGB-Event Camera based Pedestrian Attribute Recognition | Code | 0
CurveStream: Boosting Streaming Video Understanding in MLLMs via Curvature-Aware Hierarchical Visual Memory Management | Code | 0
MagicSeg: Open-World Segmentation Pretraining via Counterfactural Diffusion-Based Auto-Generation | Code | 0
Semantic Audio-Visual Navigation in Continuous Environments | Code | 0
RouterKGQA: Specialized--General Model Routing for Constraint-Aware Knowledge Graph Question Answering | Code | 0
Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs | Code | 0
MFil-Mamba: Multi-Filter Scanning for Spatial Redundancy-Aware Visual State Space Models | Code | 0
Page 5 of 13,168