SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5701 to 5750 of 661,570 papers

Title | Status | Hype
Practicing with Language Models Cultivates Human Empathic Communication |  | 0
Directional Embedding Smoothing for Robust Vision Language Models |  | 0
A Closer Look into LLMs for Table Understanding |  | 0
Scalable Simulation-Based Model Inference with Test-Time Complexity Control |  | 0
Enhancing classification accuracy through chaos |  | 0
Tagarela - A Portuguese speech dataset from podcasts |  | 0
MeMix: Writing Less, Remembering More for Streaming 3D Reconstruction |  | 0
Persistence Spheres: a Bi-continuous Linear Representation of Measures for Partial Optimal Transport |  | 0
RieMind: Geometry-Grounded Spatial Agent for Scene Understanding |  | 0
Fusian: Multi-LoRA Fusion for Fine-Grained Continuous MBTI Personality Control in Large Language Models |  | 0
Evasive Intelligence: Lessons from Malware Analysis for Evaluating AI Agents |  | 0
Agent Lifecycle Toolkit (ALTK): Reusable Middleware Components for Robust AI Agents |  | 0
Estimating Staged Event Tree Models via Hierarchical Clustering on the Simplex |  | 0
ViX-Ray: A Vietnamese Chest X-Ray Dataset for Vision-Language Models |  | 0
Clinically Aware Synthetic Image Generation for Concept Coverage in Chest X-ray Models |  | 0
Can LLMs Model Incorrect Student Reasoning? A Case Study on Distractor Generation |  | 0
Self-Distillation of Hidden Layers for Self-Supervised Representation Learning |  | 0
Mamba-3: Improved Sequence Modeling using State Space Principles |  | 0
Do Metrics for Counterfactual Explanations Align with User Perception? |  | 0
Towards Generalizable Robotic Manipulation in Dynamic Environments | Code | 0
GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering |  | 2
Diverse AI Personas Can Mitigate the Homogenization Effect in Human-AI Collaborative Ideation |  | 0
IMAIA: Interactive Maps AI Assistant for Travel Planning and Geo-Spatial Intelligence |  | 0
CountLoop: Training-Free High-Instance Image Generation via Iterative Agent Guidance |  | 0
From Image Generation to Infrastructure Design: a Multi-agent Pipeline for Street Design Generation |  | 0
Protecting De-identified Documents from Search-based Linkage Attacks |  | 0
LeAD-M3D: Leveraging Asymmetric Distillation for Real-Time Monocular 3D Detection |  | 0
World Models for Learning Dexterous Hand-Object Interactions from Human Videos |  | 0
SolarGPT-QA: A Domain-Adaptive Large Language Model for Educational Question Answering in Space Weather and Heliophysics |  | 0
Prompt Sensitivity and Answer Consistency of Small Open-Source Language Models for Clinical Question Answering in Low-Resource Healthcare |  | 0
Match4Annotate: Propagating Sparse Video Annotations via Implicit Neural Feature Matching |  | 0
Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem |  | 0
Knowledge Graph Extraction from Biomedical Literature for Alkaptonuria Rare Disease |  | 0
CUBE: A Standard for Unifying Agent Benchmarks |  | 0
GLANCE: Gaze-Led Attention Network for Compressed Edge-inference |  | 0
Context-Length Robustness in Question Answering Models: A Comparative Empirical Study |  | 0
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification |  | 0
Time-Aware Prior Fitted Networks for Zero-Shot Forecasting with Exogenous Variables |  | 0
Don't Trust Stubborn Neighbors: A Security Framework for Agentic Networks |  | 0
Longitudinal Risk Prediction in Mammography with Privileged History Distillation |  | 0
Conflict-Aware Multimodal Fusion for Ambivalence and Hesitancy Recognition |  | 0
Persona-Conditioned Risk Behavior in Large Language Models: A Simulated Gambling Study with GPT-4.1 |  | 0
Informationally Compressive Anonymization: Non-Degrading Sensitive Input Protection for Privacy-Preserving Supervised Machine Learning |  | 0
Regularized Latent Dynamics Prediction is a Strong Baseline For Behavioral Foundation Models |  | 0
The Internet of Physical AI Agents: Interoperability, Longevity, and the Cost of Getting It Wrong |  | 0
ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors |  | 0
Optimizing Hospital Capacity During Pandemics: A Dual-Component Framework for Strategic Patient Relocation |  | 0
MoLoRA: Composable Specialization via Per-Token Adapter Routing |  | 0
NLP Occupational Emergence Analysis: How Occupations Form and Evolve in Real Time -- A Zero-Assumption Method Demonstrated on AI in the US Technology Workforce, 2022-2026 |  | 0
Selective Memory for Artificial Intelligence: Write-Time Gating with Hierarchical Archiving |  | 0
Page 115 of 13,232