SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 11051-11100 of 661570 papers

Title | Status | Hype
EdgeDAM: Real-time Object Tracking for Mobile Devices | - | 0
HydroGEM: A Self Supervised Zero Shot Hybrid TCN Transformer Foundation Model for Continental Scale Streamflow Quality Control | - | 0
Distribution-Conditioned Transport | - | 0
GeoBlock: Inferring Block Granularity from Dependency Geometry in Diffusion Language Models | - | 0
Less is More: Adapting Text Embeddings for Low-Resource Languages with Small Scale Noisy Synthetic Data | - | 0
Evaluating Large Language Models' Responses to Sexual and Reproductive Health Queries in Nepali | - | 0
Visuospatial Perspective Taking in Multimodal Language Models | - | 0
DISCO: Document Intelligence Suite for COmparative Evaluation | - | 0
Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes | - | 0
Fast and Faithful: Real-Time Verification for Long-Document Retrieval-Augmented Generation Systems | - | 0
Internal Safety Collapse in Frontier Large Language Models | - | 0
Linguistic Signatures for Enhanced Emotion Detection | - | 0
Inference Energy and Latency in AI-Mediated Education: A Learning-per-Watt Analysis of Edge and Cloud Models | - | 0
Beyond Test-Time Compute Strategies: Advocating Energy-per-Token in LLM Inference | - | 0
The Arrival of AGI? When Expert Personas Exceed Expert Benchmarks | - | 0
Human-Data Interaction, Exploration, and Visualization in the AI Era: Challenges and Opportunities | - | 0
Quantum-Assisted Optimal Rebalancing with Uncorrelated Asset Selection for Algorithmic Trading Walk-Forward QUBO Scheduling via QAOA | - | 0
ROBUST-MIPS: A Combined Skeletal Pose and Instance Segmentation Dataset for Laparoscopic Surgical Instruments | - | 0
Social physics in the age of artificial intelligence | - | 0
From Language to Action in Arabic: Reliable Structured Tool Calling via Data-Centric Fine-Tuning | - | 0
Steering Frozen LLMs: Adaptive Social Alignment via Online Prompt Routing | - | 0
Alternating Reinforcement Learning with Contextual Rubric Rewards | - | 0
Improving Generative Adversarial Network Generalization for Facial Expression Synthesis | - | 0
XLinear: Frequency-Enhanced MLP with CrossFilter for Robust Long-Range Forecasting | - | 0
A debate game about societal impacts of Artificial Intelligence | - | 0
Modular Neural Computer | - | 0
GRAND: Guidance, Rebalancing, and Assignment for Networked Dispatch in Multi-Agent Path Finding | - | 0
DOVA: Deliberation-First Multi-Agent Orchestration for Autonomous Research Automation | - | 0
Multi-view Attention Fusion of Heterogeneous Hypergraph with Dynamic Behavioral Profiling for Personalized Learning Resource Recommendation | - | 0
Evaluating Large Language Models for Gait Classification Using Text-Encoded Kinematic Waveforms | - | 0
Residual Stream Analysis of Overfitting And Structural Disruptions | - | 0
The Challenge of Out-Of-Distribution Detection in Motor Imagery BCIs | - | 0
Feature-level Interaction Explanations in Multimodal Transformers | - | 0
LightningRL: Breaking the Accuracy-Parallelism Trade-off of Block-wise dLLMs via Reinforcement Learning | Code | 0
Nepali Passport Question Answering: A Low-Resource Dataset for Public Service Applications | - | 0
Neural Approximation and Its Applications | - | 0
Design-MLLM: A Reinforcement Alignment Framework for Verifiable and Aesthetic Interior Design | - | 0
Linear Predictability of Attention Heads in Large Language Models | - | 0
Auditing Cascading Risks in Multi-Agent Systems via Semantic-Geometric Co-evolution | - | 0
Self-Supervised Multi-Stage Domain Unlearning for White-Matter Lesion Segmentation | Code | 0
High-Resolution Image Reconstruction with Unsupervised Learning and Noisy Data Applied to Ion-Beam Dynamics for Particle Accelerators | - | 0
SCAN: Visual Explanations with Self-Confidence and Analysis Networks | - | 0
RADAR: A Multimodal Benchmark for 3D Image-Based Radiology Report Review | - | 0
ECHO: Event-Centric Hypergraph Operations via Multi-Agent Collaboration for Multimedia Event Extraction | - | 0
Three-dimensional reconstruction and segmentation of an aggregate stockpile for size and shape analyses | - | 0
One step further with Monte-Carlo sampler to guide diffusion better | - | 0
TimeSpot: Benchmarking Geo-Temporal Understanding in Vision-Language Models in Real-World Settings | - | 0
Soft Equivariance Regularization for Invariant Self-Supervised Learning | - | 0
Spectral Gaps and Spatial Priors: Studying Hyperspectral Downstream Adaptation Using TerraMind | - | 0
Activity Recognition from Smart Insole Sensor Data Using a Circular Dilated CNN | - | 0
Page 222 of 13232