SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7501-7550 of 661,570 papers

Title | Status | Hype
DeepHistoViT: An Interpretable Vision Transformer Framework for Histopathological Cancer Classification | | 0
GPT4o-Receipt: A Dataset and Human Study for AI-Generated Document Forensics | | 0
MDS-VQA: Model-Informed Data Selection for Video Quality Assessment | | 0
CFD-HAR: User-controllable Privacy through Conditional Feature Disentanglement | | 0
MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks | | 0
Streaming Translation and Transcription Through Speech-to-Text Causal Alignment | | 0
EnTransformer: A Deep Generative Transformer for Multivariate Probabilistic Forecasting | | 0
InSpatio-WorldFM: An Open-Source Real-Time Generative Frame Model | | 0
ForensicZip: More Tokens are Better but Not Necessary in Forensic Vision-Language Models | | 0
RDNet: Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network in Optical Remote Sensing Images | | 0
Triple X: A LLM-Based Multilingual Speech Recognition System for the INTERSPEECH2025 MLC-SLM Challenge | | 0
What do near-optimal learning rate schedules look like? | | 0
Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency | | 0
A Geometrically-Grounded Drive for MDL-Based Optimization in Deep Learning | | 0
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement | | 0
Pruning-induced phases in fully-connected neural networks: the eumentia, the dementia, and the amentia | | 0
Maximum Entropy Exploration Without the Rollouts | | 0
Bridging the Gap Between Security Metrics and Key Risk Indicators: An Empirical Framework for Vulnerability Prioritization | | 0
Operationalising Cyber Risk Management Using AI: Connecting Cyber Incidents to MITRE ATT&CK Techniques, Security Controls, and Metrics | | 0
Learning Pore-scale Multiphase Flow from 4D Velocimetry | | 0
Delayed Backdoor Attacks: Exploring the Temporal Dimension as a New Attack Surface in Pre-Trained Models | | 0
FastLSQ: Solving PDEs in One Shot via Fourier Features with Exact Analytical Derivatives | Code | 0
Evaluation and LLM-Guided Learning of ICD Coding Rationales | | 0
LifeSim: Long-Horizon User Life Simulator for Personalized Assistant Evaluation | | 0
Deep Incentive Design with Differentiable Equilibrium Blocks | | 0
Gen-Fab: A Variation-Aware Generative Model for Predicting Fabrication Variations in Nanophotonic Devices | | 0
FlexRec: Adapting LLM-based Recommenders for Flexible Needs via Reinforcement Learning | | 0
Compiling Temporal Numeric Planning into Discrete PDDL+: Extended Version | | 0
Llettuce: An Open Source Natural Language Processing Tool for the Translation of Medical Terms into Uniform Clinical Encoding | | 0
Head-wise Adaptive Rotary Positional Encoding for Fine-Grained Image Generation | | 0
Thermodynamics of Reinforcement Learning Curricula | | 0
Temporal Straightening for Latent Planning | | 0
MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning | | 0
Generalizing Vision-Language Models with Dedicated Prompt Guidance | | 0
Entropy Guided Diversification and Preference Elicitation in Agentic Recommendation Systems | | 0
EvoFlows: Evolutionary Edit-Based Flow-Matching for Protein Engineering | | 0
Social, Legal, Ethical, Empathetic and Cultural Norm Operationalisation for AI Agents | | 0
Cornserve: A Distributed Serving System for Any-to-Any Multimodal Models | | 0
A Two-Stage Dual-Modality Model for Facial Emotional Expression Recognition | | 0
Causal Representation Learning with Optimal Compression under Complex Treatments | | 0
Exploiting Expertise of Non-Expert and Diverse Agents in Social Bandit Learning: A Free Energy Approach | | 0
CoMMET: To What Extent Can LLMs Perform Theory of Mind Tasks? | | 0
Real-World Point Tracking with Verifier-Guided Pseudo-Labeling | | 0
PreLoRA: Hybrid Pre-training of Vision Transformers with Full Training and Low-Rank Adapters | | 0
Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously | Code | 0
Evaluate-as-Action: Self-Evaluated Process Rewards for Retrieval-Augmented Agents | | 0
Hidden State Poisoning Attacks against Mamba-based Language Models | | 0
DRIFT: Dual-Representation Inter-Fusion Transformer for Automated Driving Perception with 4D Radar Point Clouds | | 0
SpectralGuard: Detecting Memory Collapse Attacks in State Space Models | | 0
LLM-Augmented Therapy Normalization and Aspect-Based Sentiment Analysis for Treatment-Resistant Depression on Reddit | | 0