SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 13651–13700 of 474278 papers

Title | Status | Hype
Deception Detection in Dyadic Exchanges Using Multimodal Machine Learning: A Study on a Swedish Cohort |  | 0
SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning |  | 0
Distributed Cross-Channel Hierarchical Aggregation for Foundation Models |  | 0
Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test |  | 0
Predictive Maintenance Optimization for Smart Vending Machines Using IoT and Machine Learning |  | 0
Graph-Structured Feedback Multimodel Ensemble Online Conformal Prediction |  | 0
KaLM-Embedding-V2: Superior Training Techniques and Data Inspire A Versatile Embedding Model | Code | 2
ComRAG: Retrieval-Augmented Generation with Dynamic Vector Stores for Real-time Community Question Answering in Industry |  | 0
TableMoE: Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding | Code | 0
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language | Code | 3
Can "consciousness" be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis |  | 0
ODE_t (ODE_l): Shortcutting the Time and Length in Diffusion and Flow Models for Faster Sampling | Code | 0
Complexity-aware fine-tuning | Code | 0
Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents | Code | 0
Model State Arithmetic for Machine Unlearning | Code | 0
Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning | Code | 0
CaloHadronic: a diffusion model for the generation of hadronic showers | Code | 0
Domain Knowledge-Enhanced LLMs for Fraud and Concept Drift Detection |  | 0
Scalable Bayesian Low-Rank Adaptation of Large Language Models via Stochastic Variational Subspace Inference | Code | 0
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge |  | 0
Distilling Normalizing Flows |  | 0
Large Language Models Acing Chartered Accountancy |  | 0
Process mining-driven modeling and simulation to enhance fault diagnosis in cyber-physical systems |  | 0
rQdia: Regularizing Q-Value Distributions With Image Augmentation |  | 0
Optimising Language Models for Downstream Tasks: A Post-Training Perspective |  | 0
Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning |  | 0
Efficient Skill Discovery via Regret-Aware Optimization |  | 0
Compressed and Smooth Latent Space for Text Diffusion Modeling |  | 0
Enhancing LLM Tool Use with High-quality Instruction Data from Knowledge Graph |  | 0
Antibody Design and Optimization with Multi-scale Equivariant Graph Diffusion Models for Accurate Complex Antigen Binding | Code | 0
Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models | Code | 0
mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale | Code | 0
TopK Language Models |  | 0
Zero-Shot Learning for Obsolescence Risk Forecasting |  | 0
Latent Prototype Routing: Achieving Near-Perfect Load Balancing in Mixture-of-Experts | Code | 0
Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery | Code | 0
FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing | Code | 1
IPFormer-VideoLLM: Enhancing Multi-modal Video Understanding for Multi-shot Scenes |  | 0
DeOcc-1-to-3: 3D De-Occlusion from a Single Image via Self-Supervised Multi-View Diffusion | Code | 1
HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Code | 2
Detection of Breast Cancer Lumpectomy Margin with SAM-incorporated Forward-Forward Contrastive Learning | Code | 0
Video Virtual Try-on with Conditional Diffusion Transformer Inpainter |  | 0
Boosting Domain Generalized and Adaptive Detection with Diffusion Models: Fitness, Generalization, and Transferability | Code | 1
Active Learning for Manifold Gaussian Process Regression | Code | 0
MedPrompt: LLM-CNN Fusion with Weight Routing for Medical Image Segmentation and Classification |  | 0
Analysis of Null Related Beampattern Measures and Signal Quantization Effects for Linear Differential Microphone Arrays |  | 0
Forecasting Geopolitical Events with a Sparse Temporal Fusion Transformer and Gaussian Process Hybrid: A Case Study in Middle Eastern and U.S. Conflict Dynamics |  | 0
Linearity-based neural network compression |  | 0
A Semi-supervised Scalable Unified Framework for E-commerce Query Classification |  | 0
Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval |  | 0
Page 274 of 9486