SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 14551-14600 of 474278 papers

Title | Status | Hype
Optimizing Mastery Learning by Fast-Forwarding Over-Practice Steps | Code | 0
Reflective Verbal Reward Design for Pluralistic Alignment |  | 0
Enhancing Few-shot Keyword Spotting Performance through Pre-Trained Self-supervised Speech Models |  | 0
Trustworthy Chronic Disease Risk Prediction For Self-Directed Preventive Care via Medical Literature Validation |  | 0
Research on Model Parallelism and Data Parallelism Optimization Methods in Large Language Model-Based Recommendation Systems |  | 0
Leveling the Playing Field: Carefully Comparing Classical and Learned Controllers for Quadrotor Trajectory Tracking |  | 0
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Code | 1
RLRC: Reinforcement Learning-based Recovery for Compressed Vision-Language-Action Models |  | 0
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices | Code | 1
Scalable Machine Learning Algorithms using Path Signatures |  | 0
SELFI: Selective Fusion of Identity for Generalizable Deepfake Detection |  | 0
AdRo-FL: Informed and Secure Client Selection for Federated Learning in the Presence of Adversarial Aggregator |  | 0
Quantum-Hybrid Support Vector Machines for Anomaly Detection in Industrial Control Systems |  | 0
AI Safety vs. AI Security: Demystifying the Distinction and Boundaries |  | 0
CultureMERT: Continual Pre-Training for Cross-Cultural Music Representation Learning |  | 0
CEGA: A Cost-Effective Approach for Graph-Based Model Extraction and Acquisition | Code | 0
Pix2Geomodel: A Next-Generation Reservoir Geomodeling with Property-to-Property Translation |  | 0
Secure Energy Transactions Using Blockchain Leveraging AI for Fraud Detection and Energy Market Stability |  | 0
PhysUniBench: An Undergraduate-Level Physics Reasoning Benchmark for Multimodal Models |  | 0
Large Language Model-Driven Surrogate-Assisted Evolutionary Algorithm for Expensive Optimization | Code | 0
Towards AI Search Paradigm |  | 0
CORE-KG: An LLM-Driven Knowledge Graph Construction Framework for Human Smuggling Networks |  | 0
LSCD: Lomb-Scargle Conditioned Diffusion for Time series Imputation |  | 0
The Importance of Being Lazy: Scaling Limits of Continual Learning |  | 0
OmniReflect: Discovering Transferable Constitutions for LLM agents via Neuro-Symbolic Reflections |  | 0
RocketStack: A level-aware deep recursive ensemble learning framework with exploratory feature fusion and model pruning dynamics |  | 0
MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents | Code | 2
TransDreamerV3: Implanting Transformer In DreamerV3 | Code | 0
Self-supervised Feature Extraction for Enhanced Ball Detection on Soccer Robots |  | 0
UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making | Code | 0
Mesh-Informed Neural Operator : A Transformer Generative Approach | Code | 1
EHCube4P: Learning Epistatic Patterns Through Hypercube Graph Convolution Neural Network for Protein Fitness Function Estimation |  | 0
Efficient and faithful reconstruction of dynamical attractors using homogeneous differentiators | Code | 0
Challenges in Grounding Language in the Real World |  | 0
Identifiability of Deep Polynomial Neural Networks |  | 0
Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards |  | 0
Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search |  | 0
A Minimalist Optimizer Design for LLM Pretraining | Code | 1
Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings | Code | 1
Optimal Depth of Neural Networks |  | 0
From Lab to Factory: Pitfalls and Guidelines for Self-/Unsupervised Defect Detection on Low-Quality Industrial Images |  | 0
Wi-Fi Sensing Tool Release: Gathering 802.11ax Channel State Information from a Commercial Wi-Fi Access Point |  | 0
Soft decision trees for survival analysis |  | 0
Bayesian Joint Model of Multi-Sensor and Failure Event Data for Multi-Mode Failure Prediction |  | 0
Beamforming design for minimizing the signal power estimation error | Code | 0
Brain-inspired interpretable reservoir computing with resonant recurrent neural networks | Code | 0
Empirical Models of the Time Evolution of SPX Option Prices |  | 0
Low-Complexity Receiver Design for Affine Filter Bank Modulation |  | 0
Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models | Code | 2
Metapath-based Hyperbolic Contrastive Learning for Heterogeneous Graph Embedding |  | 0
Page 292 of 9486