SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 14001-14050 of 474278 papers

Title | Status | Hype
Pay Less Attention to Deceptive Artifacts: Robust Detection of Compressed Deepfakes on Online Social Networks | Code | 0
Memento: Note-Taking for Your Future Self | | 0
Probing AI Safety with Source Code | Code | 0
Feature Hallucination for Self-supervised Action Recognition | | 0
Permutation Equivariant Neural Controlled Differential Equations for Dynamic Graph Representation Learning | | 0
Producer-Fairness in Sequential Bundle Recommendation | | 0
GymPN: A Library for Decision-Making in Process Management Systems | | 0
Enterprise Large Language Model Evaluation Benchmark | | 0
DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs | | 0
BrokenVideos: A Benchmark Dataset for Fine-Grained Artifact Localization in AI-Generated Videos | | 0
SEED: A Structural Encoder for Embedding-Driven Decoding in Time Series Prediction with LLMs | | 0
Valid Selection among Conformal Sets | | 0
Learning Moderately Input-Sensitive Functions: A Case Study in QR Code Decoding | | 0
Progressive Alignment Degradation Learning for Pansharpening | | 0
Directed Link Prediction using GNN with Local and Global Feature Fusion | | 0
Time-series surrogates from energy consumers generated by machine learning approaches for long-term forecasting scenarios | | 0
DipSVD: Dual-importance Protected SVD for Efficient LLM Compression | | 0
Client Clustering Meets Knowledge Sharing: Enhancing Privacy and Robustness in Personalized Peer-to-Peer Learning | | 0
Off-Policy Evaluation and Learning for the Future under Non-Stationarity | | 0
An Agentic System for Rare Disease Diagnosis with Traceable Reasoning | | 0
Counterfactual Influence as a Distributional Quantity | | 0
Large Language Model-Driven Code Compliance Checking in Building Information Modeling | | 0
Dense Video Captioning using Graph-based Sentence Summarization | | 0
Weighted Mean Frequencies: a handcraft Fourier feature for 4D Flow MRI segmentation | | 0
Inside you are many wolves: Using cognitive models to interpret value trade-offs in LLMs | | 0
ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset | | 0
Intrinsic vs. Extrinsic Evaluation of Czech Sentence Embeddings: Semantic Relevance Doesn't Help with MT Evaluation | | 0
CBF-AFA: Chunk-Based Multi-SSL Fusion for Automatic Fluency Assessment | | 0
FundaQ-8: A Clinically-Inspired Scoring Framework for Automated Fundus Image Quality Assessment | | 0
Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards | | 0
A Survey of Predictive Maintenance Methods: An Analysis of Prognostics via Classification and Regression | | 0
On the ability of Deep Neural Networks to Learn Granger Causality in Multi-Variate Time Series Data | | 0
Towards Interpretable and Efficient Feature Selection in Trajectory Datasets: A Taxonomic Approach | | 0
Physics-Informed Machine Learning Regulated by Finite Element Analysis for Simulation Acceleration of Laser Powder Bed Fusion | | 0
Demonstration of effective UCB-based routing in skill-based queues on real-world data | | 0
Exploring Graph-Transformer Out-of-Distribution Generalization Abilities | | 0
Mastering Multiple-Expert Routing: Realizable H-Consistency and Strong Guarantees for Learning to Defer | | 0
Causal Representation Learning with Observational Grouping for CXR Classification | | 0
Multimodal Representation Learning and Fusion | | 0
GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching | Code | 1
A foundation model with multi-variate parallel attention to generate neuronal activity | Code | 1
Q-resafe: Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models | Code | 1
Fine-Tuning and Prompt Engineering of LLMs, for the Creation of Multi-Agent AI for Addressing Sustainable Protein Production Challenges | Code | 0
FedBKD: Distilled Federated Learning to Embrace Gerneralization and Personalization on Non-IID Data | Code | 0
Argumentative Ensembling for Robust Recourse under Model Multiplicity | Code | 0
WattsOnAI: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads | Code | 1
A Multi-Pass Large Language Model Framework for Precise and Efficient Radiology Report Error Detection | Code | 0
Tackling Data Heterogeneity in Federated Learning through Knowledge Distillation with Inequitable Aggregation | Code | 0
The kernel of graph indices for vector search | Code | 0
Language Modeling by Language Models | Code | 2
Page 281 of 9486