SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 14251–14300 of 474278 papers

| Title | Status | Hype |
| --- | --- | --- |
| FlightKooba: A Fast Interpretable FTP Model | | 0 |
| MATER: Multi-level Acoustic and Textual Emotion Representation for Interpretable Speech Emotion Recognition | | 0 |
| Retrieval-Confused Generation is a Good Defender for Privacy Violation Attack of Large Language Models | | 0 |
| Orthogonal Soft Pruning for Efficient Class Unlearning | | 0 |
| Distillation-Enabled Knowledge Alignment for Generative Semantic Communications in AIGC Provisioning Tasks | | 0 |
| Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs | | 0 |
| What Matters in LLM-generated Data: Diversity and Its Effect on Model Fine-Tuning | | 0 |
| Evaluating Rare Disease Diagnostic Performance in Symptom Checkers: A Synthetic Vignette Simulation Approach | | 0 |
| A Comparative Analysis of Reinforcement Learning and Conventional Deep Learning Approaches for Bearing Fault Diagnosis | | 0 |
| Neuromorphic Wireless Split Computing with Resonate-and-Fire Neurons | | 0 |
| Verifiable Unlearning on Edge | | 0 |
| Learning Instruction-Following Policies through Open-Ended Instruction Relabeling with Large Language Models | | 0 |
| VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration | Code | 0 |
| Accurate and Energy Efficient: Local Retrieval-Augmented Generation Models Outperform Commercial Large Language Models in Medical Tasks | | 0 |
| Universal pre-training by iterated random computation | Code | 0 |
| HERCULES: Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization | Code | 1 |
| TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design | Code | 0 |
| A Spatio-Temporal Point Process for Fine-Grained Modeling of Reading Behavior | Code | 0 |
| Explaining deep neural network models for electricity price forecasting with XAI | | 0 |
| LSH-DynED: A Dynamic Ensemble Framework with LSH-Based Undersampling for Evolving Multi-Class Imbalanced Classification | Code | 0 |
| DIM-SUM: Dynamic IMputation for Smart Utility Management | | 0 |
| RepuNet: A Reputation System for Mitigating Malicious Clients in DFL | | 0 |
| Causal-Aware Intelligent QoE Optimization for VR Interaction with Adaptive Keyframe Extraction | | 0 |
| Hierarchical Reinforcement Learning and Value Optimization for Challenging Quadruped Locomotion | | 0 |
| Learning Bilateral Team Formation in Cooperative Multi-Agent Reinforcement Learning | | 0 |
| Prover Agent: An Agent-based Framework for Formal Mathematical Proofs | | 0 |
| AnchorDP3: 3D Affordance Guided Sparse Diffusion Policy for Robotic Manipulation | | 0 |
| Cross-Layer Discrete Concept Discovery for Interpreting Language Models | | 0 |
| Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents | | 0 |
| A Framework for Uncertainty Quantification Based on Nearest Neighbors Across Layers | | 0 |
| QHackBench: Benchmarking Large Language Models for Quantum Code Generation Using PennyLane Hackathon Challenges | | 0 |
| Can LLMs Replace Humans During Code Chunking? | | 0 |
| New Insights on Unfolding and Fine-tuning Quantum Federated Learning | | 0 |
| Automated Generation of Diverse Courses of Actions for Multi-Agent Operations using Binary Optimization and Graph Learning | | 0 |
| CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation | Code | 0 |
| GNN's Uncertainty Quantification using Self-Distillation | Code | 0 |
| Supervised Coupled Matrix-Tensor Factorization (SCMTF) for Computational Phenotyping of Patient Reported Outcomes in Ulcerative Colitis | Code | 0 |
| Any-Order GPT as Masked Diffusion Model: Decoupling Formulation and Architecture | Code | 1 |
| Introducing EG-IPT and ipt~: a novel electric guitar dataset and a new Max/MSP object for real-time classification of instrumental playing techniques | Code | 1 |
| Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals | | 0 |
| Enhancing Image Restoration Transformer via Adaptive Translation Equivariance | | 0 |
| SIM-Net: A Multimodal Fusion Network Using Inferred 3D Object Shape Point Clouds from RGB Images for 2D Classification | | 0 |
| RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base | | 0 |
| USVTrack: USV-Based 4D Radar-Camera Tracking Dataset for Autonomous Driving in Inland Waterways | | 0 |
| Benchmarking histopathology foundation models in a multi-center dataset for skin cancer subtyping | Code | 0 |
| YouTube-Occ: Learning Indoor 3D Semantic Occupancy Prediction from YouTube Videos | | 0 |
| TReB: A Comprehensive Benchmark for Evaluating Table Reasoning Capabilities of Large Language Models | | 0 |
| Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions | Code | 1 |
| LettinGo: Explore User Profile Generation for Recommendation System | | 0 |
| Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition | | 0 |