SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 14051-14100 of 474,278 papers

Title | Status | Hype
EAR: Erasing Concepts from Unified Autoregressive Models | Code | 0
Industrial Energy Disaggregation with Digital Twin-generated Dataset and Efficient Data Augmentation | Code | 0
Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm | Code | 0
Benchmarking Unsupervised Strategies for Anomaly Detection in Multivariate Time Series | Code | 0
The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind | Code | 1
Comparative Analysis of Deep Learning Models for Crop Disease Detection: A Transfer Learning Approach | - | 0
Hear No Evil: Detecting Gradient Leakage by Malicious Servers in Federated Learning | - | 0
CCRS: A Zero-Shot LLM-as-a-Judge Framework for Comprehensive RAG Evaluation | - | 0
DeepQuark: deep-neural-network approach to multiquark bound states | - | 0
From Codicology to Code: A Comparative Study of Transformer and YOLO-based Detectors for Layout Analysis in Historical Documents | - | 0
TAPS: Tool-Augmented Personalisation via Structured Tagging | Code | 0
Narrative Shift Detection: A Hybrid Approach of Dynamic Topic Models and Large Language Models | Code | 0
SV-LLM: An Agentic Approach for SoC Security Verification using Large Language Models | - | 0
Knowledge-Aware Diverse Reranking for Cross-Source Question Answering | - | 0
Bridging Compositional and Distributional Semantics: A Survey on Latent Semantic Geometry via AutoEncoder | - | 0
Collaborative Batch Size Optimization for Federated Learning | - | 0
SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization | - | 0
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | Code | 4
Case-based Reasoning Augmented Large Language Model Framework for Decision Making in Realistic Safety-Critical Driving Scenarios | - | 0
Towards Community-Driven Agents for Machine Learning Engineering | Code | 0
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling | Code | 2
H-FEX: A Symbolic Learning Method for Hamiltonian Systems | - | 0
CogGen: A Learner-Centered Generative AI Architecture for Intelligent Tutoring with Programming Video | - | 0
Show, Tell and Summarize: Dense Video Captioning Using Visual Cue Aided Sentence Summarization | - | 0
Enhancing Large Language Models through Structured Reasoning | - | 0
A Modular Multitask Reasoning Framework Integrating Spatio-temporal Models and LLMs | - | 0
Mixtures of Neural Cellular Automata: A Stochastic Framework for Growth Modelling and Self-Organization | - | 0
Tabular Feature Discovery With Reasoning Type Exploration | - | 0
Automatic Demonstration Selection for LLM-based Tabular Data Classification | - | 0
How to Retrieve Examples in In-context Learning to Improve Conversational Emotion Recognition using Large Language Models? | - | 0
Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration | - | 0
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition | - | 0
When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs | - | 0
Vulnerability Disclosure through Adaptive Black-Box Adversarial Attacks on NIDS | - | 0
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content | - | 0
MEL: Multi-level Ensemble Learning for Resource-Constrained Environments | - | 0
Causal discovery in deterministic discrete LTI-DAE systems | - | 0
Distilling A Universal Expert from Clustered Federated Learning | - | 0
TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis | - | 0
WallStreetFeds: Client-Specific Tokens as Investment Vehicles in Federated Learning | - | 0
Self-Supervised Graph Learning via Spectral Bootstrapping and Laplacian-Based Augmentations | Code | 0
AI Assistants to Enhance and Exploit the PETSc Knowledge Base | - | 0
Deciphering GunType Hierarchy through Acoustic Analysis of Gunshot Recordings | - | 0
Irec: A Metacognitive Scaffolding for Self-Regulated Learning through Just-in-Time Insight Recall: A Conceptual Framework and System Prototype | - | 0
Leveraging AI Graders for Missing Score Imputation to Achieve Accurate Ability Estimation in Constructed-Response Tests | - | 0
AALC: Large Language Model Efficient Reasoning via Adaptive Accuracy-Length Control | Code | 0
COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees | - | 0
Causal Operator Discovery in Partial Differential Equations via Counterfactual Physics-Informed Neural Networks | - | 0
Efficient Federated Learning with Encrypted Data Sharing for Data-Heterogeneous Edge Devices | - | 0
Generating and Customizing Robotic Arm Trajectories using Neural Networks | Code | 0
Page 282 of 9486