SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 16551–16600 of 474278 papers

Title | Status | Hype
GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization | Code | 1
An Empirical Study of GPT-4o Image Generation Capabilities | Code | 1
Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking | Code | 1
Retrieval Augmented Generation with Collaborative Filtering for Personalized Text Generation | Code | 1
A Control-Oriented Simplified Single Particle Model with Grouped Parameter and Sensitivity Analysis for Lithium-Ion Batteries | Code | 1
Temporal Alignment-Free Video Matching for Few-shot Action Recognition | Code | 1
FEABench: Evaluating Language Models on Multiphysics Reasoning Ability | Code | 1
Knowledge Graph Completion with Relation-Aware Anchor Enhancement | Code | 1
A Multi-Modal AI System for Screening Mammography: Integrating 2D and 3D Imaging to Improve Breast Cancer Detection in a Prospective Clinical Study | Code | 1
V-MAGE: A Game Evaluation Framework for Assessing Vision-Centric Capabilities in Multimodal Large Language Models | Code | 1
CamContextI2V: Context-aware Controllable Video Generation | Code | 1
Robo-taxi Fleet Coordination at Scale via Reinforcement Learning | Code | 1
Leanabell-Prover: Posttraining Scaling in Formal Reasoning | Code | 1
Why is Normalization Necessary for Linear Recommenders? | Code | 1
kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization | Code | 1
Reconstruction-Free Anomaly Detection with Diffusion Models via Direct Latent Likelihood Evaluation | Code | 1
HRMedSeg: Unlocking High-resolution Medical Image segmentation via Memory-efficient Attention Modeling | Code | 1
To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition | Code | 1
Learning Affine Correspondences by Integrating Geometric Constraints | Code | 1
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering | Code | 1
Predicting Survivability of Cancer Patients with Metastatic Patterns Using Explainable AI | Code | 1
mixEEG: Enhancing EEG Federated Learning for Cross-subject EEG Classification with Tailored mixup | Code | 1
Advanced Codebook Design for SCMA-aided NTNs With Randomly Distributed Users | Code | 1
Climplicit: Climatic Implicit Embeddings for Global Ecological Tasks | Code | 1
Continuous Locomotive Crowd Behavior Generation | Code | 1
Dynamic Vision Mamba | Code | 1
R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation | Code | 1
Data Augmentation as Free Lunch: Exploring the Test-Time Augmentation for Sequential Recommendation | Code | 1
ELT-Bench: An End-to-End Benchmark for Evaluating AI Agents on ELT Pipelines | Code | 1
On the Robustness of GUI Grounding Models Against Image Attacks | Code | 1
Concise Reasoning via Reinforcement Learning | Code | 1
3DM-WeConvene: Learned Image Compression with 3D Multi-Level Wavelet-Domain Convolution and Entropy Model | Code | 1
Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval | Code | 1
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration | Code | 1
System Log Parsing with Large Language Models: A Review | Code | 1
Embracing Dynamics: Dynamics-aware 4D Gaussian Splatting SLAM | Code | 1
Scaling Graph Neural Networks for Particle Track Reconstruction | Code | 1
A Desideratum for Conversational Agents: Capabilities, Challenges, and Future Directions | Code | 1
EquiCPI: SE(3)-Equivariant Geometric Deep Learning for Structure-Aware Prediction of Compound-Protein Interactions | Code | 1
Can LLM-Driven Hard Negative Sampling Empower Collaborative Filtering? Findings and Potentials | Code | 1
Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning | Code | 1
LoopGen: Training-Free Loopable Music Generation | Code | 1
CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization | Code | 1
COHESION: Composite Graph Convolutional Network with Dual-Stage Fusion for Multimodal Recommendation | Code | 1
WaveNet-Volterra Neural Networks for Active Noise Control: A Fully Causal Approach | Code | 1
Hessian of Perplexity for Large Language Models by PyTorch autograd (Open Source) | Code | 1
Window Token Concatenation for Efficient Visual Large Language Models | Code | 1
MSL: Not All Tokens Are What You Need for Tuning LLM as a Recommender | Code | 1
Collaboration and Controversy Among Experts: Rumor Early Detection by Tuning a Comment Generator | Code | 1
A Survey of Pathology Foundation Model: Progress and Future Directions | Code | 1
Page 332 of 9486