SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9401 to 9425 of 474,278 papers

Title | Status | Hype
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World | Code | 2
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization | Code | 2
PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling | Code | 2
Towards Large-Scale Training of Pathology Foundation Models | Code | 2
Omni-Kernel Network for Image Restoration | Code | 2
A Transformer approach for Electricity Price Forecasting | Code | 2
In-Context Matting | Code | 2
Space Group Informed Transformer for Crystalline Materials Generation | Code | 2
Adaptive Super Resolution For One-Shot Talking-Head Generation | Code | 2
An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning | Code | 2
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Code | 2
Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers | Code | 2
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement | Code | 2
Transfer CLIP for Generalizable Image Denoising | Code | 2
Addressing Concept Shift in Online Time Series Forecasting: Detect-then-Adapt | Code | 2
Shadow Generation for Composite Image Using Diffusion model | Code | 2
YOLOv5-6D: Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries | Code | 2
Neural Plasticity-Inspired Multimodal Foundation Model for Earth Observation | Code | 2
InterFusion: Text-Driven Generation of 3D Human-Object Interaction | Code | 2
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | Code | 2
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis | Code | 2
LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels | Code | 2
Construction of a Japanese Financial Benchmark for Large Language Models | Code | 2
Understanding the Ranking Loss for Recommendation with Sparse User Feedback | Code | 2
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models | Code | 2