SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 12251–12300 of 474278 papers

Title | Status | Hype
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis | Code | 2
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech | Code | 2
Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark | Code | 2
Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule | Code | 2
Autonomous Catheterization with Open-source Simulator and Expert Trajectory | Code | 2
Data-Centric Foundation Models in Computational Healthcare: A Survey | Code | 2
BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning | Code | 2
LRM-Zero: Training Large Reconstruction Models with Synthesized Data | Code | 2
DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models | Code | 2
Regional Tiny Stories: Using Small Models to Compare Language Learning and Tokenizer Performance | Code | 2
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement | Code | 2
TRACE: Temporal Grounding Video LLM via Causal Event Modeling | Code | 2
DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models | Code | 2
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention | Code | 2
A Survey on Large Language Models for Code Generation | Code | 2
Rethinking Optimization and Architecture for Tiny Language Models | Code | 2
FusionMamba: Efficient Remote Sensing Image Fusion with State Space Model | Code | 2
Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio | Code | 2
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models | Code | 2
THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models | Code | 2
BaryIR: Learning Multi-Source Unified Representation in Continuous Barycenter Space for Generalizable All-in-One Image Restoration | Code | 2
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models | Code | 2
GAUCHE: A Library for Gaussian Processes in Chemistry | Code | 2
NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023 | Code | 2
MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries | Code | 2
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | Code | 2
DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials | Code | 2
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning | Code | 2
Automated Capability Discovery via Model Self-Exploration | Code | 2
DiffDet4SAR: Diffusion-based Aircraft Target Detection Network for SAR Images | Code | 2
HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation | Code | 2
Accessing Vision Foundation Models at ImageNet-level Costs | Code | 2
SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support | Code | 2
Efficient Reasoning with Hidden Thinking | Code | 2
Machine Unlearning: Solutions and Challenges | Code | 2
Text2Human: Text-Driven Controllable Human Image Generation | Code | 2
SIFT: Grounding LLM Reasoning in Contexts via Stickers | Code | 2
Measuring Mathematical Problem Solving With the MATH Dataset | Code | 2
ProcessBench: Identifying Process Errors in Mathematical Reasoning | Code | 2
Towards Learning a Generalist Model for Embodied Navigation | Code | 2
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing | Code | 2
REST: Retrieval-Based Speculative Decoding | Code | 2
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning | Code | 2
Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models | Code | 2
OpenStreetView-5M: The Many Roads to Global Visual Geolocation | Code | 2
A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation | Code | 2
Unsupervised Learning for Joint Beamforming Design in RIS-aided ISAC Systems | Code | 2
Feedback Efficient Online Fine-Tuning of Diffusion Models | Code | 2
Physics-informed active learning for accelerating quantum chemical simulations | Code | 2
Enabling Large Language Models to Generate Text with Citations | Code | 2