SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 3251-3275 of 661,570 papers

Title | Status | Hype
SAM-Med2D | Code | 3
Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents | Code | 3
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Code | 3
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | Code | 3
GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation | Code | 3
Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details | Code | 3
ResearchTown: Simulator of Human Research Community | Code | 3
From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision | Code | 3
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs | Code | 3
LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion | Code | 3
TorchDrug: A Powerful and Flexible Machine Learning Platform for Drug Discovery | Code | 3
MathArena: Evaluating LLMs on Uncontaminated Math Competitions | Code | 3
Frequency-aware Feature Fusion for Dense Image Prediction | Code | 3
VoiceBench: Benchmarking LLM-Based Voice Assistants | Code | 3
LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation | Code | 3
MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents | Code | 3
GS-SDF: LiDAR-Augmented Gaussian Splatting and Neural SDF for Geometrically Consistent Rendering and Reconstruction | Code | 3
Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals | Code | 3
Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription | Code | 3
PointCNN: Convolution On X-Transformed Points | Code | 3
OverleafCopilot: Empowering Academic Writing in Overleaf with Large Language Models | Code | 3
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey | Code | 3
Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption | Code | 3
Game-theoretic LLM: Agent Workflow for Negotiation Games | Code | 3
Tracking Anything with Decoupled Video Segmentation | Code | 3