SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6501 to 6525 of 474,278 papers

Title | Status | Hype
Text2midi: Generating Symbolic Music from Captions | Code | 2
A Generalizable Anomaly Detection Method in Dynamic Graphs | Code | 2
Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking | Code | 2
Mapping the Mind of an Instruction-based Image Editing using SMILE | Code | 2
Offline Reinforcement Learning for LLM Multi-Step Reasoning | Code | 2
FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF | Code | 2
PruneVid: Visual Token Pruning for Efficient Video Large Language Models | Code | 2
MR-GDINO: Efficient Open-World Continual Object Detection | Code | 2
ChangeDiff: A Multi-Temporal Change Detection Data Generator with Flexible Text Prompts via Diffusion Model | Code | 2
XRAG: eXamining the Core -- Benchmarking Foundational Components in Advanced Retrieval-Augmented Generation | Code | 2
fluke: Federated Learning Utility frameworK for Experimentation and research | Code | 2
Personalized Representation from Personalized Generation | Code | 2
PyBOP: A Python package for battery model optimisation and parameterisation | Code | 2
Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration | Code | 2
Preventing Local Pitfalls in Vector Quantization via Optimal Transport | Code | 2
DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space | Code | 2
MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | Code | 2
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis | Code | 2
Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization through Spare-Coding Transformer | Code | 2
Agent-SafetyBench: Evaluating the Safety of LLM Agents | Code | 2
Tests for model misspecification in simulation-based inference: from local distortions to global model checks | Code | 2
Fietje: An open, efficient LLM for Dutch | Code | 2
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching | Code | 2
Next Patch Prediction for Autoregressive Visual Generation | Code | 2
Learning charges and long-range interactions from energies and forces | Code | 2