SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9276-9300 of 474278 papers

| Title | Status | Hype |
| --- | --- | --- |
| Diffusion-Assisted Distillation for Self-Supervised Graph Representation Learning with MLPs | Code | 0 |
| World-To-Image: Grounding Text-to-Image Generation with Agent-Driven World Knowledge | Code | 0 |
| GUIDE: Towards Scalable Advising for Research Ideas | Code | 0 |
| BrainFLORA: Uncovering Brain Concept Representation via Multimodal Neural Embeddings | Code | 0 |
| LLM Microscope: What Model Internals Reveal About Answer Correctness and Context Utilization | Code | 0 |
| GenAR: Next-Scale Autoregressive Generation for Spatial Gene Expression Prediction | Code | 0 |
| PhaseFormer: From Patches to Phases for Efficient and Effective Time Series Forecasting | Code | 0 |
| AgentRL: Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework | Code | 0 |
| QCBench: Evaluating Large Language Models on Domain-Specific Quantitative Chemistry | Code | 0 |
| Enhanced Self-Distillation Framework for Efficient Spiking Neural Network Training | Code | 0 |
| MedEBench: Diagnosing Reliability in Text-Guided Medical Image Editing | | 0 |
| CoPA: Hierarchical Concept Prompting and Aggregating Network for Explainable Diagnosis | Code | 0 |
| Optimized Minimal 4D Gaussian Splatting | | 0 |
| No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models | | 0 |
| OpenCUA: Open Foundations for Computer-Use Agents | | 0 |
| SSFO: Self-Supervised Faithfulness Optimization for Retrieval-Augmented Generation | Code | 0 |
| Towards Robust and Generalizable Continuous Space-Time Video Super-Resolution with Events | Code | 0 |
| Optimizing Resources for On-the-Fly Label Estimation with Multiple Unknown Medical Experts | Code | 0 |
| Harnessing Synthetic Preference Data for Enhancing Temporal Understanding of Video-LLMs | Code | 0 |
| What Can You Do When You Have Zero Rewards During RL? | Code | 0 |
| Zero-Shot Fine-Grained Image Classification Using Large Vision-Language Models | Code | 0 |
| AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science | Code | 0 |
| ReTiDe: Real-Time Denoising for Energy-Efficient Motion Picture Processing with FPGAs | Code | 0 |
| LIBERO-PRO: Towards Robust and Fair Evaluation of Vision-Language-Action Models Beyond Memorization | Code | 0 |
| Active Attacks: Red-teaming LLMs via Adaptive Environments | Code | 0 |
Page 372 of 18972