SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

658,356 papers, 258,216 code links, 4,818 tasks

Papers

Showing 76-100 of 658,356 papers

Title | Status | Hype
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer | Code | 11
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Code | 11
SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning | Code | 11
SAM 2: Segment Anything in Images and Videos | Code | 11
Very Large-Scale Multi-Agent Simulation in AgentScope | Code | 11
Gymnasium: A Standard Interface for Reinforcement Learning Environments | Code | 11
Deep Time Series Models: A Comprehensive Survey and Benchmark | Code | 11
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Code | 11
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens | Code | 11
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs | Code | 11
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control | Code | 11
Scaling Synthetic Data Creation with 1,000,000,000 Personas | Code | 11
NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security | Code | 11
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | Code | 11
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness | Code | 11
Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation | Code | 11
YOLOv10: Real-Time End-to-End Object Detection | Code | 11
USP: A Unified Sequence Parallelism Approach for Long Context Generative AI | Code | 11
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | Code | 11
KAN: Kolmogorov-Arnold Networks | Code | 11
Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models | Code | 11
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence | Code | 11
AutoDev: Automated AI-Driven Development | Code | 11
LangGPT: Rethinking Structured Reusable Prompt Design Framework for LLMs from the Programming Language | Code | 11
AgentScope: A Flexible yet Robust Multi-Agent Platform | Code | 11