SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1826–1850 of 177339 papers

Title | Status | Hype
Guiding Instruction-based Image Editing via Multimodal Large Language Models | Code | 4
Taming Scalable Visual Tokenizer for Autoregressive Image Generation | Code | 4
SocialED: A Python Library for Social Event Detection | Code | 4
OLMoE: Open Mixture-of-Experts Language Models | Code | 4
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders | Code | 4
On the limits of agency in agent-based models | Code | 4
Visual Attention Network | Code | 4
Large Language Models for Time Series: A Survey | Code | 4
LLMMapReduce: Simplified Long-Sequence Processing using Large Language Models | Code | 4
OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination | Code | 4
TradeMaster: A Holistic Quantitative Trading Platform Empowered by Reinforcement Learning | Code | 4
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster | Code | 4
Large Language Model-Based Agents for Software Engineering: A Survey | Code | 4
R1-Onevision: An Open-Source Multimodal Large Language Model Capable of Deep Reasoning | Code | 4
Training Sparse Mixture Of Experts Text Embedding Models | Code | 4
PyTorch Adapt | Code | 4
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality | Code | 4
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models | Code | 4
MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning | Code | 4
Images Speak in Images: A Generalist Painter for In-Context Visual Learning | Code | 4
DreamGen: Unlocking Generalization in Robot Learning through Video World Models | Code | 4
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning | Code | 4
Cognitive Architectures for Language Agents | Code | 4
AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data | Code | 4
Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition | Code | 4