SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 751–800 of 659,983 papers

| Title | Status | Hype |
| --- | --- | --- |
| Measuring Taiwanese Mandarin Language Understanding | Code | 5 |
| Exploring GLU Expansion Ratios: A Study of Structured Pruning in LLaMA-3.2 Models | Code | 5 |
| OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement | Code | 5 |
| LAB: Large-Scale Alignment for ChatBots | Code | 5 |
| ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning | Code | 5 |
| HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation | Code | 5 |
| OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models | Code | 5 |
| MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion | Code | 5 |
| MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos | Code | 5 |
| Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions | Code | 5 |
| IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models | Code | 5 |
| Uni-Mol2: Exploring Molecular Pretraining Model at Scale | Code | 5 |
| TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second | Code | 5 |
| Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling | Code | 5 |
| Zero-shot Image Editing with Reference Imitation | Code | 5 |
| LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models | Code | 5 |
| StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models | Code | 5 |
| Focus Anywhere for Fine-grained Multi-page Document Understanding | Code | 5 |
| Improving Text-To-Audio Models with Synthetic Captions | Code | 5 |
| DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos | Code | 5 |
| DreamFusion: Text-to-3D using 2D Diffusion | Code | 5 |
| OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation | Code | 5 |
| 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities | Code | 5 |
| BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval | Code | 5 |
| EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts | Code | 5 |
| StarCoder: may the source be with you! | Code | 5 |
| Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters | Code | 5 |
| Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models | Code | 5 |
| Jamba-1.5: Hybrid Transformer-Mamba Models at Scale | Code | 5 |
| XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models | Code | 5 |
| SpinQuant: LLM quantization with learned rotations | Code | 5 |
| Image Vectorization: a Review | Code | 5 |
| Zephyr: Direct Distillation of LM Alignment | Code | 5 |
| ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models | Code | 5 |
| DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | Code | 5 |
| ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing | Code | 5 |
| 3D Reconstruction with Spatial Memory | Code | 5 |
| RAG-R1 : Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism | Code | 5 |
| Transformers without Normalization | Code | 5 |
| VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Code | 5 |
| Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens | Code | 5 |
| Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following | Code | 5 |
| Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs | Code | 5 |
| Benchmarking the Myopic Trap: Positional Bias in Information Retrieval | Code | 5 |
| Randomized Autoregressive Visual Generation | Code | 5 |
| DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition | Code | 5 |
| FlowTok: Flowing Seamlessly Across Text and Image Tokens | Code | 5 |
| Loki: An Open-Source Tool for Fact Verification | Code | 5 |
| NeuralSVG: An Implicit Representation for Text-to-Vector Generation | Code | 5 |
| The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use | Code | 5 |