SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 1451-1475 of 659,983 papers

Title | Status | Hype
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning | Code | 4
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters | Code | 4
A Closer Look at Deep Learning Methods on Tabular Datasets | Code | 4
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Code | 4
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference | Code | 4
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models | Code | 4
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference | Code | 4
XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL | Code | 4
VM-UNet: Vision Mamba UNet for Medical Image Segmentation | Code | 4
FedCP: Separating Feature Information for Personalized Federated Learning via Conditional Policy | Code | 4
Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | Code | 4
NExT-GPT: Any-to-Any Multimodal LLM | Code | 4
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | Code | 4
Eliminating Domain Bias for Federated Learning in Representation Space | Code | 4
MotionClone: Training-Free Motion Cloning for Controllable Video Generation | Code | 4
Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation | Code | 4
GIM: Learning Generalizable Image Matcher From Internet Videos | Code | 4
Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach | Code | 4
Pearl: A Production-ready Reinforcement Learning Agent | Code | 4
Towards All-in-One Medical Image Re-Identification | Code | 4
FinBen: A Holistic Financial Benchmark for Large Language Models | Code | 4
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation | Code | 4
Attention Mesh: High-fidelity Face Mesh Prediction in Real-time | Code | 4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token | Code | 4
Scaling and evaluating sparse autoencoders | Code | 4