SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8276–8300 of 474,278 papers

Title | Status | Hype
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models | Code | 2
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging | Code | 2
ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO | Code | 2
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs | Code | 2
GUICourse: From General Vision Language Models to Versatile GUI Agents | Code | 2
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models | Code | 2
Understanding Multi-Granularity for Open-Vocabulary Part Segmentation | Code | 2
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Code | 2
STAR: Scale-wise Text-to-image generation via Auto-Regressive representations | Code | 2
Ontology Embedding: A Survey of Methods, Applications and Resources | Code | 2
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models | Code | 2
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models | Code | 2
Kolmogorov Arnold Informed neural network: A physics-informed deep learning framework for solving forward and inverse problems based on Kolmogorov Arnold Networks | Code | 2
Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly | Code | 2
Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection | Code | 2
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation | Code | 2
Text-space Graph Foundation Models: Comprehensive Benchmarks and New Insights | Code | 2
CrossFuse: A Novel Cross Attention Mechanism based Infrared and Visible Image Fusion Approach | Code | 2
CHiSafetyBench: A Chinese Hierarchical Safety Benchmark for Large Language Models | Code | 2
GradeADreamer: Enhanced Text-to-3D Generation Using Gaussian Splatting and Multi-View Diffusion | Code | 2
Make It Count: Text-to-Image Generation with an Accurate Number of Objects | Code | 2
Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning | Code | 2
EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models | Code | 2
SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding | Code | 2
Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection | Code | 2
Page 332 of 18,972