SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2851-2900 of 177339 papers

Title | Status | Hype
Evaluation Report on MCP Servers | Code | 3
ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation | Code | 3
UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving | Code | 3
RadioDiff: An Effective Generative Diffusion Model for Sampling-Free Dynamic Radio Map Construction | Code | 3
Privacy-Preserving Tree-Based Inference with TFHE | Code | 3
Retrieval Head Mechanistically Explains Long-Context Factuality | Code | 3
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer | Code | 3
Simple and Fast Distillation of Diffusion Models | Code | 3
The Mamba in the Llama: Distilling and Accelerating Hybrid Models | Code | 3
Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting | Code | 3
UNETR: Transformers for 3D Medical Image Segmentation | Code | 3
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Code | 3
Nimbus: Secure and Efficient Two-Party Inference for Transformers | Code | 3
uniGradICON: A Foundation Model for Medical Image Registration | Code | 3
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet | Code | 3
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Code | 3
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach | Code | 3
Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model | Code | 3
MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model | Code | 3
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis | Code | 3
SceneCraft: Layout-Guided 3D Scene Generation | Code | 3
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Code | 3
Local All-Pair Correspondence for Point Tracking | Code | 3
The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Safety Analysis | Code | 3
Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization | Code | 3
Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction | Code | 3
MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library | Code | 3
UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation | Code | 3
GraphNeuralNetworks.jl: Deep Learning on Graphs with Julia | Code | 3
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL | Code | 3
A Simple Framework for Open-Vocabulary Segmentation and Detection | Code | 3
LinFusion: 1 GPU, 1 Minute, 16K Image | Code | 3
CHESS: Contextual Harnessing for Efficient SQL Synthesis | Code | 3
Flexible and Scalable Deep Learning with MMLSpark | Code | 3
A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness | Code | 3
Why Transformers Need Adam: A Hessian Perspective | Code | 3
LiftFeat: 3D Geometry-Aware Local Feature Matching | Code | 3
An Empirical Study on Prompt Compression for Large Language Models | Code | 3
This Time is Different: An Observability Perspective on Time Series Foundation Models | Code | 3
Image and Video Tokenization with Binary Spherical Quantization | Code | 3
VoiceStar: Robust Zero-Shot Autoregressive TTS with Duration Control and Extrapolation | Code | 3
Distilling LLM Agent into Small Models with Retrieval and Code Tools | Code | 3
Highly Compressed Tokenizer Can Generate Without Training | Code | 3
When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation | Code | 3
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Code | 3
Discrete Diffusion in Large Language and Multimodal Models: A Survey | Code | 3
Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Code | 3
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language | Code | 3
No time to train! Training-Free Reference-Based Instance Segmentation | Code | 3
BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response | Code | 3
Page 58 of 3547