SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6951-7000 of 661,570 papers

Title | Status | Hype
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models | Code | 2
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention | Code | 2
Hammer: Robust Function-Calling for On-Device Language Models via Function Masking | Code | 2
SyllableLM: Learning Coarse Semantic Units for Speech Language Models | Code | 2
UniMuMo: Unified Text, Music and Motion Generation | Code | 2
TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting | Code | 2
A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models | Code | 2
nnSAM: Plug-and-play Segment Anything Model Improves nnUNet Performance | Code | 2
CursorCore: Assist Programming through Aligning Anything | Code | 2
Compositional Entailment Learning for Hyperbolic Vision-Language Models | Code | 2
Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate | Code | 2
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models | Code | 2
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Code | 2
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration | Code | 2
Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning | Code | 2
LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction | Code | 2
Progressive Autoregressive Video Diffusion Models | Code | 2
IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera | Code | 2
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions | Code | 2
An Undetectable Watermark for Generative Image Models | Code | 2
From Cognition to Precognition: A Future-Aware Framework for Social Navigation | Code | 2
VideoAgent: Self-Improving Video Generation | Code | 2
Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues | Code | 2
MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding | Code | 2
Evaluating Morphological Compositional Generalization in Large Language Models | Code | 2
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning | Code | 2
DM-Codec: Distilling Multimodal Representations for Speech Tokenization | Code | 2
GPT or BERT: why not both? | Code | 2
Model merging with SVD to tie the Knots | Code | 2
SciPIP: An LLM-based Scientific Paper Idea Proposer | Code | 2
Ada-MSHyper: Adaptive Multi-Scale Hypergraph Transformer for Time Series Forecasting | Code | 2
DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection | Code | 2
MetaOpenFOAM: an LLM-based multi-agent framework for CFD | Code | 2
PyGen: A Collaborative Human-AI Approach to Python Package Creation | Code | 2
Disentangling Memory and Reasoning Ability in Large Language Models | Code | 2
MMGenBench: Evaluating the Limits of LMMs from the Text-to-Image Generation Perspective | Code | 2
vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation | Code | 2
TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models | Code | 2
TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting | Code | 2
Lost & Found: Tracking Changes from Egocentric Observations in 3D Dynamic Scene Graphs | Code | 2
X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models | Code | 2
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking | Code | 2
FLAIR: VLM with Fine-grained Language-informed Image Representations | Code | 2
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario | Code | 2
SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Code | 2
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation | Code | 2
JPC: Flexible Inference for Predictive Coding Networks in JAX | Code | 2
MESA: Effective Matching Redundancy Reduction by Semantic Area Segmentation | Code | 2
DriveMM: All-in-One Large Multimodal Model for Autonomous Driving | Code | 2
MAC-Ego3D: Multi-Agent Gaussian Consensus for Real-Time Collaborative Ego-Motion and Photorealistic 3D Reconstruction | Code | 2