SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5901-5950 of 661570 papers

Title | Status | Hype
HumanMM: Global Human Motion Recovery from Multi-shot Videos | Code | 2
Griffin: Aerial-Ground Cooperative Detection and Tracking Dataset and Benchmark | Code | 2
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs | Code | 2
Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction | Code | 2
Is CLIP ideal? No. Can we fix it? Yes! | Code | 2
SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection | Code | 2
Controllable 3D Outdoor Scene Generation via Scene Graphs | Code | 2
A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis | Code | 2
DaD: Distilled Reinforcement Learning for Diverse Keypoint Detection | Code | 2
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning | Code | 2
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model | Code | 2
Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking | Code | 2
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models | Code | 2
Learning Few-Step Diffusion Models by Trajectory Distribution Matching | Code | 2
Axes that matter: PCA with a difference | Code | 2
Emulating Self-attention with Convolution for Efficient Image Super-Resolution | Code | 2
DiffCLIP: Differential Attention Meets CLIP | Code | 2
DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion | Code | 2
Agent models: Internalizing Chain-of-Action Generation into Reasoning models | Code | 2
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation | Code | 2
Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models? | Code | 2
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs | Code | 2
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding | Code | 2
Large Language Models Post-training: Surveying Techniques from Alignment to Reasoning | Code | 2
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model | Code | 2
A Noise-Robust Turn-Taking System for Real-World Dialogue Robots: A Field Experiment | Code | 2
Slim attention: cut your context memory in half without loss of accuracy -- K-cache is all you need for MHA | Code | 2
EDM: Efficient Deep Feature Matching | Code | 2
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval | Code | 2
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching | Code | 2
DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving | Code | 2
Encrypted Vector Similarity Computations Using Partially Homomorphic Encryption: Applications and Performance Analysis | Code | 2
PromptPex: Automatic Test Generation for Language Model Prompts | Code | 2
CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images | Code | 2
D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS | Code | 2
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts | Code | 2
WritingBench: A Comprehensive Benchmark for Generative Writing | Code | 2
Omnidirectional Multi-Object Tracking | Code | 2
Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior | Code | 2
Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities | Code | 2
ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids | Code | 2
Real-time Spatial-temporal Traversability Assessment via Feature-based Sparse Gaussian Process | Code | 2
AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM | Code | 2
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant | Code | 2
Scaling Rich Style-Prompted Text-to-Speech Datasets | Code | 2
Generalized Interpolating Discrete Diffusion | Code | 2
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model | Code | 2
PDX: A Data Layout for Vector Similarity Search | Code | 2
BANet: Bilateral Aggregation Network for Mobile Stereo Matching | Code | 2
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Code | 2
Page 119 of 13232