SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8401-8450 of 661,570 papers

Title | Status | Hype
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection | Code | 2
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization | Code | 2
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM | Code | 2
VillagerAgent: A Graph-Based Multi-Agent Framework for Coordinating Complex Task Dependencies in Minecraft | Code | 2
Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language | Code | 2
Hello Again! LLM-powered Personalized Agent for Long-term Dialogue | Code | 2
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States | Code | 2
Binarized Diffusion Model for Image Super-Resolution | Code | 2
F-LMM: Grounding Frozen Large Multimodal Models | Code | 2
A DeNoising FPN With Transformer R-CNN for Tiny Object Detection | Code | 2
WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark | Code | 2
Bits-to-Photon: End-to-End Learned Scalable Point Cloud Compression for Direct Rendering | Code | 2
Attention as a Hypernetwork | Code | 2
Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples | Code | 2
Medical Vision Generalist: Unifying Medical Imaging Tasks in Context | Code | 2
LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models | Code | 2
Streaming Diffusion Policy: Fast Policy Synthesis with Variable Noise Diffusion Models | Code | 2
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning | Code | 2
Predictive Dynamic Fusion | Code | 2
The Russian Legislative Corpus | Code | 2
Spectrum: Targeted Training on Signal to Noise Ratio | Code | 2
Hibou: A Family of Foundational Vision Transformers for Pathology | Code | 2
Mixed-Curvature Decision Trees and Random Forests | Code | 2
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model | Code | 2
STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting | Code | 2
Faster Than Lies: Real-time Deepfake Detection using Binary Neural Networks | Code | 2
Split-and-Fit: Learning B-Reps via Structure-Aware Voronoi Partitioning | Code | 2
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination | Code | 2
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models | Code | 2
MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Code | 2
Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis | Code | 2
Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting | Code | 2
Tool-Planner: Task Planning with Clusters across Multiple Tools | Code | 2
GenAI Arena: An Open Evaluation Platform for Generative Models | Code | 2
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data | Code | 2
Parameter-Inverted Image Pyramid Networks | Code | 2
Simplified and Generalized Masked Diffusion for Discrete Data | Code | 2
BLSP-Emo: Towards Empathetic Large Speech-Language Models | Code | 2
Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning | Code | 2
How Far Can We Compress Instant-NGP-Based NeRF? | Code | 2
MAIRA-2: Grounded Radiology Report Generation | Code | 2
UltraMedical: Building Specialized Generalists in Biomedicine | Code | 2
Evaluating the World Model Implicit in a Generative Model | Code | 2
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt | Code | 2
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis | Code | 2
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data | Code | 2
Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction | Code | 2
Mini Honor of Kings: A Lightweight Environment for Multi-Agent Reinforcement Learning | Code | 2
CDMamba: Incorporating Local Clues into Mamba for Remote Sensing Image Binary Change Detection | Code | 2
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling | Code | 2