SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6051–6100 of 177340 papers

Title | Status | Hype
Lenia - Biology of Artificial Life | Code | 2
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models | Code | 2
SGPT: GPT Sentence Embeddings for Semantic Search | Code | 2
AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans | Code | 2
Model Uncertainty in Evolutionary Optimization and Bayesian Optimization: A Comparative Analysis | Code | 2
AGILE: A Novel Reinforcement Learning Framework of LLM Agents | Code | 2
Learning Local Equivariant Representations for Large-Scale Atomistic Dynamics | Code | 2
How Can Time Series Analysis Benefit From Multiple Modalities? A Survey and Outlook | Code | 2
Bayesian Neural Networks for One-to-Many Mapping in Image Enhancement | Code | 2
BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion | Code | 2
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training | Code | 2
MicroFlow: An Efficient Rust-Based Inference Engine for TinyML | Code | 2
Advancing Time Series Classification with Multimodal Language Modeling | Code | 2
Trajectory balance: Improved credit assignment in GFlowNets | Code | 2
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions | Code | 2
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models | Code | 2
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation | Code | 2
Efficient Mixed Transformer for Single Image Super-Resolution | Code | 2
CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images | Code | 2
PAL: Proxy-Guided Black-Box Attack on Large Language Models | Code | 2
PyReason: Software for Open World Temporal Logic | Code | 2
mDPO: Conditional Preference Optimization for Multimodal Large Language Models | Code | 2
In-Context Matting | Code | 2
NTIRE 2025 Challenge on Image Super-Resolution (×4): Methods and Results | Code | 2
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models | Code | 2
COVINS-G: A Generic Back-end for Collaborative Visual-Inertial SLAM | Code | 2
Accelerating Transformer Pre-training with 2:4 Sparsity | Code | 2
Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack | Code | 2
RaNDT SLAM: Radar SLAM Based on Intensity-Augmented Normal Distributions Transform | Code | 2
Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark | Code | 2
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models | Code | 2
EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization | Code | 2
DualDn: Dual-domain Denoising via Differentiable ISP | Code | 2
An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models | Code | 2
MedUniSeg: 2D and 3D Medical Image Segmentation via a Prompt-driven Universal Model | Code | 2
EMR-Merging: Tuning-Free High-Performance Model Merging | Code | 2
Flow Annealed Importance Sampling Bootstrap | Code | 2
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | Code | 2
Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation | Code | 2
Multi-Robot Motion Planning with Diffusion Models | Code | 2
Random-Access Infinite Context Length for Transformers | Code | 2
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models | Code | 2
GlueStick: Robust Image Matching by Sticking Points and Lines Together | Code | 2
A Unified Image-Dense Annotation Generation Model for Underwater Scenes | Code | 2
Attacking Vision-Language Computer Agents via Pop-ups | Code | 2
Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models | Code | 2
OmniSearchSage: Multi-Task Multi-Entity Embeddings for Pinterest Search | Code | 2
ProGen2: Exploring the Boundaries of Protein Language Models | Code | 2
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models | Code | 2
DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation | Code | 2
Page 122 of 3547