SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 17351–17400 of 474278 papers

Title | Status | Hype
Deepfake-Eval-2024: A Multi-Modal In-the-Wild Benchmark of Deepfakes Circulated in 2024 | Code | 1
Unveiling the Potential of Segment Anything Model 2 for RGB-Thermal Semantic Segmentation with Language Guidance | Code | 1
DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting | Code | 1
Generating Novel Brain Morphology by Deforming Learned Templates | Code | 1
RACNN: Residual Attention Convolutional Neural Network for Near-Field Channel Estimation in 6G Wireless Communications | Code | 1
Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression | Code | 1
Federated nnU-Net for Privacy-Preserving Medical Image Segmentation | Code | 1
Disentangled Knowledge Tracing for Alleviating Cognitive Bias | Code | 1
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content | Code | 1
Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual | Code | 1
SVDC: Consistent Direct Time-of-Flight Video Depth Completion with Frequency Selective Fusion | Code | 1
Linear Representations of Political Perspective Emerge in Large Language Models | Code | 1
ToLo: A Two-Stage, Training-Free Layout-To-Image Generation Framework For High-Overlap Layouts | Code | 1
InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization | Code | 1
How simple can you go? An off-the-shelf transformer approach to molecular dynamics | Code | 1
KoWit-24: A Richly Annotated Dataset of Wordplay in News Headlines | Code | 1
MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting | Code | 1
What do Large Language Models Say About Animals? Investigating Risks of Animal Harm in Generated Text | Code | 1
Nature-Inspired Population-Based Evolution of Large Language Models | Code | 1
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom | Code | 1
When Can You Get Away with Low Memory Adam? | Code | 1
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat | Code | 1
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs | Code | 1
From Claims to Evidence: A Unified Framework and Critical Analysis of CNN vs. Transformer vs. Mamba in Medical Image Segmentation | Code | 1
One-Shot Affordance Grounding of Deformable Objects in Egocentric Organizing Scenes | Code | 1
Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation | Code | 1
m4: A Learned Flow-level Network Simulator | Code | 1
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval | Code | 1
AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses | Code | 1
POPGym Arcade: Parallel Pixelated POMDPs | Code | 1
Convex Hull-based Algebraic Constraint for Visual Quadric SLAM | Code | 1
One ruler to measure them all: Benchmarking multilingual long-context language models | Code | 1
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents | Code | 1
Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection | Code | 1
Measuring the Validity of Clustering Validation Datasets | Code | 1
A General Neural Network Potential for Energetic Materials with C, H, N, and O elements | Code | 1
Trajectory-Class-Aware Multi-Agent Reinforcement Learning | Code | 1
SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction | Code | 1
Improve Representation for Imbalanced Regression through Geometric Constraints | Code | 1
Underdamped Diffusion Bridges with Applications to Sampling | Code | 1
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think | Code | 1
Delving into Out-of-Distribution Detection with Medical Vision-Language Models | Code | 1
Molecule Generation for Target Protein Binding with Hierarchical Consistency Diffusion Model | Code | 1
DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging | Code | 1
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting | Code | 1
STAR-Edge: Structure-aware Local Spherical Curve Representation for Thin-walled Edge Extraction from Unstructured Point Clouds | Code | 1
GPIoT: Tailoring Small Language Models for IoT Program Synthesis and Development | Code | 1
LesionDiffusion: Towards Text-controlled General Lesion Synthesis | Code | 1
Dur360BEV: A Real-world 360-degree Single Camera Dataset and Benchmark for Bird-Eye View Mapping in Autonomous Driving | Code | 1
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation | Code | 1