SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 18751 to 18800 of 474278 papers

Title | Status | Hype
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner | Code | 1
Beyond Gradient Averaging in Parallel Optimization: Improved Robustness through Gradient Agreement Filtering | Code | 1
Extract Free Dense Misalignment from CLIP | Code | 1
Towards Modality Generalization: A Benchmark and Prospective Analysis | Code | 1
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating | Code | 1
An Automatic Graph Construction Framework based on Large Language Models for Recommendation | Code | 1
Accelerating AIGC Services with Latent Action Diffusion Scheduling in Edge Networks | Code | 1
Underwater Image Restoration via Polymorphic Large Kernel CNNs | Code | 1
Learning to engineer protein flexibility | Code | 1
VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis | Code | 1
Improving Pareto Set Learning for Expensive Multi-objective Optimization via Stein Variational Hypernetworks | Code | 1
Towards Unsupervised Model Selection for Domain Adaptive Object Detection | Code | 1
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Code | 1
The Superposition of Diffusion Models Using the Itô Density Estimator | Code | 1
Neural-MCRL: Neural Multimodal Contrastive Representation Learning for EEG-based Visual Decoding | Code | 1
CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models | Code | 1
QTSeg: A Query Token-Based Architecture for Efficient 2D Medical Image Segmentation | Code | 1
Brain-to-Text Benchmark '24: Lessons Learned | Code | 1
Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing | Code | 1
A Survey on LLM-based Multi-Agent System: Recent Advances and New Frontiers in Application | Code | 1
GraphHash: Graph Clustering Enables Parameter Efficiency in Recommender Systems | Code | 1
LegalAgentBench: Evaluating LLM Agents in Legal Domain | Code | 1
BrainMAP: Learning Multiple Activation Pathways in Brain Networks | Code | 1
Multi-Modal Grounded Planning and Efficient Replanning For Learning Embodied Agents with A Few Examples | Code | 1
Progressive Boundary Guided Anomaly Synthesis for Industrial Anomaly Detection | Code | 1
On the Generalization Ability of Machine-Generated Text Detectors | Code | 1
Hierarchical Vector Quantization for Unsupervised Action Segmentation | Code | 1
Efficient fine-tuning methodology of text embedding models for information retrieval: contrastive learning penalty (clp) | Code | 1
Kernel-Aware Graph Prompt Learning for Few-Shot Anomaly Detection | Code | 1
Unity is Strength: Unifying Convolutional and Transformeral Features for Better Person Re-Identification | Code | 1
AFANet: Adaptive Frequency-Aware Network for Weakly-Supervised Few-Shot Semantic Segmentation | Code | 1
Multimodal Learning with Uncertainty Quantification based on Discounted Belief Fusion | Code | 1
CodeV: Issue Resolving with Visual Data | Code | 1
Neural Spatial-Temporal Tensor Representation for Infrared Small Target Detection | Code | 1
Uncertainty-Participation Context Consistency Learning for Semi-supervised Semantic Segmentation | Code | 1
SMAC-Hard: Enabling Mixed Opponent Strategy Script and Self-play on SMAC | Code | 1
LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context | Code | 1
VarAD: Lightweight High-Resolution Image Anomaly Detection via Visual Autoregressive Modeling | Code | 1
Knowledge Editing through Chain-of-Thought | Code | 1
WildPPG: A Real-World PPG Dataset of Long Continuous Recordings | Code | 1
Seamless Detection: Unifying Salient Object Detection and Camouflaged Object Detection | Code | 1
Optimal signal transmission and timescale diversity in a model of human brain operating near criticality | Code | 1
Empirical evaluation of normalizing flows in Markov Chain Monte Carlo | Code | 1
Learning to Generate Gradients for Test-Time Adaptation via Test-Time Training Layers | Code | 1
A Conditional Diffusion Model for Electrical Impedance Tomography Image Reconstruction | Code | 1
SAIL: Sample-Centric In-Context Learning for Document Information Extraction | Code | 1
LLM-Powered User Simulator for Recommender System | Code | 1
Grams: Gradient Descent with Adaptive Momentum Scaling | Code | 1
Interactive Classification Metrics: A graphical application to build robust intuition for classification model evaluation | Code | 1
Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model | Code | 1
Page 376 of 9486