SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 17801–17850 of 474,278 papers

Title | Status | Hype
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations | Code | 1
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding | Code | 1
MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models | Code | 1
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models | Code | 1
DEEPER Insight into Your User: Directed Persona Refinement for Dynamic Persona Modeling | Code | 1
Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning | Code | 1
Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy | Code | 1
LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization | Code | 1
A Comprehensive Survey of Deep Learning for Multivariate Time Series Forecasting: A Channel Strategy Perspective | Code | 1
Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model | Code | 1
BASE-SQL: A powerful open source Text-To-SQL baseline approach | Code | 1
CoPEFT: Fast Adaptation Framework for Multi-Agent Collaborative Perception with Parameter-Efficient Fine-Tuning | Code | 1
Reduced Order Modeling with Shallow Recurrent Decoder Networks | Code | 1
CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs | Code | 1
BalanceBenchmark: A Survey for Imbalanced Learning | Code | 1
Forget the Data and Fine-Tuning! Just Fold the Network to Compress | Code | 1
Can Large Language Model Agents Balance Energy Systems? | Code | 1
SegX: Improving Interpretability of Clinical Image Diagnosis with Segmentation-based Enhancement | Code | 1
Manual2Skill: Learning to Read Manuals and Acquire Robotic Skills for Furniture Assembly Using Vision-Language Models | Code | 1
QMaxViT-Unet+: A Query-Based MaxViT-Unet with Edge Enhancement for Scribble-Supervised Segmentation of Medical Images | Code | 1
Classifier-free Guidance with Adaptive Scaling | Code | 1
Evaluating and Improving Graph-based Explanation Methods for Multi-Agent Coordination | Code | 1
AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting | Code | 1
A Lightweight and Effective Image Tampering Localization Network with Vision Mamba | Code | 1
X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Compromising Usability | Code | 1
MITO: Enabling Non-Line-of-Sight Perception using Millimeter-waves through Real-World Datasets and Simulation Tools | Code | 1
A synergistic CNN-transformer network with pooling attention fusion for hyperspectral image classification | Code | 1
CISSIR: Beam Codebooks with Self-Interference Reduction Guarantees for Integrated Sensing and Communication Beyond 5G | Code | 1
Automated Muscle and Fat Segmentation in Computed Tomography for Comprehensive Body Composition Analysis | Code | 1
Large Images are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting | Code | 1
Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection | Code | 1
A Contextual-Aware Position Encoding for Sequential Recommendation | Code | 1
Rethinking Evaluation Metrics for Grammatical Error Correction: Why Use a Different Evaluation Process than Human? | Code | 1
Medicine on the Edge: Comparative Performance Analysis of On-Device LLMs for Clinical Reasoning | Code | 1
QueryAttack: Jailbreaking Aligned Large Language Models Using Structured Non-natural Query Language | Code | 1
Inverse Design with Dynamic Mode Decomposition | Code | 1
SQ-GAN: Semantic Image Communications Using Masked Vector Quantization | Code | 1
You Do Not Fully Utilize Transformer's Representation Capacity | Code | 1
AnomalyGFM: Graph Foundation Model for Zero/Few-shot Anomaly Detection | Code | 1
Biologically Plausible Brain Graph Transformer | Code | 1
Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs | Code | 1
Reevaluating Policy Gradient Methods for Imperfect-Information Games | Code | 1
LOB-Bench: Benchmarking Generative AI for Finance -- an Application to Limit Order Book Data | Code | 1
GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis | Code | 1
PTZ-Calib: Robust Pan-Tilt-Zoom Camera Calibration | Code | 1
Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative | Code | 1
Spiking Neural Networks for Temporal Processing: Status Quo and Future Prospects | Code | 1
A Deep Inverse-Mapping Model for a Flapping Robotic Wing | Code | 1
Enhancing the Utility of Higher-Order Information in Relational Learning | Code | 1
MC2SleepNet: Multi-modal Cross-masking with Contrastive Learning for Sleep Stage Classification | Code | 1