SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9001 to 9050 of 661,570 papers

Title | Status | Hype
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs | Code | 2
Graphic Design with Large Multimodal Model | Code | 2
An empirical study of LLaMA3 quantization: from LLMs to MLLMs | Code | 2
SwinFuSR: an image fusion-inspired model for RGB-guided thermal image super-resolution | Code | 2
FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization | Code | 2
Bracketing Image Restoration and Enhancement with High-Low Frequency Decomposition | Code | 2
How to Encode Domain Information in Relation Classification | Code | 2
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs | Code | 2
Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation | Code | 2
Mixture of LoRA Experts | Code | 2
Retrieval-Augmented Generation-based Relation Extraction | Code | 2
Vim4Path: Self-Supervised Vision Mamba for Histopathology Images | Code | 2
HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding | Code | 2
Movie101v2: Improved Movie Narration Benchmark | Code | 2
FakeBench: Probing Explainable Fake Image Detection via Large Multimodal Models | Code | 2
Augmented Object Intelligence with XR-Objects | Code | 2
Large Language Models for Next Point-of-Interest Recommendation | Code | 2
decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points | Code | 2
MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model | Code | 2
DeeperHistReg: Robust Whole Slide Images Registration Framework | Code | 2
Linearly-evolved Transformer for Pan-sharpening | Code | 2
Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration | Code | 2
FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation | Code | 2
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency | Code | 2
Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References | Code | 2
MAexp: A Generic Platform for RL-based Multi-Agent Exploration | Code | 2
MoVA: Adapting Mixture of Vision Experts to Multimodal Context | Code | 2
Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework | Code | 2
Token-level Direct Preference Optimization | Code | 2
Partial-to-Partial Shape Matching with Geometric Consistency | Code | 2
Point-In-Context: Understanding Point Cloud via In-Context Learning | Code | 2
GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction | Code | 2
SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation | Code | 2
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM | Code | 2
Model-free quantification of completeness, uncertainties, and outliers in atomistic machine learning using information theory | Code | 2
MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space | Code | 2
LongEmbed: Extending Embedding Models for Long Context Retrieval | Code | 2
6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction | Code | 2
Partial Large Kernel CNNs for Efficient Super-Resolution | Code | 2
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Code | 2
Transformer tricks: Removing weights for skipless transformers | Code | 2
Introducing v0.5 of the AI Safety Benchmark from MLCommons | Code | 2
ShadowRefiner: Towards Mask-free Shadow Removal via Fast Fourier Transformer | Code | 2
Aligning language models with human preferences | Code | 2
Physics-informed active learning for accelerating quantum chemical simulations | Code | 2
RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models | Code | 2
VBR: A Vision Benchmark in Rome | Code | 2
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution | Code | 2
Behavior Alignment: A New Perspective of Evaluating LLM-based Conversational Recommender Systems | Code | 2
Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System | Code | 2
Page 181 of 13,232