SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5951-6000 of 661,570 papers

Title | Status | Hype
Mixed-Curvature Decision Trees and Random Forests | Code | 2
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion | Code | 2
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding | Code | 2
RecFlow: An Industrial Full Flow Recommendation Dataset | Code | 2
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization | Code | 2
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware | Code | 2
PerAct2: Benchmarking and Learning for Robotic Bimanual Manipulation Tasks | Code | 2
GPQA: A Graduate-Level Google-Proof Q&A Benchmark | Code | 2
PruneVid: Visual Token Pruning for Efficient Video Large Language Models | Code | 2
Voice Conversion With Just Nearest Neighbors | Code | 2
Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers | Code | 2
DreamLLM: Synergistic Multimodal Comprehension and Creation | Code | 2
On-Device Domain Generalization | Code | 2
Dynamic Early Exit in Reasoning Models | Code | 2
Medical Vision Generalist: Unifying Medical Imaging Tasks in Context | Code | 2
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark | Code | 2
Revisiting Adversarial Training under Long-Tailed Distributions | Code | 2
Many-Shot In-Context Learning in Multimodal Foundation Models | Code | 2
Towards Unified Keyframe Propagation Models | Code | 2
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks | Code | 2
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents | Code | 2
A Versatile Framework for Multi-scene Person Re-identification | Code | 2
Measuring Massive Multitask Language Understanding | Code | 2
CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games | Code | 2
Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning | Code | 2
Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer | Code | 2
YOLO-UniOW: Efficient Universal Open-World Object Detection | Code | 2
Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction | Code | 2
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems | Code | 2
CLRerNet: Improving Confidence of Lane Detection with LaneIoU | Code | 2
Do we actually understand the impact of renewables on electricity prices? A causal inference approach | Code | 2
Transformer Circuit Faithfulness Metrics are not Robust | Code | 2
Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement | Code | 2
COVID-19 Image Data Collection: Prospective Predictions Are the Future | Code | 2
BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages | Code | 2
Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models | Code | 2
Source-free Subject Adaptation for EEG-based Visual Recognition | Code | 2
HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States | Code | 2
Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy | Code | 2
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation | Code | 2
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | Code | 2
Order Constraints in Optimal Transport | Code | 2
Real-time Scene Text Detection with Differentiable Binarization | Code | 2
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Code | 2
VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment | Code | 2
Hopular: Modern Hopfield Networks for Tabular Data | Code | 2
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes | Code | 2
Improving the Training of Rectified Flows | Code | 2
A Systematic Study of Joint Representation Learning on Protein Sequences and Structures | Code | 2
Evaluating the Performance of Large Language Models on GAOKAO Benchmark | Code | 2
Page 120 of 13,232