SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6901-6950 of 177340 papers

Title | Status | Hype
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback | Code | 2
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain | Code | 2
Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs | Code | 2
Evaluating Quantized Large Language Models | Code | 2
Edu-ConvoKit: An Open-Source Library for Education Conversation Data | Code | 2
Calibrated Self-Rewarding Vision Language Models | Code | 2
PERT: Pre-training BERT with Permuted Language Model | Code | 2
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement | Code | 2
Training Diffusion Models with Reinforcement Learning | Code | 2
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction | Code | 2
All in One: Exploring Unified Video-Language Pre-training | Code | 2
A Survey on Multimodal Large Language Models for Autonomous Driving | Code | 2
Towards A Unified Conformer Structure: from ASR to ASV Task | Code | 2
DocPrompting: Generating Code by Retrieving the Docs | Code | 2
AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation | Code | 2
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents | Code | 2
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives | Code | 2
Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models | Code | 2
TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs | Code | 2
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs | Code | 2
Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study | Code | 2
REEF: Representation Encoding Fingerprints for Large Language Models | Code | 2
Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation | Code | 2
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model | Code | 2
Large language models surpass human experts in predicting neuroscience results | Code | 2
Owl-1: Omni World Model for Consistent Long Video Generation | Code | 2
Diving Deeper Into Pedestrian Behavior Understanding: Intention Estimation, Action Prediction, and Event Risk Assessment | Code | 2
K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization | Code | 2
GenSim: A General Social Simulation Platform with Large Language Model based Agents | Code | 2
Metric Flow Matching for Smooth Interpolations on the Data Manifold | Code | 2
Harmonizer: Learning to Perform White-Box Image and Video Harmonization | Code | 2
Android in the Zoo: Chain-of-Action-Thought for GUI Agents | Code | 2
Knowledge Circuits in Pretrained Transformers | Code | 2
PyMIC: A deep learning toolkit for annotation-efficient medical image segmentation | Code | 2
PHemoNet: A Multimodal Network for Physiological Signals | Code | 2
From Sparse to Soft Mixtures of Experts | Code | 2
ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text | Code | 2
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset | Code | 2
DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution | Code | 2
nuScenes: A multimodal dataset for autonomous driving | Code | 2
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection | Code | 2
Shape, Light, and Material Decomposition from Images using Monte Carlo Rendering and Denoising | Code | 2
Video Prediction Transformers without Recurrence or Convolution | Code | 2
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning | Code | 2
DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering | Code | 2
PoseScript: Linking 3D Human Poses and Natural Language | Code | 2
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations | Code | 2
Satellite Image Time Series Semantic Change Detection: Novel Architecture and Analysis of Domain Shift | Code | 2
LLaMEA: A Large Language Model Evolutionary Algorithm for Automatically Generating Metaheuristics | Code | 2
Unsupervised Universal Image Segmentation | Code | 2