SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8851-8900 of 661,570 papers

Title | Status | Hype
SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion | Code | 2
TLOB: A Novel Transformer Model with Dual Attention for Price Trend Prediction with Limit Order Book Data | Code | 2
SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation | Code | 2
Text-Driven Image Editing via Learnable Regions | Code | 2
LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers | Code | 2
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models | Code | 2
RevColV2: Exploring Disentangled Representations in Masked Image Modeling | Code | 2
Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier | Code | 2
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild | Code | 2
SAM-Assisted Remote Sensing Imagery Semantic Segmentation with Object and Boundary Constraints | Code | 2
Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey | Code | 2
HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising | Code | 2
Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation | Code | 2
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data | Code | 2
Training-Free Text-Guided Image Editing with Visual Autoregressive Model | Code | 2
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining | Code | 2
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models | Code | 2
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models | Code | 2
Visual Point Cloud Forecasting enables Scalable Autonomous Driving | Code | 2
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation | Code | 2
Malla: Demystifying Real-world Large Language Model Integrated Malicious Services | Code | 2
Multi-Modal Representation Learning for Molecular Property Prediction: Sequence, Graph, Geometry | Code | 2
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism | Code | 2
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI | Code | 2
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection | Code | 2
TTS-GAN: A Transformer-based Time-Series Generative Adversarial Network | Code | 2
Taming Data and Transformers for Audio Generation | Code | 2
Continual Test-Time Domain Adaptation | Code | 2
InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists | Code | 2
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models | Code | 2
SimpleClick: Interactive Image Segmentation with Simple Vision Transformers | Code | 2
Golden Cudgel Network for Real-Time Semantic Segmentation | Code | 2
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning | Code | 2
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding | Code | 2
PALO: A Polyglot Large Multimodal Model for 5B People | Code | 2
APEBench: A Benchmark for Autoregressive Neural Emulators of PDEs | Code | 2
Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation | Code | 2
RecDiffusion: Rectangling for Image Stitching with Diffusion Models | Code | 2
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model | Code | 2
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest | Code | 2
Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework | Code | 2
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates | Code | 2
MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation | Code | 2
Deep PCB To COCO Convertor | Code | 2
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System | Code | 2
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions | Code | 2
AEM: Attention Entropy Maximization for Multiple Instance Learning based Whole Slide Image Classification | Code | 2
Blockwise Parallel Transformers for Large Context Models | Code | 2
Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation | Code | 2
Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models | Code | 2
Page 178 of 13,232