SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 13651-13700 of 177340 papers

Title | Status | Hype
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Code | 2
Interactive Humanoid: Online Full-Body Motion Reaction Synthesis with Social Affordance Canonicalization and Forecasting | Code | 2
HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition | Code | 2
System 2 Attention (is something you might need too) | Code | 2
BiM-VFI: directional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions | Code | 2
ChangeCLIP: Remote sensing change detection with multimodal vision-language representation learning | Code | 2
A Simple and Model-Free Path Filtering Algorithm for Smoothing and Accuracy | Code | 2
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection | Code | 2
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer | Code | 2
RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images and A Benchmark | Code | 2
FLOWR: Flow Matching for Structure-Aware De Novo, Interaction- and Fragment-Based Ligand Generation | Code | 2
Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows | Code | 2
Refine3DNet: Scaling Precision in 3D Object Reconstruction from Multi-View RGB Images using Attention | Code | 2
Few-Shot Bearing Fault Diagnosis Via Ensembling Transformer-Based Model With Mahalanobis Distance Metric Learning From Multiscale Features | Code | 2
DGFont++: Robust Deformable Generative Networks for Unsupervised Font Generation | Code | 2
YOLOv5-6D: Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries | Code | 2
Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs | Code | 2
Analysing the Residual Stream of Language Models Under Knowledge Conflicts | Code | 2
JAILJUDGE: A Comprehensive Jailbreak Judge Benchmark with Multi-Agent Enhanced Explanation Evaluation Framework | Code | 2
Hypergraph Neural Networks | Code | 2
Peeling Back the Layers: An In-Depth Evaluation of Encoder Architectures in Neural News Recommenders | Code | 2
Efficient Non-stationary Online Learning by Wavelets with Applications to Online Distribution Shift Adaptation | Code | 2
ViSpeak: Visual Instruction Feedback in Streaming Videos | Code | 2
SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery | Code | 2
Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model | Code | 2
Detection Transformer with Stable Matching | Code | 2
Chain-of-Thought Reasoning Without Prompting | Code | 2
Domain Adaptation with a Single Vision-Language Embedding | Code | 2
An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | Code | 2
HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation | Code | 2
Prototype-based Cross-Modal Object Tracking | Code | 2
BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer | Code | 2
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models | Code | 2
1st Place Solution of Multiview Egocentric Hand Tracking Challenge ECCV2024 | Code | 2
C^2LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation | Code | 2
Region Rebalance for Long-Tailed Semantic Segmentation | Code | 2
NLLB-CLIP -- train performant multilingual image retrieval model on a budget | Code | 2
TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis | Code | 2
Gaussian Processes for Big Data | Code | 2
DetGPT: Detect What You Need via Reasoning | Code | 2
HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling | Code | 2
GAIA: a benchmark for General AI Assistants | Code | 2
WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects | Code | 2
Seeing through Satellite Images at Street Views | Code | 2
Large Language Models are In-Context Molecule Learners | Code | 2
Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models | Code | 2
Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent | Code | 2
Deduplicating Training Data Mitigates Privacy Risks in Language Models | Code | 2
RandAugment: Practical automated data augmentation with a reduced search space | Code | 2
Mamba-R: Vision Mamba ALSO Needs Registers | Code | 2