SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 16901-16950 of 474278 papers

Title | Status | Hype
SymmCompletion: High-Fidelity and High-Consistency Point Cloud Completion with Symmetry Guidance | Code | 1
Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization | Code | 1
DeLoRA: Decoupling Angles and Strength in Low-rank Adaptation | Code | 1
Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving | Code | 1
Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation | Code | 1
Reinforcement Learning-based Self-adaptive Differential Evolution through Automated Landscape Feature Learning | Code | 1
GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration | Code | 1
Accelerating and enhancing thermodynamic simulations of electrochemical interfaces | Code | 1
BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors | Code | 1
ParsiPy: NLP Toolkit for Historical Persian Texts in Python | Code | 1
MEPNet: Medical Entity-balanced Prompting Network for Brain CT Report Generation | Code | 1
GOAL: Global-local Object Alignment Learning | Code | 1
V2P-Bench: Evaluating Video-Language Understanding with Visual Prompts for Better Human-Model Interaction | Code | 1
Reducing Class-wise Confusion for Incremental Learning with Disentangled Manifolds | Code | 1
HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving | Code | 1
SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion | Code | 1
ColabSfM: Collaborative Structure-from-Motion by Point Cloud Registration | Code | 1
Exploring a Principled Framework for Deep Subspace Clustering | Code | 1
Offline Model-Based Optimization: Comprehensive Review | Code | 1
Auto-Regressive Diffusion for Generating 3D Human-Object Interactions | Code | 1
HCAST: Human-Calibrated Autonomy Software Tasks | Code | 1
A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications | Code | 1
ProDehaze: Prompting Diffusion Models Toward Faithful Image Dehazing | Code | 1
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs | Code | 1
Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation | Code | 1
Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval | Code | 1
Unsupervised Joint Learning of Optical Flow and Intensity with Event Cameras | Code | 1
DiTEC-WDN: A Large-Scale Dataset of Hydraulic Scenarios across Multiple Water Distribution Networks | Code | 1
Local Projections or VARs? A Primer for Macroeconomists | Code | 1
MTBench: A Multimodal Time Series Benchmark for Temporal Reasoning and Question Answering | Code | 1
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models | Code | 1
MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving | Code | 1
SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging | Code | 1
FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields | Code | 1
LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language | Code | 1
HiFi-Stream: Streaming Speech Enhancement with Generative Adversarial Networks | Code | 1
Superpowering Open-Vocabulary Object Detectors for X-ray Vision | Code | 1
Tuning LLMs by RAG Principles: Towards LLM-native Memory | Code | 1
MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures | Code | 1
Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions | Code | 1
Enhancing Close-up Novel View Synthesis via Pseudo-labeling | Code | 1
OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection | Code | 1
ATOM: A Framework of Detecting Query-Based Model Extraction Attacks for Graph Neural Networks | Code | 1
Aligning Text-to-Music Evaluation with Human Preferences | Code | 1
The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination | Code | 1
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving | Code | 1
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding | Code | 1
Efficiently Vectorized MCMC on Modern Accelerators | Code | 1
CausalCLIPSeg: Unlocking CLIP's Potential in Referring Medical Image Segmentation with Causal Intervention | Code | 1
Neural Combinatorial Optimization for Real-World Routing | Code | 1