SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 18851–18900 of 474,278 papers

| Title | Status | Hype |
| --- | --- | --- |
| Exploiting sparse structures and synergy designs to advance situational awareness of electrical power grid | Code | 1 |
| MRWeb: An Exploration of Generating Multi-Page Resource-Aware Web Code from UI Designs | Code | 1 |
| Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion | Code | 1 |
| Multi-Level Embedding and Alignment Network with Consistency and Invariance Learning for Cross-View Geo-Localization | Code | 1 |
| STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning | Code | 1 |
| RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response | Code | 1 |
| DS^2-ABSA: Dual-Stream Data Synthesis with Label Refinement for Few-Shot Aspect-Based Sentiment Analysis | Code | 1 |
| Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization | Code | 1 |
| Large-scale School Mapping using Weakly Supervised Deep Learning for Universal School Connectivity | Code | 1 |
| TDCNet: Transparent Objects Depth Completion with CNN-Transformer Dual-Branch Parallel Network | Code | 1 |
| Cirbo: A New Tool for Boolean Circuit Analysis and Synthesis | Code | 1 |
| PhotoHolmes: a Python library for forgery detection in digital images | Code | 1 |
| Alignment-Free RGB-T Salient Object Detection: A Large-scale Dataset and Progressive Correlation Network | Code | 1 |
| On Verbalized Confidence Scores for LLMs | Code | 1 |
| PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation | Code | 1 |
| Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models | Code | 1 |
| CwA-T: A Channelwise AutoEncoder with Transformer for EEG Abnormality Detection | Code | 1 |
| TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation | Code | 1 |
| Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment | Code | 1 |
| Efficient Self-Supervised Video Hashing with Selective State Spaces | Code | 1 |
| Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM | Code | 1 |
| Automatic Spectral Calibration of Hyperspectral Images: Method, Dataset and Benchmark | Code | 1 |
| Spectrum-based Modality Representation Fusion Graph Convolutional Network for Multimodal Recommendation | Code | 1 |
| MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification | Code | 1 |
| Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation | Code | 1 |
| Eliciting Causal Abilities in Large Language Models for Reasoning Tasks | Code | 1 |
| ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis | Code | 1 |
| Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition | Code | 1 |
| Tokenphormer: Structure-aware Multi-token Graph Transformer for Node Classification | Code | 1 |
| FaultExplainer: Leveraging Large Language Models for Interpretable Fault Detection and Diagnosis | Code | 1 |
| ConfliBERT: A Language Model for Political Conflict | Code | 1 |
| HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven Harmony Search and Genetic Algorithm Using LLMs | Code | 1 |
| Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification | Code | 1 |
| MambaLCT: Boosting Tracking via Long-term Context State Space Model | Code | 1 |
| MedCoT: Medical Chain of Thought via Hierarchical Expert | Code | 1 |
| RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment | Code | 1 |
| Generating Long-form Story Using Dynamic Hierarchical Outlining with Memory-Enhancement | Code | 1 |
| GaraMoSt: Parallel Multi-Granularity Motion and Structural Modeling for Efficient Multi-Frame Interpolation in DSA Images | Code | 1 |
| jinns: a JAX Library for Physics-Informed Neural Networks | Code | 1 |
| Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production | Code | 1 |
| Distribution Shifts at Scale: Out-of-distribution Detection in Earth Observation | Code | 1 |
| 3D Registration in 30 Years: A Survey | Code | 1 |
| Typhoon 2: A Family of Open Text and Multimodal Thai Large Language Models | Code | 1 |
| Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning | Code | 1 |
| SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation | Code | 1 |
| ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals | Code | 1 |
| ConDo: Continual Domain Expansion for Absolute Pose Regression | Code | 1 |
| Model Decides How to Tokenize: Adaptive DNA Sequence Tokenization with MxDNA | Code | 1 |
| Efficient Language-instructed Skill Acquisition via Reward-Policy Co-Evolution | Code | 1 |
| Pre-training a Density-Aware Pose Transformer for Robust LiDAR-based 3D Human Pose Estimation | Code | 1 |
Page 378 of 9486