SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 20601–20650 of 474278 papers

Title | Status | Hype
Reliable Probabilistic Human Trajectory Prediction for Autonomous Applications | Code | 1
To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimodal Large Language Models | Code | 1
HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution | Code | 1
Continual Learning in the Frequency Domain | Code | 1
ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents | Code | 1
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection | Code | 1
Mitigating Time Discretization Challenges with WeatherODE: A Sandwich Physics-Driven Neural ODE for Weather Forecasting | Code | 1
Learning Evolving Tools for Large Language Models | Code | 1
Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention | Code | 1
DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector | Code | 1
InstructG2I: Synthesizing Images from Multimodal Attributed Graphs | Code | 1
Does Spatial Cognition Emerge in Frontier Models? | Code | 1
Iterative Optimization Annotation Pipeline and ALSS-YOLO-Seg for Efficient Banana Plantation Segmentation in UAV Imagery | Code | 1
Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning | Code | 1
Rejecting Hallucinated State Targets during Planning | Code | 1
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering | Code | 1
BiC-MPPI: Goal-Pursuing, Sampling-Based Bidirectional Rollout Clustering Path Integral for Trajectory Optimization | Code | 1
LLM Embeddings Improve Test-time Adaptation to Tabular Y|X-Shifts | Code | 1
ING-VP: MLLMs cannot Play Easy Vision-based Games Yet | Code | 1
Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs | Code | 1
Toward Physics-guided Time Series Embedding | Code | 1
TinyLidarNet: 2D LiDAR-based End-to-End Deep Learning Model for F1TENTH Autonomous Racing | Code | 1
A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research | Code | 1
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning | Code | 1
Adaptive High-Frequency Transformer for Diverse Wildlife Re-Identification | Code | 1
Bridge the Points: Graph-based Few-shot Segment Anything Semantically | Code | 1
IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking | Code | 1
Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration | Code | 1
GlucoBench: Curated List of Continuous Glucose Monitoring Datasets with Prediction Benchmarks | Code | 1
Generative Artificial Intelligence (GAI) for Mobile Communications: A Diffusion Model Perspective | Code | 1
ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities | Code | 1
Evaluating Performance and Bias of Negative Sampling in Large-Scale Sequential Recommendation Models | Code | 1
Efficient Few-shot Learning for Multi-label Classification of Scientific Documents with Many Classes | Code | 1
FACMIC: Federated Adaptative CLIP Model for Medical Image Classification | Code | 1
NegMerge: Consensual Weight Negation for Strong Machine Unlearning | Code | 1
SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution | Code | 1
UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters | Code | 1
Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA | Code | 1
QT-DoG: Quantization-aware Training for Domain Generalization | Code | 1
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning | Code | 1
Multi-Behavioral Sequential Recommendation | Code | 1
Underwater Object Detection in the Era of Artificial Intelligence: Current, Challenge, and Future | Code | 1
Estimating the Number of HTTP/3 Responses in QUIC Using Deep Learning | Code | 1
Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition | Code | 1
Entering Real Social World! Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective | Code | 1
Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series | Code | 1
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment | Code | 1
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback | Code | 1
A mechanistically interpretable neural network for regulatory genomics | Code | 1
Tackling the Abstraction and Reasoning Corpus with Vision Transformers: the Importance of 2D Representation, Positions, and Objects | Code | 1