SOTAVerified

Benchmarking

Papers

Showing 2026–2050 of 5548 papers

Title | Status | Hype
Question-Answering Dense Video Events | Code | 0
Aux-Drop: Handling Haphazard Inputs in Online Learning Using Auxiliary Dropouts | Code | 0
Benchmarking White Blood Cell Classification Under Domain Shift | Code | 0
IJCB 2022 Mobile Behavioral Biometrics Competition (MobileB2C) | Code | 0
Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization | Code | 0
IHCV: Discovery of Hidden Time-Dependent Control Variables in Non-Linear Dynamical Systems | Code | 0
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions | Code | 0
Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning | Code | 0
Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Code | 0
Benchmarking GPT-4 against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels | Code | 0
RCP-Bench: Benchmarking Robustness for Collaborative Perception Under Diverse Corruptions | Code | 0
IdeaBench: Benchmarking Large Language Models for Research Idea Generation | Code | 0
Identifying Money Laundering Subgraphs on the Blockchain | Code | 0
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification | Code | 0
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Code | 0
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible | Code | 0
Benchmarking Unsupervised Strategies for Anomaly Detection in Multivariate Time Series | Code | 0
Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn | Code | 0
Benchmarking Unsupervised Online IDS for Masquerade Attacks in CAN | Code | 0
Deep Metric Learning Meets Deep Clustering: An Novel Unsupervised Approach for Feature Embedding | Code | 0
ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness | Code | 0
AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels | Code | 0
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance | Code | 0
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction | Code | 0
Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified