SOTAVerified

Benchmarking

Papers

Showing 751–775 of 5548 papers

Title | Status | Hype
M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | Code | 1
Benchmarking Classical and Learning-Based Multibeam Point Cloud Registration | Code | 1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Code | 1
BeHonest: Benchmarking Honesty in Large Language Models | Code | 1
ClearPose: Large-scale Transparent Object Dataset and Benchmark | Code | 1
AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Code | 1
CHILI: Chemically-Informed Large-scale Inorganic Nanomaterials Dataset for Advancing Graph Machine Learning | Code | 1
A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain | Code | 1
CIBench: Evaluating Your LLMs with a Code Interpreter Plugin | Code | 1
Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study | Code | 1
A multi-schematic classifier-independent oversampling approach for imbalanced datasets | Code | 1
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset | Code | 1
End-to-end Emotion-Cause Pair Extraction via Learning to Link | Code | 1
Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization | Code | 1
Enhancing Biomedical Relation Extraction with Directionality | Code | 1
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models | Code | 1
Geometric Deep Learning for Structure-Based Drug Design: A Survey | Code | 1
A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks | Code | 1
CheXphoto: 10,000+ Photos and Transformations of Chest X-rays for Benchmarking Deep Learning Robustness | Code | 1
CIDEr: Consensus-based Image Description Evaluation | Code | 1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models | Code | 1
Evaluating Adversarial Attacks on ImageNet: A Reality Check on Misclassification Classes | Code | 1
A Systematic Benchmarking Analysis of Transfer Learning for Medical Image Analysis | Code | 1
Evaluating histopathology transfer learning with ChampKit | Code | 1
On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified