Benchmarking

Papers

Showing 3751–3800 of 5548 papers

Title	Date	Tasks	Status
Unsupervised Spectral Demosaicing with Lightweight Spectral Attention Networks	Jul 5, 2023	Benchmarking, Demosaicking	Unverified
OpenSiteRec: An Open Dataset for Site Recommendation	Jul 3, 2023	Benchmarking, Information Retrieval	Unverified
A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms	Jul 3, 2023	Benchmarking, Camera Calibration	Unverified
Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity	Jul 2, 2023	Benchmarking, Data Integration	Unverified
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency	Jul 1, 2023	Benchmarking, Data Augmentation	Unverified
InstructEval: Systematic Evaluation of Instruction Selection Methods	Jul 1, 2023	Benchmarking, In-Context Learning	Unverified
Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors	Jun 29, 2023	Benchmarking	Unverified
Learning Environment Models with Continuous Stochastic Dynamics	Jun 29, 2023	Acrobot, Benchmarking	Unverified
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms	Jun 29, 2023	Benchmarking, Robot Navigation	Unverified
Benchmarking Large Language Model Capabilities for Conditional Generation	Jun 29, 2023	Benchmarking, Few-Shot Learning	Unverified
Emotion Analysis of Tweets Banning Education in Afghanistan	Jun 28, 2023	Benchmarking, Emotion Classification	Unverified
Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity	Jun 28, 2023	Benchmarking, Image Captioning	Unverified
Effective Transfer of Pretrained Large Visual Model for Fabric Defect Segmentation via Specifc Knowledge Injection	Jun 28, 2023	Benchmarking, Diversity	Unverified
Benchmarking Stroke Forecasting with Stroke-Level Badminton Dataset	Jun 27, 2023	Benchmarking	Unverified
Enhancing Navigation Benchmarking and Perception Data Generation for Row-based Crops in Simulation	Jun 27, 2023	Autonomous Navigation, Benchmarking	Unverified
Pulse Shape-Aided Multipath Delay Estimation for Fine-Grained WiFi Sensing	Jun 27, 2023	Benchmarking	Unverified
Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool	Jun 27, 2023	Benchmarking, Language Modeling	Unverified
Hybrid Precoder and Combiner Designs for Decentralized Parameter Estimation in mmWave MIMO Wireless Sensor Networks	Jun 25, 2023	Benchmarking, parameter estimation	Unverified
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards	Jun 25, 2023	Benchmarking, Contrastive Learning	Unverified
My Boli: Code-mixed Marathi-English Corpora, Pretrained Language Models and Evaluation Benchmarks	Jun 24, 2023	Benchmarking, Hate Speech Detection	Unverified
OptIForest: Optimal Isolation Forest for Anomaly Detection	Jun 22, 2023	Anomaly Detection, Benchmarking	Code Available
On Evaluation of Document Classification using RVL-CDIP	Jun 21, 2023	Benchmarking, Classification	Unverified
Evaluation of Popular XAI Applied to Clinical Prediction Models: Can They be Trusted?	Jun 21, 2023	Benchmarking, Explainable artificial intelligence	Unverified
A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking	Jun 21, 2023	Adversarial Robustness, Benchmarking	Unverified
On-orbit model training for satellite imagery with label proportions	Jun 21, 2023	Benchmarking, Earth Observation	Code Available
Diverse Community Data for Benchmarking Data Privacy Algorithms	Jun 20, 2023	Benchmarking	Unverified
Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction	Jun 20, 2023	Benchmarking, Document-level Relation Extraction	Code Available
Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management	Jun 19, 2023	Benchmarking, Deep Reinforcement Learning	Unverified
Fairness Index Measures to Evaluate Bias in Biometric Recognition	Jun 19, 2023	Benchmarking, Fairness	Unverified
Using Motif Transitions for Temporal Graph Generation	Jun 19, 2023	Benchmarking, Graph Generation	Code Available
Formal Covariate Benchmarking to Bound Omitted Variable Bias	Jun 18, 2023	Benchmarking, Sensitivity	Unverified
MA-BBOB: Many-Affine Combinations of BBOB Functions for Evaluating AutoML Approaches in Noiseless Numerical Black-Box Optimization Contexts	Jun 18, 2023	AutoML, Benchmarking	Unverified
Benchmarking Deep Learning Architectures for Urban Vegetation Point Cloud Semantic Segmentation from MLS	Jun 17, 2023	Benchmarking, Segmentation	Unverified
Framework and Benchmarks for Combinatorial and Mixed-variable Bayesian Optimization	Jun 16, 2023	Bayesian Optimization, Benchmarking	Unverified
ALP: Action-Aware Embodied Learning for Perception	Jun 16, 2023	Benchmarking, object-detection	Unverified
Acoustic Identification of Ae. aegypti Mosquitoes using Smartphone Apps and Residual Convolutional Neural Networks	Jun 16, 2023	Benchmarking	Code Available
Convolutional and Deep Learning based techniques for Time Series Ordinal Classification	Jun 16, 2023	Benchmarking, Ordinal Classification	Unverified
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion	Jun 15, 2023	Benchmarking, counterfactual	Unverified
One Law, Many Languages: Benchmarking Multilingual Legal Reasoning for Judicial Support	Jun 15, 2023	Benchmarking, Information Retrieval	Code Available
Large-Scale Quantum Separability Through a Reproducible Machine Learning Lens	Jun 15, 2023	Benchmarking	Unverified
DISC: a Dataset for Integrated Sensing and Communication in mmWave Systems	Jun 15, 2023	Activity Recognition, Benchmarking	Unverified
DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning	Jun 15, 2023	Benchmarking, Conversational Question Answering	Unverified
BED: Bi-Encoder-Based Detectors for Out-of-Distribution Detection	Jun 15, 2023	Benchmarking, Out-of-Distribution Detection	Code Available
Re-Benchmarking Pool-Based Active Learning for Binary Classification	Jun 15, 2023	Active Learning, Benchmarking	Code Available
RRSIS: Referring Remote Sensing Image Segmentation	Jun 14, 2023	Benchmarking, Image Segmentation	Unverified
MUBen: Benchmarking the Uncertainty of Molecular Representation Models	Jun 14, 2023	Benchmarking, Drug Discovery	Code Available
A Cloud-based Machine Learning Pipeline for the Efficient Extraction of Insights from Customer Reviews	Jun 13, 2023	Benchmarking, Keyword Extraction	Unverified
detrex: Benchmarking Detection Transformers	Jun 12, 2023	Benchmarking, object-detection	Unverified
Contribution à l'Optimisation d'un Comportement Collectif pour un Groupe de Robots Autonomes	Jun 10, 2023	Benchmarking, Diversity	Unverified
A Large-Scale Analysis on Self-Supervised Video Representation Learning	Jun 9, 2023	Benchmarking, Representation Learning	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified