Benchmarking

Papers



Title	Date	Tasks	Status
Diverse Community Data for Benchmarking Data Privacy Algorithms	Jun 20, 2023	Benchmarking	Unverified
Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction	Jun 20, 2023	Benchmarking, Document-level Relation Extraction	Code Available
Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management	Jun 19, 2023	Benchmarking, Deep Reinforcement Learning	Unverified
Fairness Index Measures to Evaluate Bias in Biometric Recognition	Jun 19, 2023	Benchmarking, Fairness	Unverified
Using Motif Transitions for Temporal Graph Generation	Jun 19, 2023	Benchmarking, Graph Generation	Code Available
Formal Covariate Benchmarking to Bound Omitted Variable Bias	Jun 18, 2023	Benchmarking, Sensitivity	Unverified
MA-BBOB: Many-Affine Combinations of BBOB Functions for Evaluating AutoML Approaches in Noiseless Numerical Black-Box Optimization Contexts	Jun 18, 2023	AutoML, Benchmarking	Unverified
Benchmarking Deep Learning Architectures for Urban Vegetation Point Cloud Semantic Segmentation from MLS	Jun 17, 2023	Benchmarking, Segmentation	Unverified
Framework and Benchmarks for Combinatorial and Mixed-variable Bayesian Optimization	Jun 16, 2023	Bayesian Optimization, Benchmarking	Unverified
ALP: Action-Aware Embodied Learning for Perception	Jun 16, 2023	Benchmarking, object-detection	Unverified
Acoustic Identification of Ae. aegypti Mosquitoes using Smartphone Apps and Residual Convolutional Neural Networks	Jun 16, 2023	Benchmarking	Code Available
Convolutional and Deep Learning based techniques for Time Series Ordinal Classification	Jun 16, 2023	Benchmarking, Ordinal Classification	Unverified
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion	Jun 15, 2023	Benchmarking, counterfactual	Unverified
One Law, Many Languages: Benchmarking Multilingual Legal Reasoning for Judicial Support	Jun 15, 2023	Benchmarking, Information Retrieval	Code Available
Large-Scale Quantum Separability Through a Reproducible Machine Learning Lens	Jun 15, 2023	Benchmarking	Unverified
DISC: a Dataset for Integrated Sensing and Communication in mmWave Systems	Jun 15, 2023	Activity Recognition, Benchmarking	Unverified
DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning	Jun 15, 2023	Benchmarking, Conversational Question Answering	Unverified
BED: Bi-Encoder-Based Detectors for Out-of-Distribution Detection	Jun 15, 2023	Benchmarking, Out-of-Distribution Detection	Code Available
Re-Benchmarking Pool-Based Active Learning for Binary Classification	Jun 15, 2023	Active Learning, Benchmarking	Code Available
RRSIS: Referring Remote Sensing Image Segmentation	Jun 14, 2023	Benchmarking, Image Segmentation	Unverified
MUBen: Benchmarking the Uncertainty of Molecular Representation Models	Jun 14, 2023	Benchmarking, Drug Discovery	Code Available
A Cloud-based Machine Learning Pipeline for the Efficient Extraction of Insights from Customer Reviews	Jun 13, 2023	Benchmarking, Keyword Extraction	Unverified
detrex: Benchmarking Detection Transformers	Jun 12, 2023	Benchmarking, object-detection	Unverified
Contribution à l'Optimisation d'un Comportement Collectif pour un Groupe de Robots Autonomes	Jun 10, 2023	Benchmarking, Diversity	Unverified
A Large-Scale Analysis on Self-Supervised Video Representation Learning	Jun 9, 2023	Benchmarking, Representation Learning	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified