Benchmarking

Papers

Showing 4251–4300 of 5548 papers

Title	Date	Tasks	Status
RCC-GAN: Regularized Compound Conditional GAN for Large-Scale Tabular Data Synthesis	May 24, 2022	Benchmarking, Generative Adversarial Network	Unverified
Advanced Manufacturing Configuration by Sample-efficient Batch Bayesian Optimization	May 24, 2022	Bayesian Optimization, Benchmarking	Unverified
Graph-theoretical approach to robust 3D normal extraction of LiDAR data	May 23, 2022	Benchmarking	Code Available
Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets	May 23, 2022	Argument Mining, Benchmarking	Code Available
Generalization, Mayhems and Limits in Recurrent Proximal Policy Optimization	May 23, 2022	Benchmarking, Deep Reinforcement Learning	Unverified
Paddy Doctor: A Visual Image Dataset for Automated Paddy Disease Classification and Benchmarking	May 23, 2022	Benchmarking, Classification	Unverified
Deep Learning vs. Gradient Boosting: Benchmarking state-of-the-art machine learning algorithms for credit scoring	May 21, 2022	Benchmarking, Binary Classification	Unverified
Self-Supervised Speech Representation Learning: A Review	May 21, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
SNaC: Coherence Error Detection for Narrative Summarization	May 19, 2022	Benchmarking, Coherence Evaluation	Code Available
Entity Alignment For Knowledge Graphs: Progress, Challenges, and Empirical Studies	May 18, 2022	Benchmarking, Entity Alignment	Unverified
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data	May 16, 2022	Accented Speech Recognition, Benchmarking	Unverified
Uncertainty estimation for Cross-dataset performance in Trajectory prediction	May 15, 2022	Benchmarking, Prediction	Unverified
Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking	May 13, 2022	Benchmarking, reinforcement-learning	Unverified
Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages	May 12, 2022	Benchmarking, Diversity	Unverified
Individual Fairness Guarantees for Neural Networks	May 11, 2022	Benchmarking, Fairness	Code Available
Subspace Learning Machine (SLM): Methodology and Performance	May 11, 2022	Benchmarking	Unverified
Towards Intersectionality in Machine Learning: Including More Identities, Handling Underrepresentation, and Performing Evaluation	May 10, 2022	Attribute, Benchmarking	Code Available
LayoutXLM vs. GNN: An Empirical Evaluation of Relation Extraction for Documents	May 9, 2022	Benchmarking, Graph Neural Network	Unverified
Assigning Species Information to Corresponding Genes by a Sequence Labeling Framework	May 8, 2022	Benchmarking, Binary Classification	Code Available
Design Target Achievement Index: A Differentiable Metric to Enhance Deep Generative Models in Multi-Objective Inverse Design	May 6, 2022	Benchmarking	Unverified
VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution	May 6, 2022	Benchmarking, Speaker Identification	Unverified
Surface Reconstruction from Point Clouds: A Survey and a Benchmark	May 5, 2022	Benchmarking, Surface Reconstruction	Unverified
Learn-to-Race Challenge 2022: Benchmarking Safe Learning and Cross-domain Generalisation in Autonomous Racing	May 5, 2022	Autonomous Driving, Autonomous Racing	Unverified
On Continual Model Refinement in Out-of-Distribution Data Streams	May 4, 2022	Benchmarking, Continual Learning	Unverified
Training Mixed-Domain Translation Models via Federated Learning	May 3, 2022	Benchmarking, Federated Learning	Unverified
MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization	May 1, 2022	Benchmarking, dialogue summary	Code Available
MMCoQA: Conversational Question Answering over Text, Tables, and Images	May 1, 2022	Benchmarking, Conversational Question Answering	Code Available
Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension	May 1, 2022	Benchmarking, Question Answering	Unverified
To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo	May 1, 2022	Benchmarking, Person-centric Visual Grounding	Code Available
Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection	May 1, 2022	Benchmarking, Hate Speech Detection	Code Available
Answer Consolidation: Formulation and Benchmarking	Apr 29, 2022	Benchmarking, Question Answering	Code Available
Foundations for learning from noisy quantum experiments	Apr 28, 2022	Benchmarking	Unverified
Watts: Infrastructure for Open-Ended Learning	Apr 28, 2022	Benchmarking	Code Available
A Collection of Quality Diversity Optimization Problems Derived from Hyperparameter Optimization of Machine Learning Models	Apr 28, 2022	Benchmarking, Diversity	Code Available
Benchmarking the Hooke-Jeeves Method, MTS-LS1, and BSrr on the Large-scale BBOB Function Set	Apr 28, 2022	Benchmarking	Code Available
Deeper Insights into the Robustness of ViTs towards Common Corruptions	Apr 26, 2022	Benchmarking, Data Augmentation	Unverified
Causal Reasoning Meets Visual Representation Learning: A Prospective Study	Apr 26, 2022	Benchmarking, Out-of-Distribution Generalization	Unverified
Label Anchored Contrastive Learning for Language Understanding	Apr 26, 2022	Benchmarking, Contrastive Learning	Unverified
Transformation-Interaction-Rational Representation for Symbolic Regression	Apr 25, 2022	Benchmarking, Form	Code Available
MOLE: Digging Tunnels Through Multimodal Multi-Objective Landscapes	Apr 22, 2022	Benchmarking	Code Available
Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics	Apr 21, 2022	Attribute, Benchmarking	Unverified
Changepoint Detection in Noisy Data Using a Novel Residuals Permutation-Based Method (RESPERM): Benchmarking and Application to Single Trial ERPs	Apr 21, 2022	Benchmarking, Change Point Detection	Code Available
Learning to Fold Real Garments with One Arm: A Case Study in Cloud-Based Robotics Research	Apr 21, 2022	Benchmarking, Diversity	Unverified
Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations	Apr 20, 2022	Articles, Benchmarking	Unverified
Analyzing the Impact of Undersampling on the Benchmarking and Configuration of Evolutionary Algorithms	Apr 20, 2022	Benchmarking, Evolutionary Algorithms	Unverified
Label Efficient Regularization and Propagation for Graph Node Classification	Apr 19, 2022	Attribute, Benchmarking	Unverified
Radio Galaxy Zoo: Using semi-supervised learning to leverage large unlabelled data-sets for radio galaxy classification under data-set shift	Apr 19, 2022	Benchmarking, Classification	Code Available
Benchmarking Domain Generalization on EEG-based Emotion Recognition	Apr 18, 2022	Benchmarking, Domain Adaptation	Unverified
SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos	Apr 14, 2022	Benchmarking, Multiple Object Tracking	Unverified
From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks	Apr 14, 2022	Adversarial Attack, Adversarial Robustness	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified