Benchmarking

Papers

Showing 1251–1300 of 5548 papers

Title	Date	Tasks	Status	Hype
Chaos as an interpretable benchmark for forecasting and data-driven modelling	Oct 11, 2021	Benchmarking, Symbolic Regression	Code Available	1
SERAB: A multi-lingual benchmark for speech emotion recognition	Oct 7, 2021	Benchmarking, Emotion Recognition	Code Available	1
EntQA: Entity Linking as Question Answering	Oct 5, 2021	Benchmarking, Entity Linking	Code Available	1
Revisiting Self-Training for Few-Shot Learning of Language Model	Oct 4, 2021	Benchmarking, Few-Shot Learning	Code Available	1
Machine Learning with Knowledge Constraints for Process Optimization of Open-Air Perovskite Solar Cell Manufacturing	Oct 1, 2021	Bayesian Optimization, Benchmarking	Code Available	1
Phonetic Word Embeddings	Sep 30, 2021	Benchmarking, Word Embeddings	Code Available	1
MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation	Sep 29, 2021	Benchmarking, Philosophy	Code Available	1
Benchmarking Graph Neural Networks on Dynamic Link Prediction	Sep 29, 2021	Benchmarking, Dynamic Link Prediction	Code Available	1
"How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations	Sep 28, 2021	Benchmarking, Dialogue State Tracking	Code Available	1
FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding	Sep 27, 2021	Benchmarking, Natural Language Understanding	Code Available	1
PASS: An ImageNet replacement for self-supervised pretraining without humans	Sep 27, 2021	Benchmarking, Ethics	Code Available	1
Disentangled Feature Representation for Few-shot Image Classification	Sep 26, 2021	Benchmarking, Classification	Code Available	1
Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System	Sep 23, 2021	Benchmarking, Response Generation	Code Available	1
SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and Benchmarking	Sep 21, 2021	Benchmarking	Code Available	1
AI Accelerator Survey and Trends	Sep 18, 2021	Benchmarking, Computational Efficiency	Code Available	1
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs	Sep 18, 2021	Benchmarking, Complex Query Answering	Code Available	1
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset	Sep 16, 2021	Benchmarking, Knowledge Base Population	Code Available	1
OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication	Sep 16, 2021	3D Object Detection, Benchmarking	Code Available	1
Benchmarking the Spectrum of Agent Capabilities	Sep 14, 2021	Benchmarking	Code Available	1
RobustART: Benchmarking Robustness on Architecture Design and Training Techniques	Sep 11, 2021	Adversarial Robustness, Benchmarking	Code Available	1
Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica	Sep 6, 2021	Benchmarking	Code Available	1
Scikit-dimension: a Python package for intrinsic dimension estimation	Sep 6, 2021	Benchmarking	Code Available	1
Biomedical Data-to-Text Generation via Fine-Tuning Transformers	Sep 3, 2021	Benchmarking, Data-to-Text Generation	Code Available	1
ReMeDi: Resources for Multi-domain, Multi-service, Medical Dialogues	Sep 1, 2021	Benchmarking, Contrastive Learning	Code Available	1
Semi-Supervised Exaggeration Detection of Health Science Press Releases	Aug 30, 2021	Articles, Benchmarking	Code Available	1
Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification	Aug 30, 2021	Benchmarking, image-classification	Code Available	1
KO codes: Inventing Nonlinear Encoding and Decoding for Reliable Wireless Communication via Deep-learning	Aug 29, 2021	Benchmarking, Decoder	Code Available	1
Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution	Aug 29, 2021	Benchmarking	Code Available	1
Pulling Up by the Causal Bootstraps: Causal Data Augmentation for Pre-training Debiasing	Aug 27, 2021	Benchmarking, Data Augmentation	Code Available	1
A Unified Taxonomy and Multimodal Dataset for Events in Invasion Games	Aug 25, 2021	Benchmarking, Video Classification	Code Available	1
Generative Wind Power Curve Modeling Via Machine Vision: A Self-learning Deep Convolutional Network Based Method	Aug 19, 2021	Benchmarking, Synthetic Data Generation	Code Available	1
SSH: A Self-Supervised Framework for Image Harmonization	Aug 15, 2021	Benchmarking, Data Augmentation	Code Available	1
A Dataset for Answering Time-Sensitive Questions	Aug 13, 2021	Benchmarking	Code Available	1
Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate	Aug 12, 2021	Benchmarking	Code Available	1
A Systematic Benchmarking Analysis of Transfer Learning for Medical Image Analysis	Aug 12, 2021	Benchmarking, Medical Image Analysis	Code Available	1
Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach	Aug 5, 2021	Benchmarking	Code Available	1
CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms	Aug 2, 2021	Benchmarking, counterfactual	Code Available	1
Quantum machine learning of large datasets using randomized measurements	Aug 2, 2021	Benchmarking, BIG-bench Machine Learning	Code Available	1
Benchmarking: Past, Present and Future	Aug 1, 2021	Benchmarking, Reading Comprehension	Code Available	1
Contemporary Symbolic Regression Methods and their Relative Performance	Jul 29, 2021	Benchmarking, parameter estimation	Code Available	1
A multi-schematic classifier-independent oversampling approach for imbalanced datasets	Jul 15, 2021	Benchmarking	Code Available	1
Hierarchical graph neural nets can capture long-range interactions	Jul 15, 2021	Benchmarking, Molecular Property Prediction	Code Available	1
Generative and reproducible benchmarks for comprehensive evaluation of machine learning classifiers	Jul 14, 2021	Benchmarking, BIG-bench Machine Learning	Code Available	1
MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition	Jul 12, 2021	Benchmarking, Chinese Named Entity Recognition	Code Available	1
Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT	Jul 9, 2021	Benchmarking, Document Classification	Code Available	1
Benchpress: A Scalable and Versatile Workflow for Benchmarking Structure Learning Algorithms	Jul 8, 2021	Benchmarking	Code Available	1
The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification	Jul 5, 2021	Benchmarking, Brain Tumor Segmentation	Code Available	1
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning	Jul 2, 2021	Benchmarking, Causal Discovery	Code Available	1
Benchmarking Knowledge-driven Zero-shot Learning	Jun 29, 2021	Attribute, Benchmarking	Code Available	1
Kimera-Multi: Robust, Distributed, Dense Metric-Semantic SLAM for Multi-Robot Systems	Jun 28, 2021	3D Reconstruction, Benchmarking	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified