Benchmarking

Papers

Showing 4251–4300 of 5548 papers

Title	Date	Tasks	Status	Hype
Extensible Logging and Empirical Attainment Function for IOHexperimenter	Sep 28, 2021	Benchmarking	Unverified	0
Context-guided Triple Matching for Multiple Choice Question Answering	Sep 27, 2021	Benchmarking, Multiple-choice	Unverified	0
PASS: An ImageNet replacement for self-supervised pretraining without humans	Sep 27, 2021	Benchmarking, Ethics	Code Available	1
FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding	Sep 27, 2021	Benchmarking, Natural Language Understanding	Code Available	1
Disentangled Feature Representation for Few-shot Image Classification	Sep 26, 2021	Benchmarking, Classification	Code Available	1
MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning	Sep 26, 2021	Benchmarking, Decision Making	Code Available	2
Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation	Sep 26, 2021	Benchmarking, Machine Translation	Unverified	0
Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System	Sep 23, 2021	Benchmarking, Response Generation	Code Available	1
Benchmarking Lane-changing Decision-making for Deep Reinforcement Learning	Sep 22, 2021	Autonomous Driving, Benchmarking	Unverified	0
Benchmarking Augmentation Methods for Learning Robust Navigation Agents: the Winning Entry of the 2021 iGibson Challenge	Sep 22, 2021	Benchmarking, Data Augmentation	Unverified	0
SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and Benchmarking	Sep 21, 2021	Benchmarking	Code Available	1
Efficiently solving the thief orienteering problem with a max-min ant colony optimization approach	Sep 21, 2021	Benchmarking	Code Available	0
A Novel Cluster Detection of COVID-19 Patients and Medical Disease Conditions Using Improved Evolutionary Clustering Algorithm Star	Sep 20, 2021	Benchmarking, Clustering	Unverified	0
Hybrid Transceiver Design for Tera-Hertz MIMO Systems Relying on Bayesian Learning Aided Sparse Channel Estimation	Sep 20, 2021	Benchmarking	Unverified	0
AI Accelerator Survey and Trends	Sep 18, 2021	Benchmarking, Computational Efficiency	Code Available	1
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs	Sep 18, 2021	Benchmarking, Complex Query Answering	Code Available	1
Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics	Sep 17, 2021	Attribute, Benchmarking	Unverified	0
DiS-ReX: A Multilingual Dataset for Distantly Supervised Relation Extraction	Sep 17, 2021	Benchmarking, Relation	Unverified	0
WiSoSuper: Benchmarking Super-Resolution Methods on Wind and Solar Data	Sep 17, 2021	Benchmarking, BIG-bench Machine Learning	Unverified	0
Messing Up 3D Virtual Environments: Transferable Adversarial 3D Objects	Sep 17, 2021	Benchmarking, BIG-bench Machine Learning	Code Available	0
Benchmarking Feature-based Algorithm Selection Systems for Black-box Numerical Optimization	Sep 17, 2021	Benchmarking	Code Available	0
A Survey on Temporal Sentence Grounding in Videos	Sep 16, 2021	Action Localization, Benchmarking	Unverified	0
OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication	Sep 16, 2021	3D Object Detection, Benchmarking	Code Available	1
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset	Sep 16, 2021	Benchmarking, Knowledge Base Population	Code Available	1
Benchmarking the Spectrum of Agent Capabilities	Sep 14, 2021	Benchmarking	Code Available	1
A Continuous Optimisation Benchmark Suite from Neural Network Regression	Sep 12, 2021	Benchmarking, Evolutionary Algorithms	Code Available	0
RobustART: Benchmarking Robustness on Architecture Design and Training Techniques	Sep 11, 2021	Adversarial Robustness, Benchmarking	Code Available	1
Benchmarking Processor Performance by Multi-Threaded Machine Learning Algorithms	Sep 11, 2021	Benchmarking, BIG-bench Machine Learning	Unverified	0
Application of DEA in International Market Selection for the export of products from Spain	Sep 10, 2021	Benchmarking, Decision Making	Unverified	0
A framework for benchmarking uncertainty in deep regression	Sep 10, 2021	Benchmarking, regression	Unverified	0
Characterization of Constrained Continuous Multiobjective Optimization Problems: A Feature Space Perspective	Sep 9, 2021	Benchmarking, Multiobjective Optimization	Unverified	0
CrowdDriven: A New Challenging Dataset for Outdoor Visual Localization	Sep 9, 2021	Benchmarking, Self-Driving Cars	Unverified	0
Towards Efficient Synchronous Federated Training: A Survey on System Optimization Strategies	Sep 9, 2021	Benchmarking, Federated Learning	Code Available	0
Resistive Neural Hardware Accelerators	Sep 8, 2021	Benchmarking	Unverified	0
Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking	Sep 8, 2021	Benchmarking, Diversity	Code Available	2
Fine-grained Hand Gesture Recognition in Multi-viewpoint Hand Hygiene	Sep 7, 2021	Benchmarking, Fine-Grained Image Recognition	Code Available	0
Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica	Sep 6, 2021	Benchmarking	Code Available	1
Scikit-dimension: a Python package for intrinsic dimension estimation	Sep 6, 2021	Benchmarking	Code Available	1
Biomedical Data-to-Text Generation via Fine-Tuning Transformers	Sep 3, 2021	Benchmarking, Data-to-Text Generation	Code Available	1
Benchmarking the Robustness of Instance Segmentation Models	Sep 2, 2021	Benchmarking, Domain Adaptation	Unverified	0
Towards Sentiment Analysis of Tobacco Products’ Usage in Social Media	Sep 1, 2021	Benchmarking, Sentiment Analysis	Unverified	0
Benchmarking down-scaled (not so large) pre-trained language models	Sep 1, 2021	Benchmarking	Code Available	0
ReMeDi: Resources for Multi-domain, Multi-service, Medical Dialogues	Sep 1, 2021	Benchmarking, Contrastive Learning	Code Available	1
Cross-Lingual Text Classification of Transliterated Hindi and Malayalam	Aug 31, 2021	Benchmarking, Classification	Code Available	0
Europarl-ASR: A Large Corpus of Parliamentary Debates for Streaming ASR Benchmarking and Speech Data Filtering/Verbatimization	Aug 30, 2021	Benchmarking, Data Augmentation	Unverified	0
Benchmarking the Accuracy and Robustness of Feedback Alignment Algorithms	Aug 30, 2021	Benchmarking	Unverified	0
Semi-Supervised Exaggeration Detection of Health Science Press Releases	Aug 30, 2021	Articles, Benchmarking	Code Available	1
Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification	Aug 30, 2021	Benchmarking, image-classification	Code Available	1
BioFors: A Large Biomedical Image Forensics Dataset	Aug 30, 2021	Benchmarking, Image Forensics	Code Available	0
KO codes: Inventing Nonlinear Encoding and Decoding for Reliable Wireless Communication via Deep-learning	Aug 29, 2021	Benchmarking, Decoder	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified