Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 4251–4275 of 5548 papers

Title	Date	Tasks	Status	Hype
Extensible Logging and Empirical Attainment Function for IOHexperimenter	Sep 28, 2021	Benchmarking	—Unverified	0
Context-guided Triple Matching for Multiple Choice Question Answering	Sep 27, 2021	BenchmarkingMultiple-choice	—Unverified	0
PASS: An ImageNet replacement for self-supervised pretraining without humans	Sep 27, 2021	BenchmarkingEthics	CodeCode Available	1
FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding	Sep 27, 2021	BenchmarkingNatural Language Understanding	CodeCode Available	1
Disentangled Feature Representation for Few-shot Image Classification	Sep 26, 2021	BenchmarkingClassification	CodeCode Available	1
MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning	Sep 26, 2021	BenchmarkingDecision Making	CodeCode Available	2
Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation	Sep 26, 2021	BenchmarkingMachine Translation	—Unverified	0
Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System	Sep 23, 2021	BenchmarkingResponse Generation	CodeCode Available	1
Benchmarking Lane-changing Decision-making for Deep Reinforcement Learning	Sep 22, 2021	Autonomous DrivingBenchmarking	—Unverified	0
Benchmarking Augmentation Methods for Learning Robust Navigation Agents: the Winning Entry of the 2021 iGibson Challenge	Sep 22, 2021	BenchmarkingData Augmentation	—Unverified	0
SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and Benchmarking	Sep 21, 2021	Benchmarking	CodeCode Available	1
Efficiently solving the thief orienteering problem with a max-min ant colony optimization approach	Sep 21, 2021	Benchmarking	CodeCode Available	0
A Novel Cluster Detection of COVID-19 Patients and Medical Disease Conditions Using Improved Evolutionary Clustering Algorithm Star	Sep 20, 2021	BenchmarkingClustering	—Unverified	0
Hybrid Transceiver Design for Tera-Hertz MIMO Systems Relying on Bayesian Learning Aided Sparse Channel Estimation	Sep 20, 2021	Benchmarking	—Unverified	0
AI Accelerator Survey and Trends	Sep 18, 2021	BenchmarkingComputational Efficiency	CodeCode Available	1
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs	Sep 18, 2021	BenchmarkingComplex Query Answering	CodeCode Available	1
Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics	Sep 17, 2021	AttributeBenchmarking	—Unverified	0
DiS-ReX: A Multilingual Dataset for Distantly Supervised Relation Extraction	Sep 17, 2021	BenchmarkingRelation	—Unverified	0
WiSoSuper: Benchmarking Super-Resolution Methods on Wind and Solar Data	Sep 17, 2021	BenchmarkingBIG-bench Machine Learning	—Unverified	0
Messing Up 3D Virtual Environments: Transferable Adversarial 3D Objects	Sep 17, 2021	BenchmarkingBIG-bench Machine Learning	CodeCode Available	0
Benchmarking Feature-based Algorithm Selection Systems for Black-box Numerical Optimization	Sep 17, 2021	Benchmarking	CodeCode Available	0
A Survey on Temporal Sentence Grounding in Videos	Sep 16, 2021	Action LocalizationBenchmarking	—Unverified	0
OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication	Sep 16, 2021	3D Object DetectionBenchmarking	CodeCode Available	1
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset	Sep 16, 2021	BenchmarkingKnowledge Base Population	CodeCode Available	1
Benchmarking the Spectrum of Agent Capabilities	Sep 14, 2021	Benchmarking	CodeCode Available	1

Show:10 25 50

← PrevPage 171 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified