Benchmarking

Papers

Showing 4026–4050 of 5548 papers

Title	Date	Tasks	Status	Hype
SUTD-PRCM Dataset and Neural Architecture Search Approach for Complex Metasurface Design	Feb 24, 2022	Benchmarking, image-classification	Unverified	0
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models	Feb 24, 2022	Benchmarking, Diagnostic	Unverified	0
Benchmarking Generative Latent Variable Models for Speech	Feb 22, 2022	Benchmarking, Image Generation	Code Available	0
Evaluating Feature Attribution Methods in the Image Domain	Feb 22, 2022	Benchmarking	Code Available	0
Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch	Feb 20, 2022	Benchmarking	Code Available	0
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective	Feb 18, 2022	Benchmarking, BIG-bench Machine Learning	Code Available	0
Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks	Feb 18, 2022	Benchmarking, Deep Learning	Unverified	0
MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery	Feb 18, 2022	Benchmarking, Representation Learning	Code Available	1
Benchmarking missing-values approaches for predictive models on health databases	Feb 17, 2022	Attribute, Benchmarking	Code Available	0
On loss functions and evaluation metrics for music source separation	Feb 16, 2022	Audio Source Separation, Benchmarking	Unverified	0
Benchmarking of DL Libraries and Models on Mobile Devices	Feb 14, 2022	Benchmarking, GPU	Code Available	1
Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens	Feb 14, 2022	Benchmarking, Handwriting Recognition	Unverified	0
Benchmarking Robot Manipulation with the Rubik's Cube	Feb 14, 2022	Benchmarking, Robot Manipulation	Unverified	0
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts	Feb 14, 2022	Benchmarking	Code Available	1
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark	Feb 14, 2022	Benchmarking, Contrastive Learning	Code Available	0
Dual Task Framework for Improving Persona-grounded Dialogue Dataset	Feb 11, 2022	Benchmarking	Unverified	0
High Fidelity RF Clutter Modeling and Simulation	Feb 10, 2022	Benchmarking, Vocal Bursts Intensity Prediction	Unverified	0
Lightweight Jet Reconstruction and Identification as an Object Detection Task	Feb 9, 2022	Benchmarking, object-detection	Unverified	0
BIQ2021: A Large-Scale Blind Image Quality Assessment Database	Feb 8, 2022	Benchmarking, Blind Image Quality Assessment	Unverified	0
ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on Multiagent Dual-core Learning	Feb 8, 2022	Benchmarking, Language Modelling	Code Available	1
Comparative Study Between Distance Measures On Supervised Optimum-Path Forest Classification	Feb 8, 2022	Anomaly Detection, Benchmarking	Code Available	0
What are the best systems? New perspectives on NLP Benchmarking	Feb 8, 2022	Benchmarking	Code Available	1
RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro	Feb 7, 2022	Benchmarking, Model Optimization	Code Available	1
Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration	Feb 7, 2022	Benchmarking, Evolutionary Algorithms	Code Available	0
Benchmarking and Analyzing Point Cloud Classification under Corruptions	Feb 7, 2022	Benchmarking, Classification	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified