Benchmarking

Papers

Showing 2951–2975 of 5548 papers

Title	Date	Tasks	Status	Hype
Demographic Parity: Mitigating Biases in Real-World Data	Sep 27, 2023	Benchmarking	Unverified	0
NLPBench: Evaluating Large Language Models on Solving NLP Problems	Sep 27, 2023	Benchmarking, Math	Code Available	1
A Content-Driven Micro-Video Recommendation Dataset at Scale	Sep 27, 2023	Benchmarking, Recommendation Systems	Code Available	2
Unified Long-Term Time-Series Forecasting Benchmark	Sep 27, 2023	Benchmarking, Time Series	Code Available	1
Node-Aligned Graph-to-Graph (NAG2G): Elevating Template-Free Deep Learning Approaches in Single-Step Retrosynthesis	Sep 27, 2023	Benchmarking, Graph Generation	Code Available	1
Advancing The Rate-Distortion-Computation Frontier For Neural Image Compression	Sep 26, 2023	Benchmarking, Image Compression	Unverified	0
A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning	Sep 26, 2023	Benchmarking, Multi-Objective Reinforcement Learning	Code Available	2
Thalamic nuclei segmentation from T_1-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts	Sep 26, 2023	Benchmarking, Segmentation	Unverified	0
On quantifying and improving realism of images generated with diffusion	Sep 26, 2023	Attribute, Benchmarking	Unverified	0
Optimization Techniques for a Physical Model of Human Vocalisation	Sep 26, 2023	Benchmarking	Unverified	0
Benchmarking Local Robustness of High-Accuracy Binary Neural Networks for Enhanced Traffic Sign Recognition	Sep 25, 2023	Autonomous Driving, Benchmarking	Code Available	1
Efficient Pauli channel estimation with logarithmic quantum memory	Sep 25, 2023	Benchmarking	Unverified	0
Machine-assisted quantitizing designs: augmenting humanities and social sciences with artificial intelligence	Sep 24, 2023	Benchmarking, Change Detection	Code Available	0
Categorization and analysis of 14 computational methods for estimating cell potency from single-cell RNA-seq data	Sep 24, 2023	Benchmarking	Unverified	0
Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Shape Reconstruction	Sep 24, 2023	3D Shape Reconstruction, Anatomy	Code Available	1
VisionKG: Unleashing the Power of Visual Datasets via Knowledge Graph	Sep 24, 2023	Benchmarking, Knowledge Graphs	Unverified	0
Grad DFT: a software library for machine learning enhanced density functional theory	Sep 23, 2023	Benchmarking	Code Available	1
Turbulence in Focus: Benchmarking Scaling Behavior of 3D Volumetric Super-Resolution with BLASTNet 2.0 Data	Sep 23, 2023	Benchmarking, Super-Resolution	Unverified	0
Domain Adaptation for Arabic Machine Translation: The Case of Financial Texts	Sep 22, 2023	Articles, Benchmarking	Unverified	0
Benchmarking quantized LLaMa-based models on the Brazilian Secondary School Exam	Sep 21, 2023	Benchmarking, Computational Efficiency	Unverified	0
Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation	Sep 21, 2023	Benchmarking, Classification	Code Available	1
Multimodal Deep Learning for Scientific Imaging Interpretation	Sep 21, 2023	Articles, Benchmarking	Unverified	0
On the relationship between Benchmarking, Standards and Certification in Robotics and AI	Sep 21, 2023	Benchmarking	Unverified	0
Towards Effective Disambiguation for Machine Translation with Large Language Models	Sep 20, 2023	Benchmarking, In-Context Learning	Unverified	0
An Evaluation of Machine Learning Approaches for Early Diagnosis of Autism Spectrum Disorder	Sep 20, 2023	Benchmarking, Clustering	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified