Benchmarking

Papers

Showing 3001–3025 of 5548 papers

Title	Date	Tasks	Status	Hype
Are SNNs Truly Energy-efficient? - A Hardware Perspective	Sep 6, 2023	Benchmarking	Unverified	0
AGIBench: A Multi-granularity, Multimodal, Human-referenced, Auto-scoring Benchmark for Large Language Models	Sep 5, 2023	Benchmarking, Zero-Shot Learning	Unverified	0
A skeletonization algorithm for gradient-based optimization	Sep 5, 2023	Benchmarking, Deep Learning	Code Available	1
A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking	Sep 5, 2023	Benchmarking, Knowledge Distillation	Unverified	0
Transfer Learning between Motor Imagery Datasets using Deep Learning -- Validation of Framework and Comparison of Datasets	Sep 4, 2023	Benchmarking, Motor Imagery	Code Available	0
Benchmarking Large Language Models in Retrieval-Augmented Generation	Sep 4, 2023	Benchmarking, counterfactual	Code Available	2
Hybrid data driven/thermal simulation model for comfort assessment	Sep 4, 2023	Benchmarking	Unverified	0
Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation	Sep 4, 2023	Benchmarking	Code Available	1
Orientation-Independent Chinese Text Recognition in Scene Images	Sep 3, 2023	Benchmarking, Image Reconstruction	Code Available	2
FOR-instance: a UAV laser scanning benchmark dataset for semantic and instance segmentation of individual trees	Sep 3, 2023	Benchmarking, Instance Segmentation	Unverified	0
Holistic Dynamic Frequency Transformer for Image Fusion and Exposure Correction	Sep 3, 2023	Benchmarking, Exposure Correction	Unverified	0
NeMig -- A Bilingual News Collection and Knowledge Graph about Migration	Sep 1, 2023	Articles, Benchmarking	Code Available	0
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning	Sep 1, 2023	Benchmarking, Federated Learning	Unverified	0
Can humans help BERT gain "confidence"?	Aug 31, 2023	Benchmarking, EEG	Unverified	0
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering	Aug 31, 2023	Benchmarking, Dataset Generation	Code Available	1
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO	Aug 30, 2023	Benchmarking, Reinforcement Learning (RL)	Unverified	0
Benchmarking Multilabel Topic Classification in the Kyrgyz Language	Aug 30, 2023	Benchmarking, Classification	Code Available	0
Benchmarking the Generation of Fact Checking Explanations	Aug 29, 2023	Abstractive Text Summarization, Articles	Code Available	1
Towards quantitative precision for ECG analysis: Leveraging state space models, self-supervision and patient metadata	Aug 29, 2023	Benchmarking, Diagnostic	Code Available	1
Matbench Discovery -- A framework to evaluate machine learning crystal stability predictions	Aug 28, 2023	Benchmarking, Formation Energy	Code Available	3
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads	Aug 28, 2023	Benchmarking, Self-Supervised Learning	Unverified	0
MLLM-DataEngine: An Iterative Refinement Approach for MLLM	Aug 25, 2023	Benchmarking	Code Available	1
Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models	Aug 24, 2023	Action Localization, Benchmarking	Unverified	0
Beyond Document Page Classification: Design, Datasets, and Challenges	Aug 24, 2023	Benchmarking, Classification	Code Available	0
Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations	Aug 23, 2023	Benchmarking, Decoder	Code Available	2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified