Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 4551–4575 of 5548 papers

Title	Date	Tasks	Status
Improvements & Evaluations on the MLCommons CloudMask Benchmark	Mar 7, 2024	Benchmarking	CodeCode Available
The current state of single-cell proteomics data analysis	Oct 3, 2022	Benchmarking	CodeCode Available
Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation	Nov 11, 2024	16kBenchmarking	CodeCode Available
BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation	Jan 27, 2021	BenchmarkingText Generation	CodeCode Available
Improve Machine Learning carbon footprint using Parquet dataset format and Mixed Precision training for regression models -- Part II	Sep 17, 2024	BenchmarkingDescriptive	CodeCode Available
BN-AuthProf: Benchmarking Machine Learning for Bangla Author Profiling on Social Media Texts	Dec 3, 2024	Age And Gender ClassificationAge and Gender Estimation	CodeCode Available
LLM Benchmarking with LLaMA2: Evaluating Code Development Performance Across Multiple Programming Languages	Mar 24, 2025	Benchmarking	CodeCode Available
Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification models -- Part I	Sep 12, 2024	BenchmarkingCPU	CodeCode Available
LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts	Sep 5, 2024	Benchmarking	CodeCode Available
Improved Target-specific Stance Detection on Social Media Platforms by Delving into Conversation Threads	Nov 6, 2022	BenchmarkingOpinion Mining	CodeCode Available
BLESS: Benchmarking Large Language Models on Sentence Simplification	Oct 24, 2023	BenchmarkingDiversity	CodeCode Available
Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction	Oct 20, 2021	BenchmarkingLanguage Modeling	CodeCode Available
Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification	Apr 23, 2024	BenchmarkingHyperspectral Image Classification	CodeCode Available
BanglaNLP at BLP-2023 Task 2: Benchmarking different Transformer Models for Sentiment Analysis of Bangla Social Media Posts	Oct 13, 2023	BenchmarkingSentiment Analysis	CodeCode Available
LLM Performance for Code Generation on Noisy Tasks	May 29, 2025	BenchmarkingCode Generation	CodeCode Available
ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge	Jun 17, 2025	BenchmarkingRetrieval	CodeCode Available
A Dataset for Web-Scale Knowledge Base Population	Jun 3, 2018	BenchmarkingKnowledge Base Population	CodeCode Available
The Devil is in the Prompts: De-Identification Traces Enhance Memorization Risks in Synthetic Chest X-Ray Generation	Feb 11, 2025	BenchmarkingDe-identification	CodeCode Available
Impact of ImageNet Model Selection on Domain Adaptation	Feb 6, 2020	BenchmarkingDomain Adaptation	CodeCode Available
Immunofluorescence Capillary Imaging Segmentation: Cases Study	Jul 14, 2022	BenchmarkingImage Segmentation	CodeCode Available
Analyzing the Feature Extractor Networks for Face Image Synthesis	Jun 4, 2024	BenchmarkingImage Generation	CodeCode Available
ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning	Sep 30, 2024	BenchmarkingDisparity Estimation	CodeCode Available
LLpowershap: Logistic Loss-based Automated Shapley Values Feature Selection Method	Jan 23, 2024	BenchmarkingFairness	CodeCode Available
Revisiting and Benchmarking Graph Autoencoders: A Contrastive Learning Perspective	Oct 14, 2024	BenchmarkingContrastive Learning	CodeCode Available
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions	Dec 11, 2024	BenchmarkingQuestion Answering	CodeCode Available

Show:10 25 50

← PrevPage 183 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified