Benchmarking

Papers


Showing 4926–4950 of 5548 papers

Title	Date	Tasks	Status
Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation	Jul 6, 2019	Benchmarking, Domain Adaptation	Code Available
Zero-shot generation of synthetic neurosurgical data with large language models	Feb 13, 2025	Benchmarking, De-identification	Code Available
Benchmarking Pathology Foundation Models: Adaptation Strategies and Scenarios	Oct 21, 2024	Benchmarking, Few-Shot Learning	Code Available
Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message Passing and Hyperbolic Neural Networks	Mar 6, 2024	Anomaly Detection, Benchmarking	Code Available
Multiple Instance Learning: A Survey of Problem Characteristics and Applications	Dec 11, 2016	Benchmarking, Document Classification	Code Available
Self-Adjusting Weighted Expected Improvement for Bayesian Optimization	Jun 7, 2023	Bayesian Optimization, Benchmarking	Code Available
Multiple Light Source Dataset for Colour Research	Aug 16, 2019	Benchmarking, Image Segmentation	Code Available
Experimental Analysis of Large-scale Learnable Vector Storage Compression	Nov 27, 2023	Benchmarking	Code Available
Benchmarking Parameter Control Methods in Differential Evolution for Mixed-Integer Black-Box Optimization	Apr 4, 2024	Benchmarking	Code Available
ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions	Mar 6, 2025	Benchmarking, HumanEval	Code Available
Benchmarking Domain Adaptation for Chemical Processes on the Tennessee Eastman Process	Aug 22, 2023	Benchmarking, Domain Adaptation	Code Available
AttackSeqBench: Benchmarking Large Language Models' Understanding of Sequential Patterns in Cyber Attacks	Mar 5, 2025	Benchmarking, graph construction	Code Available
Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection	Aug 22, 2023	Benchmarking, Out-of-Distribution Detection	Code Available
exHarmony: Authorship and Citations for Benchmarking the Reviewer Assignment Problem	Feb 11, 2025	Benchmarking, Diversity	Code Available
Benchmarking optimality of time series classification methods in distinguishing diffusions	Jan 30, 2023	Benchmarking, Gaussian Processes	Code Available
ExEBench: Benchmarking Foundation Models on Extreme Earth Events	May 13, 2025	Benchmarking, Management	Code Available
MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering	Feb 24, 2025	Benchmarking, Question Answering	Code Available
Evolving Evolutionary Algorithms with Patterns	Oct 10, 2021	Benchmarking, Evolutionary Algorithms	Code Available
Semantic Hilbert Space for Text Representation Learning	Feb 26, 2019	Benchmarking, General Classification	Code Available
A Continuous Information Gain Measure to Find the Most Discriminatory Problems for AI Benchmarking	Sep 9, 2018	Benchmarking, Game Design	Code Available
Timage -- A Robust Time Series Classification Pipeline	Sep 19, 2019	Benchmarking, Classification	Code Available
AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection	Feb 6, 2024	Benchmarking	Code Available
EvoLearner: Learning Description Logics with Evolutionary Algorithms	Nov 8, 2021	Benchmarking, Evolutionary Algorithms	Code Available
Evidential Deep Learning for Uncertainty Quantification and Out-of-Distribution Detection in Jet Identification using Deep Neural Networks	Jan 10, 2025	Anomaly Detection, Benchmarking	Code Available
Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test Data	Aug 3, 2024	Benchmarking, Knowledge Graphs	Code Available


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified