Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3351–3375 of 5548 papers

Title	Date	Tasks	Status
BdSLW60: A Word-Level Bangla Sign Language Dataset	Feb 13, 2024	BenchmarkingGesture Recognition	CodeCode Available
Impact of spatial transformations on landscape features of CEC2022 basic benchmark problems	Feb 12, 2024	Benchmarking	—Unverified
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT	Feb 12, 2024	BenchmarkingChunking	—Unverified
EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages	Feb 12, 2024	Automated Theorem ProvingBenchmarking	—Unverified
Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A Benchmarking Study	Feb 11, 2024	Anomaly DetectionBenchmarking	CodeCode Available
Estimating the Effect of Crosstalk Error on Circuit Fidelity Using Noisy Intermediate-Scale Quantum Devices	Feb 10, 2024	Benchmarking	—Unverified
ProtIR: Iterative Refinement between Retrievers and Predictors for Protein Function Annotation	Feb 10, 2024	BenchmarkingLanguage Modeling	—Unverified
Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation	Feb 9, 2024	6D Pose Estimation using RGBBenchmarking	—Unverified
A Functional Analysis Approach to Symbolic Regression	Feb 9, 2024	Benchmarkingregression	—Unverified
LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education	Feb 9, 2024	BenchmarkingChatbot	—Unverified
Efficient Expression Neutrality Estimation with Application to Face Recognition Utility Prediction	Feb 8, 2024	BenchmarkingFace Image Quality	—Unverified
Transparent and Scrutable Recommendations Using Natural Language User Profiles	Feb 8, 2024	BenchmarkingDescriptive	CodeCode Available
Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset	Feb 8, 2024	Benchmarking	CodeCode Available
Towards Biologically Plausible and Private Gene Expression Data Generation	Feb 7, 2024	Benchmarking	CodeCode Available
BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception	Feb 7, 2024	Benchmarking	CodeCode Available
AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection	Feb 6, 2024	Benchmarking	CodeCode Available
Are Machines Better at Complex Reasoning? Unveiling Human-Machine Inference Gaps in Entailment Verification	Feb 6, 2024	BenchmarkingMultiple-choice	—Unverified
Quantitative Metrics for Benchmarking Medical Image Harmonization	Feb 6, 2024	AnatomyBenchmarking	—Unverified
PowerGraph: A power grid benchmark dataset for graph neural networks	Feb 5, 2024	ArticlesBenchmarking	—Unverified
Architecture Analysis and Benchmarking of 3D U-shaped Deep Learning Models for Thoracic Anatomical Segmentation	Feb 5, 2024	BenchmarkingImage Segmentation	CodeCode Available
Vi(E)va LLM! A Conceptual Stack for Evaluating and Interpreting Generative AI-based Visualizations	Feb 3, 2024	Benchmarking	CodeCode Available
Probing Critical Learning Dynamics of PLMs for Hate Speech Detection	Feb 3, 2024	BenchmarkingHate Speech Detection	CodeCode Available
Can LLMs perform structured graph reasoning?	Feb 2, 2024	BenchmarkingNavigate	CodeCode Available
Variational Quantum Circuits Enhanced Generative Adversarial Network	Feb 2, 2024	BenchmarkingGenerative Adversarial Network	—Unverified
Benchmarking Spiking Neural Network Learning Methods with Varying Locality	Feb 1, 2024	Benchmarking	—Unverified

Show:10 25 50

← PrevPage 135 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified