Benchmarking

Papers

Title	Date	Tasks	Status
Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation	Aug 10, 2023	Attribute, Benchmarking	Unverified
Spintronics for image recognition: performance benchmarking via ultrafast data-driven simulations	Aug 10, 2023	Benchmarking, Classification	Unverified
Enhancing Architecture Frameworks by Including Modern Stakeholders and their Views/Viewpoints	Aug 9, 2023	Benchmarking	Unverified
Benchmarking LLM powered Chatbots: Methods and Metrics	Aug 8, 2023	Benchmarking, Chatbot	Unverified
RECipe: Does a Multi-Modal Recipe Knowledge Graph Fit a Multi-Purpose Recommendation System?	Aug 8, 2023	Benchmarking, Collaborative Filtering	Unverified
Microvasculature Segmentation in Human BioMolecular Atlas Program (HuBMAP)	Aug 6, 2023	Benchmarking, Image Segmentation	Unverified
Precise Benchmarking of Explainable AI Attribution Methods	Aug 6, 2023	Benchmarking, image-classification	Code Available
A Survey of Spanish Clinical Language Models	Aug 4, 2023	Benchmarking, Survey	Unverified
ChatGPT for GTFS: Benchmarking LLMs on GTFS Understanding and Retrieval	Aug 4, 2023	Benchmarking, Information Retrieval	Code Available
RobustMQ: Benchmarking Robustness of Quantized Models	Aug 4, 2023	Adversarial Robustness, Benchmarking	Unverified
Benchmarking Adaptative Variational Quantum Algorithms on QUBO Instances	Aug 3, 2023	Benchmarking	Unverified
Differential Privacy for Adaptive Weight Aggregation in Federated Tumor Segmentation	Aug 1, 2023	Benchmarking, Brain Tumor Segmentation	Unverified
Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks	Aug 1, 2023	Benchmarking	Unverified
CLAMS: A Cluster Ambiguity Measure for Estimating Perceptual Variability in Visual Clustering	Aug 1, 2023	Benchmarking, Clustering	Unverified
Benchmarking Ultra-High-Definition Image Reflection Removal	Aug 1, 2023	Benchmarking, Image Restoration	Code Available
Deep Learning and Computer Vision for Glaucoma Detection: A Review	Jul 31, 2023	Benchmarking, Deep Learning	Unverified
TMPNN: High-Order Polynomial Regression Based on Taylor Map Factorization	Jul 30, 2023	Benchmarking, Multi-target regression	Code Available
Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System	Jul 28, 2023	Anomaly Detection, Autonomous Driving	Code Available
Benchmarking Performance of Deep Learning Model for Material Segmentation on Two HPC Systems	Jul 27, 2023	Benchmarking, GPU	Unverified
Quantitative Metrics for Benchmarking Human-Aware Robot Navigation	Jul 26, 2023	Benchmarking, Robot Navigation	Code Available
Fluorescent Neuronal Cells v2: Multi-Task, Multi-Format Annotations for Deep Learning in Microscopy	Jul 26, 2023	Benchmarking, object-detection	Unverified
YOLOBench: Benchmarking Efficient Object Detectors on Embedded Systems	Jul 26, 2023	Benchmarking, CPU	Code Available
Towards an AI Accountability Policy	Jul 25, 2023	Benchmarking, Fairness	Unverified
Implementing and Benchmarking the Locally Competitive Algorithm on the Loihi 2 Neuromorphic Processor	Jul 25, 2023	Benchmarking, CPU	Unverified
Towards Long-Term predictions of Turbulence using Neural Operators	Jul 25, 2023	Benchmarking	Unverified
Benchmarking and Analyzing Generative Data for Visual Recognition	Jul 25, 2023	Benchmarking, Retrieval	Unverified
When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review	Jul 25, 2023	Benchmarking, Multi-Task Learning	Code Available
UPREVE: An End-to-End Causal Discovery Benchmarking System	Jul 25, 2023	Benchmarking, Causal Discovery	Unverified
The Impact of Genomic Variation on Function (IGVF) Consortium	Jul 24, 2023	Benchmarking	Unverified
Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods	Jul 21, 2023	Benchmarking	Code Available
The Extractive-Abstractive Axis: Measuring Content "Borrowing" in Generative Language Models	Jul 20, 2023	Benchmarking	Unverified
Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients	Jul 17, 2023	Benchmarking, GPU	Code Available
Benchmarking fixed-length Fingerprint Representations across different Embedding Sizes and Sensor Types	Jul 17, 2023	Benchmarking	Unverified
Approaches for benchmarking single-cell gene regulatory network inference methods	Jul 17, 2023	Benchmarking	Unverified
On the Real-Time Semantic Segmentation of Aphid Clusters in the Wild	Jul 17, 2023	Benchmarking, Real-Time Semantic Segmentation	Unverified
Machine Learning for Ranking f-wave Extraction Methods in Single-Lead ECGs	Jul 17, 2023	Benchmarking	Unverified
Revisiting Implicit Models: Sparsity Trade-offs Capability in Weight-tied Model for Vision Tasks	Jul 16, 2023	Benchmarking	Unverified
Benchmarking the Effectiveness of Classification Algorithms and SVM Kernels for Dry Beans	Jul 15, 2023	Benchmarking, Dimensionality Reduction	Unverified
Joint Batching and Scheduling for High-Throughput Multiuser Edge AI with Asynchronous Task Arrivals	Jul 15, 2023	Benchmarking, Scheduling	Unverified
Benchmarking Explanatory Models for Inertia Forecasting using Public Data of the Nordic Area	Jul 14, 2023	Benchmarking, Time Series	Unverified
Challenge Results Are Not Reproducible	Jul 14, 2023	Benchmarking, Image Segmentation	Unverified
Pathway: a fast and flexible unified stream data processing framework for analytical and Machine Learning applications	Jul 12, 2023	Benchmarking	Unverified
Deep Generative Models for Physiological Signals: A Systematic Literature Review	Jul 12, 2023	Benchmarking, EEG	Unverified
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation	Jul 11, 2023	Benchmarking, Causal Discovery	Unverified
Temporal Graphs Anomaly Emergence Detection: Benchmarking For Social Media Interactions	Jul 11, 2023	Anomaly Detection, Benchmarking	Unverified
Assessing the efficacy of large language models in generating accurate teacher responses	Jul 9, 2023	Benchmarking, In-Context Learning	Unverified
Fast Empirical Scenarios	Jul 8, 2023	Benchmarking, Decision Making	Unverified
Fairness-Aware Graph Neural Networks: A Survey	Jul 8, 2023	Benchmarking, Fairness	Unverified
Performance Modeling of Data Storage Systems using Generative Models	Jul 5, 2023	Benchmarking	Code Available
Structural Property Prediction	Jul 5, 2023	Benchmarking, Prediction	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified