Benchmarking

Papers

Showing 2726–2750 of 5548 papers

Title	Date	Tasks	Status	Hype
Experimental Analysis of Large-scale Learnable Vector Storage Compression	Nov 27, 2023	Benchmarking	Code Available	0
Lightly Weighted Automatic Audio Parameter Extraction for the Quality Assessment of Consensus Auditory-Perceptual Evaluation of Voice	Nov 27, 2023	Benchmarking	Unverified	0
Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis	Nov 27, 2023	Benchmarking, Diagnostic	Unverified	0
Benchmarking Large Language Model Volatility	Nov 26, 2023	Benchmarking, Decision Making	Unverified	0
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation	Nov 26, 2023	Benchmarking, Hallucination	Code Available	1
ASI: Accuracy-Stability Index for Evaluating Deep Learning Models	Nov 26, 2023	Benchmarking, Deep Learning	Unverified	0
An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification	Nov 24, 2023	Benchmarking, image-classification	Unverified	0
Benchmarking Robustness of Text-Image Composed Retrieval	Nov 24, 2023	Attribute, Benchmarking	Code Available	1
Large Language Models as Automated Aligners for benchmarking Vision-Language Models	Nov 24, 2023	Benchmarking, World Knowledge	Unverified	0
Dialogue Quality and Emotion Annotations for Customer Support Conversations	Nov 23, 2023	Benchmarking, Diversity	Code Available	0
Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI	Nov 23, 2023	Benchmarking, Cloud Detection	Code Available	0
Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN)	Nov 23, 2023	Benchmarking, Brain Tumor Segmentation	Unverified	0
Learning Dynamic Selection and Pricing of Out-of-Home Deliveries	Nov 23, 2023	Benchmarking, Decision Making	Code Available	0
Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning	Nov 22, 2023	Benchmarking, Drug Discovery	Unverified	0
PG-Video-LLaVA: Pixel Grounding Large Video-Language Models	Nov 22, 2023	Benchmarking, Phrase Grounding	Code Available	2
A projected nonlinear state-space model for forecasting time series signals	Nov 22, 2023	Benchmarking, Computational Efficiency	Code Available	0
Deep State-Space Model for Predicting Cryptocurrency Price	Nov 21, 2023	Benchmarking, Uncertainty Quantification	Unverified	0
IMGTB: A Framework for Machine-Generated Text Detection Benchmarking	Nov 21, 2023	Benchmarking, Text Detection	Code Available	1
Benchmarking bias: Expanding clinical AI model card to incorporate bias reporting of social and non-social factors	Nov 21, 2023	Benchmarking	Unverified	0
BEND: Benchmarking DNA Language Models on biologically meaningful tasks	Nov 21, 2023	Benchmarking, Language Modeling	Code Available	1
Towards a more inductive world for drug repurposing approaches	Nov 21, 2023	Benchmarking, Prediction	Code Available	1
Demonstrating Almost Linear Time Complexity of Bus Admittance Matrix-Based Distribution Network Power Flow: An Empirical Approach	Nov 20, 2023	Benchmarking	Unverified	0
LogLead -- Fast and Integrated Log Loader, Enhancer, and Anomaly Detector	Nov 20, 2023	Anomaly Detection, Benchmarking	Code Available	1
Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning	Nov 20, 2023	Benchmarking, Inverse Rendering	Unverified	0
Segment Together: A Versatile Paradigm for Semi-Supervised Medical Image Segmentation	Nov 20, 2023	Benchmarking, Image Segmentation	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified