Benchmarking

Papers

Showing 1651–1675 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking VLMs' Reasoning About Persuasive Atypical Images	Sep 16, 2024	Benchmarking, Object Recognition	Unverified	0
Benchmarking Large Language Model Uncertainty for Prompt Optimization	Sep 16, 2024	Benchmarking, Diversity	Code Available	0
Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data	Sep 15, 2024	Benchmarking, text annotation	Unverified	0
Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering	Sep 13, 2024	Benchmarking, Binary Classification	Unverified	0
LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study	Sep 13, 2024	Benchmarking, Grapheme-to-Phoneme Conversion	Unverified	0
Text-To-Speech Synthesis In The Wild	Sep 13, 2024	Benchmarking, Speaker Recognition	Unverified	0
ODAQ: Open Dataset of Audio Quality - Benchmark on GitHub	Sep 13, 2024	Audio Quality Assessment, Benchmarking	Code Available	1
Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning	Sep 12, 2024	Benchmarking, Fairness	Unverified	0
Linear energy storage and flexibility model with ramp rate, ramping, deadline and capacity constraints	Sep 12, 2024	Benchmarking	Code Available	0
Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG	Sep 12, 2024	Benchmarking, Question Answering	Unverified	0
Online vs Offline: A Comparative Study of First-Party and Third-Party Evaluations of Social Chatbots	Sep 12, 2024	Benchmarking, Chatbot	Unverified	0
Efficient Sparse Coding with the Adaptive Locally Competitive Algorithm for Speech Classification	Sep 12, 2024	Benchmarking, Classification	Unverified	0
The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine	Sep 12, 2024	Autonomous Driving, Benchmarking	Unverified	0
Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification models -- Part I	Sep 12, 2024	Benchmarking, CPU	Code Available	0
The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal	Sep 12, 2024	Benchmarking, Language Modeling	Unverified	0
Benchmarking and Validation of Sub-mW 30GHz VG-LNAs in 22nm FDSOI CMOS for 5G/6G Phased-Array Receivers	Sep 11, 2024	Benchmarking	Unverified	0
Understanding Foundation Models: Are We Back in 1924?	Sep 11, 2024	Benchmarking	Unverified	0
Unsupervised Novelty Detection Methods Benchmarking with Wavelet Decomposition	Sep 11, 2024	Benchmarking, Novelty Detection	Code Available	0
Benchmarking 2D Egocentric Hand Pose Datasets	Sep 11, 2024	Activity Recognition, Benchmarking	Unverified	0
VoiceWukong: Benchmarking Deepfake Voice Detection	Sep 10, 2024	Benchmarking, Face Swapping	Unverified	0
Mahalanobis k-NN: A Statistical Lens for Robust Point-Cloud Registrations	Sep 10, 2024	Benchmarking, Point Cloud Registration	Code Available	0
Benchmarking Sub-Genre Classification For Mainstage Dance Music	Sep 10, 2024	Benchmarking, Classification	Unverified	0
MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding	Sep 10, 2024	Benchmarking, Language Modeling	Code Available	0
Ransomware Detection Using Machine Learning in the Linux Kernel	Sep 10, 2024	Benchmarking	Unverified	0
Selecting Differential Splicing Methods: Practical Considerations	Sep 9, 2024	Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified