Benchmarking

Papers

Showing 4961–4970 of 5548 papers

Title	Date	Tasks	Status	Hype
A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems	Jun 25, 2024	Benchmarking, Collaborative Filtering	Code Available	0
SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification	May 23, 2025	Benchmarking, Classification	Code Available	0
Separating form and meaning: Using self-consistency to quantify task understanding across multiple senses	May 19, 2023	Benchmarking, Form	Code Available	0
Unsupervised Novelty Detection Methods Benchmarking with Wavelet Decomposition	Sep 11, 2024	Benchmarking, Novelty Detection	Code Available	0
Evaluating Shallow and Deep Neural Networks for Network Intrusion Detection Systems in Cyber Security	Oct 8, 2018	Benchmarking, BIG-bench Machine Learning	Code Available	0
Transparent and Scrutable Recommendations Using Natural Language User Profiles	Feb 8, 2024	Benchmarking, Descriptive	Code Available	0
SenseShift6D: Multimodal RGB-D Benchmarking for Robust 6D Pose Estimation across Environment and Sensor Variations	Jul 8, 2025	6D Pose Estimation, 6D Pose Estimation using RGB	Code Available	0
SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing	Oct 14, 2024	Benchmarking, Management	Code Available	0
A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction	Nov 8, 2023	Benchmarking, Click-Through Rate Prediction	Code Available	0
Navigating Out-of-Distribution Electricity Load Forecasting during COVID-19: Benchmarking energy load forecasting models without and with continual learning	Sep 8, 2023	Benchmarking, Continual Learning	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified