SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3701–3710 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation	Aug 10, 2023	AttributeBenchmarking	—Unverified	0
Spintronics for image recognition: performance benchmarking via ultrafast data-driven simulations	Aug 10, 2023	BenchmarkingClassification	—Unverified	0
Enhancing Architecture Frameworks by Including Modern Stakeholders and their Views/Viewpoints	Aug 9, 2023	Benchmarking	—Unverified	0
Benchmarking LLM powered Chatbots: Methods and Metrics	Aug 8, 2023	BenchmarkingChatbot	—Unverified	0
RECipe: Does a Multi-Modal Recipe Knowledge Graph Fit a Multi-Purpose Recommendation System?	Aug 8, 2023	BenchmarkingCollaborative Filtering	—Unverified	0
Microvasculature Segmentation in Human BioMolecular Atlas Program (HuBMAP)	Aug 6, 2023	BenchmarkingImage Segmentation	—Unverified	0
Precise Benchmarking of Explainable AI Attribution Methods	Aug 6, 2023	Benchmarkingimage-classification	CodeCode Available	0
A Survey of Spanish Clinical Language Models	Aug 4, 2023	BenchmarkingSurvey	—Unverified	0
ChatGPT for GTFS: Benchmarking LLMs on GTFS Understanding and Retrieval	Aug 4, 2023	BenchmarkingInformation Retrieval	CodeCode Available	0
RobustMQ: Benchmarking Robustness of Quantized Models	Aug 4, 2023	Adversarial RobustnessBenchmarking	—Unverified	0

Show:10 25 50

← PrevPage 371 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified