SOTAVerified

Benchmarking

Papers

Showing 3151–3160 of 5548 papers

Title	Date	Tasks	Status	Hype
NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition	May 13, 2024	Benchmarking, named-entity-recognition	Code Available	0
Comparative analysis of neural network architectures for short-term FOREX forecasting	May 13, 2024	Benchmarking	Unverified	0
UCCIX: Irish-eXcellence Large Language Model	May 13, 2024	Benchmarking, Language Modeling	Unverified	0
Divergent Creativity in Humans and Large Language Models	May 13, 2024	Benchmarking	Code Available	0
oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving	May 13, 2024	Attribute, Autonomous Driving	Unverified	0
Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness	May 13, 2024	Benchmarking, counterfactual	Unverified	0
Benchmarking Cross-Domain Audio-Visual Deception Detection	May 11, 2024	Benchmarking, Deception Detection	Unverified	0
Replication Study and Benchmarking of Real-Time Object Detection Models	May 11, 2024	Benchmarking, object-detection	Code Available	0
Automating Code Adaptation for MLOps -- A Benchmarking Study on LLMs	May 10, 2024	Benchmarking, Hyperparameter Optimization	Unverified	0
Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning	May 9, 2024	Benchmarking, Federated Learning	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified