SOTAVerified

Benchmarking

Papers

Showing 2221–2230 of 5548 papers

Title	Date	Tasks	Status	Hype
Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models	May 5, 2024	Benchmarking	Code Available	0
iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval	May 5, 2024	Benchmarking, Composed Image Retrieval (CoIR)	Code Available	2
Performance Evaluation of Real-Time Object Detection for Electric Scooters	May 5, 2024	Autonomous Vehicles, Benchmarking	Code Available	0
PhilHumans: Benchmarking Machine Learning for Personal Health	May 4, 2024	Action Anticipation, Benchmarking	Unverified	0
Position: Quo Vadis, Unsupervised Time Series Anomaly Detection?	May 4, 2024	Anomaly Detection, Benchmarking	Code Available	1
Systematic Review: Anomaly Detection in Connected and Autonomous Vehicles	May 4, 2024	Anomaly Detection, Articles	Unverified	0
A Normative Framework for Benchmarking Consumer Fairness in Large Language Model Recommender System	May 3, 2024	Benchmarking, Collaborative Filtering	Unverified	0
Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo	May 3, 2024	Benchmarking, Multi-hop Question Answering	Code Available	0
Toward end-to-end interpretable convolutional neural networks for waveform signals	May 3, 2024	Benchmarking, Emotion Recognition	Unverified	0
CityLearn v2: Energy-flexible, resilient, occupant-centric, and carbon-aware management of grid-interactive communities	May 2, 2024	Benchmarking, Management	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified