SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3541–3550 of 5548 papers

Title	Date	Tasks	Status	Hype
An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad Prediction	Nov 3, 2023	BenchmarkingSentence	—Unverified	0
Investigating Deep-Learning NLP for Automating the Extraction of Oncology Efficacy Endpoints from Scientific Literature	Nov 3, 2023	Benchmarking	—Unverified	0
Use of Deep Neural Networks for Uncertain Stress Functions with Extensions to Impact Mechanics	Nov 3, 2023	Benchmarkingquantile regression	—Unverified	0
Replicable Benchmarking of Neural Machine Translation (NMT) on Low-Resource Local Languages in Indonesia	Nov 2, 2023	BenchmarkingMachine Translation	CodeCode Available	0
Decentralized Federated Learning on the Edge over Wireless Mesh Networks	Nov 2, 2023	BenchmarkingFederated Learning	—Unverified	0
Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs	Nov 1, 2023	BenchmarkingQuestion Answering	—Unverified	0
SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization	Nov 1, 2023	Benchmarkingreinforcement-learning	—Unverified	0
A Two-Step Framework for Multi-Material Decomposition of Dual Energy Computed Tomography from Projection Domain	Oct 31, 2023	BenchmarkingDiagnostic	—Unverified	0
Next-generation MRD assays: do we have the tools to evaluate them properly?	Oct 31, 2023	BenchmarkingSensitivity	—Unverified	0
UAV Immersive Video Streaming: A Comprehensive Survey, Benchmarking, and Open Challenges	Oct 31, 2023	Benchmarking	—Unverified	0

Show:10 25 50

← PrevPage 355 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified