SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3011–3020 of 5548 papers

Title	Date	Tasks	Status	Hype
Holistic Dynamic Frequency Transformer for Image Fusion and Exposure Correction	Sep 3, 2023	BenchmarkingExposure Correction	—Unverified	0
NeMig -- A Bilingual News Collection and Knowledge Graph about Migration	Sep 1, 2023	ArticlesBenchmarking	CodeCode Available	0
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning	Sep 1, 2023	BenchmarkingFederated Learning	—Unverified	0
Can humans help BERT gain "confidence"?	Aug 31, 2023	BenchmarkingEEG	—Unverified	0
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering	Aug 31, 2023	BenchmarkingDataset Generation	CodeCode Available	1
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO	Aug 30, 2023	BenchmarkingReinforcement Learning (RL)	—Unverified	0
Benchmarking Multilabel Topic Classification in the Kyrgyz Language	Aug 30, 2023	BenchmarkingClassification	CodeCode Available	0
Benchmarking the Generation of Fact Checking Explanations	Aug 29, 2023	Abstractive Text SummarizationArticles	CodeCode Available	1
Towards quantitative precision for ECG analysis: Leveraging state space models, self-supervision and patient metadata	Aug 29, 2023	BenchmarkingDiagnostic	CodeCode Available	1
Matbench Discovery -- A framework to evaluate machine learning crystal stability predictions	Aug 28, 2023	BenchmarkingFormation Energy	CodeCode Available	3

Show:10 25 50

← PrevPage 302 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified