Benchmarking

Papers

Showing 921–930 of 5548 papers

Title	Date	Tasks	Status	Hype
Evaluation of large language models for discovery of gene set function	Sep 7, 2023	Benchmarking, Language Modelling	Code Available	1
A skeletonization algorithm for gradient-based optimization	Sep 5, 2023	Benchmarking, Deep Learning	Code Available	1
Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation	Sep 4, 2023	Benchmarking	Code Available	1
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering	Aug 31, 2023	Benchmarking, Dataset Generation	Code Available	1
Benchmarking the Generation of Fact Checking Explanations	Aug 29, 2023	Abstractive Text Summarization, Articles	Code Available	1
Towards quantitative precision for ECG analysis: Leveraging state space models, self-supervision and patient metadata	Aug 29, 2023	Benchmarking, Diagnostic	Code Available	1
MLLM-DataEngine: An Iterative Refinement Approach for MLLM	Aug 25, 2023	Benchmarking	Code Available	1
LLMRec: Benchmarking Large Language Models on Recommendation Task	Aug 23, 2023	Benchmarking, Explanation Generation	Code Available	1
VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations	Aug 19, 2023	6D Pose Estimation using RGB, Benchmarking	Code Available	1
Benchmarking Neural Network Generalization for Grammar Induction	Aug 16, 2023	Benchmarking	Code Available	1


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified