Benchmarking

Papers

Title	Date	Tasks	Status	Hype
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning	Jul 21, 2023	Benchmarking, Combinatorial Optimization	Code Available	1
Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory	Jul 20, 2023	Benchmarking, Decision Making	Code Available	1
The Extractive-Abstractive Axis: Measuring Content "Borrowing" in Generative Language Models	Jul 20, 2023	Benchmarking	Unverified	0
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models	Jul 20, 2023	Benchmarking, Language Modeling	Code Available	1
Benchmarking Potential Based Rewards for Learning Humanoid Locomotion	Jul 19, 2023	Benchmarking, Reinforcement Learning (RL)	Code Available	2
On the Real-Time Semantic Segmentation of Aphid Clusters in the Wild	Jul 17, 2023	Benchmarking, Real-Time Semantic Segmentation	Unverified	0
Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding	Jul 17, 2023	Benchmarking, Deep Learning	Code Available	1
Examining the Effects of Degree Distribution and Homophily in Graph Learning Models	Jul 17, 2023	Benchmarking, Graph Clustering	Code Available	1
Towards Heterogeneous Long-tailed Learning: Benchmarking, Metrics, and Toolbox	Jul 17, 2023	Benchmarking	Code Available	1
Approaches for benchmarking single-cell gene regulatory network inference methods	Jul 17, 2023	Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified