Benchmarking

Papers

Showing 3521–3530 of 5548 papers

Title	Date	Tasks	Status	Hype
Model Agnostic Explainable Selective Regression via Uncertainty Estimation	Nov 15, 2023	Benchmarking, model	Unverified	0
Benchmarking Individual Tree Mapping with Sub-meter Imagery	Nov 14, 2023	Benchmarking, Segmentation	Unverified	0
On Using Distribution-Based Compositionality Assessment to Evaluate Compositional Generalisation in Machine Translation	Nov 14, 2023	Benchmarking, Machine Translation	Code Available	0
The Disagreement Problem in Faithfulness Metrics	Nov 13, 2023	Benchmarking, Explainable artificial intelligence	Unverified	0
Uncertainty estimation of machine learning spatial precipitation predictions from satellite data	Nov 13, 2023	Benchmarking, Feature Importance	Unverified	0
MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks	Nov 13, 2023	Benchmarking	Unverified	0
Connecting the Dots: Graph Neural Network Powered Ensemble and Classification of Medical Images	Nov 13, 2023	Benchmarking, Classification	Code Available	0
Identification of vortex in unstructured mesh with graph neural networks	Nov 11, 2023	Benchmarking, Graph Generation	Unverified	0
SeaTurtleID2022: A long-span dataset for reliable sea turtle re-identification	Nov 9, 2023	Benchmarking, Instance Segmentation	Unverified	0
Prompt Sketching for Large Language Models	Nov 8, 2023	Arithmetic Reasoning, Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified