Benchmarking

Papers

Showing 1731–1740 of 5548 papers

Title	Date	Tasks	Status	Hype
No Dataset Needed for Downstream Knowledge Benchmarking: Response Dispersion Inversely Correlates with Accuracy on Domain-specific QA	Aug 24, 2024	Benchmarking, Chatbot	Unverified	0
Variational Autoencoder for Anomaly Detection: A Comparative Study	Aug 24, 2024	Anomaly Detection, Benchmarking	Code Available	1
S3Simulator: A benchmarking Side Scan Sonar Simulator dataset for Underwater Image Analysis	Aug 23, 2024	Benchmarking	Code Available	0
Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection	Aug 23, 2024	Benchmarking, Binary Classification	Unverified	0
Open Llama2 Model for the Lithuanian Language	Aug 23, 2024	Benchmarking, model	Unverified	0
Benchmarking Counterfactual Interpretability in Deep Learning Models for Time Series Classification	Aug 22, 2024	Benchmarking, counterfactual	Unverified	0
Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets	Aug 22, 2024	All, Benchmarking	Code Available	1
MultiMed: Massively Multimodal and Multitask Medical Understanding	Aug 22, 2024	Benchmarking, Medical Question Answering	Unverified	0
Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis	Aug 22, 2024	Benchmarking	Unverified	0
Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures	Aug 22, 2024	Benchmarking, Trajectory Prediction	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified