SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2881–2890 of 5548 papers

Title	Date	Tasks	Status	Hype
Soft-Hard Attention U-Net Model and Benchmark Dataset for Multiscale Image Shadow Removal	Aug 7, 2024	BenchmarkingHard Attention	—Unverified	0
Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions	Aug 7, 2024	Anomaly DetectionBenchmarking	—Unverified	0
Benchmarking In-the-wild Multimodal Disease Recognition and A Versatile Baseline	Aug 6, 2024	Benchmarking	—Unverified	0
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future	Aug 5, 2024	BenchmarkingCode Generation	—Unverified	0
LMEMs for post-hoc analysis of HPO Benchmarking	Aug 5, 2024	BenchmarkingHyperparameter Optimization	CodeCode Available	0
MaterioMiner -- An ontology-based text mining dataset for extraction of process-structure-property entities	Aug 5, 2024	BenchmarkingGraph Generation	—Unverified	0
SPINEX-TimeSeries: Similarity-based Predictions with Explainable Neighbors Exploration for Time Series and Forecasting Problems	Aug 4, 2024	BenchmarkingComputational Efficiency	—Unverified	0
User-in-the-loop Evaluation of Multimodal LLMs for Activity Assistance	Aug 4, 2024	Action AnticipationBenchmarking	—Unverified	0
Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations	Aug 3, 2024	BenchmarkingDeep Reinforcement Learning	—Unverified	0
Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test Data	Aug 3, 2024	BenchmarkingKnowledge Graphs	CodeCode Available	0

Show:10 25 50

← PrevPage 289 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified