SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2381–2390 of 5548 papers

Title	Date	Tasks	Status	Hype
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning	Mar 19, 2024	BenchmarkingImage Captioning	CodeCode Available	2
Real-IAD: A Real-World Multi-View Dataset for Benchmarking Versatile Industrial Anomaly Detection	Mar 19, 2024	Anomaly DetectionBenchmarking	CodeCode Available	3
MELTing point: Mobile Evaluation of Language Transformers	Mar 19, 2024	BenchmarkingQuantization	CodeCode Available	1
Benchmarking Badminton Action Recognition with a New Fine-Grained Dataset	Mar 19, 2024	Action RecognitionBenchmarking	—Unverified	0
AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework	Mar 19, 2024	BenchmarkingFinancial Analysis	CodeCode Available	3
ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems	Mar 19, 2024	Benchmarkingfeature selection	CodeCode Available	1
Embarrassingly Simple Scribble Supervision for 3D Medical Segmentation	Mar 19, 2024	BenchmarkingSegmentation	—Unverified	0
OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety	Mar 18, 2024	BenchmarkingMathematical Reasoning	—Unverified	0
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens	Mar 18, 2024	BenchmarkingQuestion Answering	CodeCode Available	1
Align and Distill: Unifying and Improving Domain Adaptive Object Detection	Mar 18, 2024	Benchmarkingobject-detection	CodeCode Available	1

Show:10 25 50

← PrevPage 239 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified