SOTAVerified

Benchmarking

Papers

Showing 2381-2390 of 5548 papers

Title | Status | Hype
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning | Code | 2
Real-IAD: A Real-World Multi-View Dataset for Benchmarking Versatile Industrial Anomaly Detection | Code | 3
MELTing point: Mobile Evaluation of Language Transformers | Code | 1
Benchmarking Badminton Action Recognition with a New Fine-Grained Dataset | - | 0
AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework | Code | 3
ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems | Code | 1
Embarrassingly Simple Scribble Supervision for 3D Medical Segmentation | - | 0
OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety | - | 0
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens | Code | 1
Align and Distill: Unifying and Improving Domain Adaptive Object Detection | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified