SOTAVerified

Benchmarking

Papers

Showing 3626-3650 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis | | 0 |
| Parsing Any Domain English text to CoNLL dependencies | | 0 |
| Participatory Personalization in Classification | | 0 |
| 'Part'ly first among equals: Semantic part-based benchmarking for state-of-the-art object recognition systems | | 0 |
| PASTA: A Dataset for Modeling Participant States in Narratives | | 0 |
| PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database | | 0 |
| PathBench: A Benchmarking Platform for Classical and Learned Path Planning Algorithms | | 0 |
| PathBench: A comprehensive comparison benchmark for pathology foundation models towards precision oncology | | 0 |
| Patherea: Cell Detection and Classification for the 2020s | | 0 |
| Pathway: a fast and flexible unified stream data processing framework for analytical and Machine Learning applications | | 0 |
| Patterns of Convergence and Bound Constraint Violation in Differential Evolution on SBOX-COST Benchmarking Suite | | 0 |
| PawPrint: Whose Footprints Are These? Identifying Animal Individuals by Their Footprints | | 0 |
| Perception Test 2023: A Summary of the First Challenge And Outcome | | 0 |
| Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark | | 0 |
| Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training | | 0 |
| Performance Benchmarking of Psychomotor Skills Using Wearable Devices: An Application in Sport | | 0 |
| Performance Comparison of Surrogate-Assisted Evolutionary Algorithms on Computational Fluid Dynamics Problems | | 0 |
| Performance Evaluation Methodology for Long-Term Visual Object Tracking | | 0 |
| Performance Evaluation of Transcriptomics Data Normalization for Survival Risk Prediction | | 0 |
| Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale | | 0 |
| Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As | | 0 |
| Performance prediction of data streams on high-performance architecture | | 0 |
| Periocular Recognition in the Wild with Orthogonal Combination of Local Binary Coded Pattern in Dual-stream Convolutional Neural Network | | 0 |
| PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language | | 0 |
| WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |