SOTAVerified

Benchmarking

Papers

Showing 2176–2200 of 5548 papers

Title | Status | Hype
Benchmarking Procedural Language Understanding for Low-Resource Languages: A Case Study on Turkish | Code | 0
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective | Code | 0
Benchmarking Probabilistic Deep Learning Methods for License Plate Recognition | Code | 0
Benchmarking pre-trained text embedding models in aligning built asset information | Code | 0
Benchmarking Pre-trained Language Models for Multilingual NER: TraSpaS at the BSNLP2021 Shared Task | Code | 0
How Far Are We from Optimal Reasoning Efficiency? | Code | 0
Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey | Code | 0
A survey of probabilistic generative frameworks for molecular simulations | Code | 0
Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis | Code | 0
Benchmarking Positional Encodings for GNNs and Graph Transformers | Code | 0
Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection | Code | 0
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios | Code | 0
HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction | Code | 0
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification | Code | 0
Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions | Code | 0
Benchmarking Perturbation-based Saliency Maps for Explaining Atari Agents | Code | 0
Benchmarking person re-identification datasets and approaches for practical real-world implementations | Code | 0
Benchmarking performance of object detection under image distortions in an uncontrolled environment | Code | 0
High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets | Code | 0
High-Dynamic-Range Imaging for Cloud Segmentation | Code | 0
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus | Code | 0
Benchmarking Pathology Foundation Models: Adaptation Strategies and Scenarios | Code | 0
AstroVision: Towards Autonomous Feature Detection and Description for Missions to Small Bodies Using Deep Learning | Code | 0
Hi-EF: Benchmarking Emotion Forecasting in Human-interaction | Code | 0
Benchmarking Parameter Control Methods in Differential Evolution for Mixed-Integer Black-Box Optimization | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified