SOTAVerified

Benchmarking

Papers

Showing 2101-2125 of 5548 papers

Title | Status | Hype
Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn | Code | 0
Hybrid Random Features | Code | 0
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance | Code | 0
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction | Code | 0
Benchmarking Suite for Synthetic Aperture Radar Imagery Anomaly Detection (SARIAD) Algorithms | Code | 0
HuSc3D: Human Sculpture dataset for 3D object reconstruction | Code | 0
Benchmarking Subset Selection from Large Candidate Solution Sets in Evolutionary Multi-objective Optimization | Code | 0
A*3D Dataset: Towards Autonomous Driving in Challenging Environments | Code | 0
Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching | Code | 0
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible | Code | 0
HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction | Code | 0
Benchmarking Spurious Bias in Few-Shot Image Classifiers | Code | 0
HRNET: AI on Edge for mask detection and social distancing | Code | 0
A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction | Code | 0
Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges | Code | 0
Audio Explanation Synthesis with Generative Foundation Models | Code | 0
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective | Code | 0
HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity | Code | 0
How Far Are We from Optimal Reasoning Efficiency? | Code | 0
Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey | Code | 0
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios | Code | 0
Benchmarking Single Image Dehazing and Beyond | Code | 0
HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models | Code | 0
Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models | Code | 0
Benchmarking sentiment analysis methods for large-scale texts: A case for using continuum-scored words and word shift graphs | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified