SOTAVerified

Benchmarking

Papers

Showing 2101-2150 of 5548 papers

Title | Status | Hype
ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning | Code | 0
Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties | Code | 0
Immunofluorescence Capillary Imaging Segmentation: Cases Study | Code | 0
Benchmarking Suite for Synthetic Aperture Radar Imagery Anomaly Detection (SARIAD) Algorithms | Code | 0
Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization | Code | 0
Benchmarking Subset Selection from Large Candidate Solution Sets in Evolutionary Multi-objective Optimization | Code | 0
A*3D Dataset: Towards Autonomous Driving in Challenging Environments | Code | 0
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions | Code | 0
Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Code | 0
Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance | Code | 0
Identifying Money Laundering Subgraphs on the Blockchain | Code | 0
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible | Code | 0
Benchmarking Spurious Bias in Few-Shot Image Classifiers | Code | 0
Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Code | 0
A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction | Code | 0
Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges | Code | 0
Audio Explanation Synthesis with Generative Foundation Models | Code | 0
IdeaBench: Benchmarking Large Language Models for Research Idea Generation | Code | 0
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Code | 0
Hyperspectral Image Dataset for Benchmarking on Salient Object Detection | Code | 0
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification | Code | 0
Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn | Code | 0
Benchmarking Single Image Dehazing and Beyond | Code | 0
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction | Code | 0
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance | Code | 0
Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models | Code | 0
HuSc3D: Human Sculpture dataset for 3D object reconstruction | Code | 0
Benchmarking sentiment analysis methods for large-scale texts: A case for using continuum-scored words and word shift graphs | Code | 0
Benchmarking Tropical Cyclone Rapid Intensification with Satellite Images and Attention-based Deep Models | Code | 0
Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching | Code | 0
Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors | Code | 0
Benchmarking Self-Supervised Learning Methods for Accelerated MRI Reconstruction | Code | 0
AttackSeqBench: Benchmarking Large Language Models' Understanding of Sequential Patterns in Cyber Attacks | Code | 0
Benchmarking Self-Supervised Contrastive Learning Methods for Image-Based Plant Phenotyping | Code | 0
Hybrid Random Features | Code | 0
HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models | Code | 0
Benchmarking Scalable Methods for Streaming Cross Document Entity Coreference | Code | 0
Benchmarking Scalable Epistemic Uncertainty Quantification in Organ Segmentation | Code | 0
Benchmarking Safety Monitors for Image Classifiers with Machine Learning | Code | 0
AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection | Code | 0
IHCV: Discovery of Hidden Time-Dependent Control Variables in Non-Linear Dynamical Systems | Code | 0
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective | Code | 0
A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems | Code | 0
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks | Code | 0
Benchmarking Robustness to Text-Guided Corruptions | Code | 0
AKFruitYield: Modular benchmarking and video analysis software for Azure Kinect cameras for fruit size and fruit yield estimation in apple orchards | Code | 0
Natural Image Noise Dataset | Code | 0
Benchmarking Robustness of Endoscopic Depth Estimation with Synthetically Corrupted Data | Code | 0
A Kernel-Based Approach for Accurate Steady-State Detection in Performance Time Series | Code | 0
HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified