SOTAVerified

Benchmarking

Papers

Showing 2201-2250 of 5548 papers

Title | Status | Hype
AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias | Code | 0
Benchmarking optimality of time series classification methods in distinguishing diffusions | Code | 0
Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Code | 0
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding | Code | 0
IdeaBench: Benchmarking Large Language Models for Research Idea Generation | Code | 0
Identifying Money Laundering Subgraphs on the Blockchain | Code | 0
Hyperspectral Image Dataset for Benchmarking on Salient Object Detection | Code | 0
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Code | 0
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction | Code | 0
A Stepwise, Label-based Approach for Improving the Adversarial Training in Unsupervised Video Summarization | Code | 0
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance | Code | 0
Benchmarking of Query Strategies: Towards Future Deep Active Learning | Code | 0
Hybrid Random Features | Code | 0
Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn | Code | 0
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification | Code | 0
Benchmarking of LSTM Networks | Code | 0
A comparison of translation performance between DeepL and Supertext | Code | 0
Benchmarking of image registration methods for differently stained histological slides | Code | 0
HuSc3D: Human Sculpture dataset for 3D object reconstruction | Code | 0
HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity | Code | 0
Assigning Species Information to Corresponding Genes by a Sequence Labeling Framework | Code | 0
HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models | Code | 0
HRNET: AI on Edge for mask detection and social distancing | Code | 0
Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching | Code | 0
Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey | Code | 0
How Far Are We from Optimal Reasoning Efficiency? | Code | 0
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios | Code | 0
HOEG: A New Approach for Object-Centric Predictive Process Monitoring | Code | 0
3D fluorescence microscopy data synthesis for segmentation and benchmarking | Code | 0
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective | Code | 0
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus | Code | 0
High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets | Code | 0
BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs | Code | 0
High-Dynamic-Range Imaging for Cloud Segmentation | Code | 0
Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts | Code | 0
HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction | Code | 0
ASR Benchmarking: Need for a More Representative Conversational Dataset | Code | 0
Benchmarking Neural Machine Translation for Southern African Languages | Code | 0
Benchmarking neural embeddings for link prediction in knowledge graphs under semantic and structural changes | Code | 0
Heterogeneous Datasets for Federated Survival Analysis Simulation | Code | 0
Harnessing Orthogonality to Train Low-Rank Neural Networks | Code | 0
Harmonization Benchmarking Tool for Neuroimaging Datasets | Code | 0
Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks | Code | 0
Hardware Aware Neural Network Architectures using FbNet | Code | 0
HATE-ITA: New Baselines for Hate Speech Detection in Italian | Code | 0
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios | Code | 0
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces | Code | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0
Hard-Label Cryptanalytic Extraction of Neural Network Models | Code | 0
Hi-EF: Benchmarking Emotion Forecasting in Human-interaction | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified