SOTAVerified

Benchmarking

Papers

Showing 3751–3800 of 5548 papers

Title | Status | Hype
Unsupervised Spectral Demosaicing with Lightweight Spectral Attention Networks | - | 0
OpenSiteRec: An Open Dataset for Site Recommendation | - | 0
A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms | - | 0
Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity | - | 0
SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency | - | 0
InstructEval: Systematic Evaluation of Instruction Selection Methods | - | 0
Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors | - | 0
Learning Environment Models with Continuous Stochastic Dynamics | - | 0
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms | - | 0
Benchmarking Large Language Model Capabilities for Conditional Generation | - | 0
Emotion Analysis of Tweets Banning Education in Afghanistan | - | 0
Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity | - | 0
Effective Transfer of Pretrained Large Visual Model for Fabric Defect Segmentation via Specifc Knowledge Injection | - | 0
Benchmarking Stroke Forecasting with Stroke-Level Badminton Dataset | - | 0
Enhancing Navigation Benchmarking and Perception Data Generation for Row-based Crops in Simulation | - | 0
Pulse Shape-Aided Multipath Delay Estimation for Fine-Grained WiFi Sensing | - | 0
Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool | - | 0
Hybrid Precoder and Combiner Designs for Decentralized Parameter Estimation in mmWave MIMO Wireless Sensor Networks | - | 0
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards | - | 0
My Boli: Code-mixed Marathi-English Corpora, Pretrained Language Models and Evaluation Benchmarks | - | 0
OptIForest: Optimal Isolation Forest for Anomaly Detection | Code | 0
On Evaluation of Document Classification using RVL-CDIP | - | 0
Evaluation of Popular XAI Applied to Clinical Prediction Models: Can They be Trusted? | - | 0
A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking | - | 0
On-orbit model training for satellite imagery with label proportions | Code | 0
Diverse Community Data for Benchmarking Data Privacy Algorithms | - | 0
Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction | Code | 0
Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management | - | 0
Fairness Index Measures to Evaluate Bias in Biometric Recognition | - | 0
Using Motif Transitions for Temporal Graph Generation | Code | 0
Formal Covariate Benchmarking to Bound Omitted Variable Bias | - | 0
MA-BBOB: Many-Affine Combinations of BBOB Functions for Evaluating AutoML Approaches in Noiseless Numerical Black-Box Optimization Contexts | - | 0
Benchmarking Deep Learning Architectures for Urban Vegetation Point Cloud Semantic Segmentation from MLS | - | 0
Framework and Benchmarks for Combinatorial and Mixed-variable Bayesian Optimization | - | 0
ALP: Action-Aware Embodied Learning for Perception | - | 0
Acoustic Identification of Ae. aegypti Mosquitoes using Smartphone Apps and Residual Convolutional Neural Networks | Code | 0
Convolutional and Deep Learning based techniques for Time Series Ordinal Classification | - | 0
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion | - | 0
One Law, Many Languages: Benchmarking Multilingual Legal Reasoning for Judicial Support | Code | 0
Large-Scale Quantum Separability Through a Reproducible Machine Learning Lens | - | 0
DISC: a Dataset for Integrated Sensing and Communication in mmWave Systems | - | 0
DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning | - | 0
BED: Bi-Encoder-Based Detectors for Out-of-Distribution Detection | Code | 0
Re-Benchmarking Pool-Based Active Learning for Binary Classification | Code | 0
RRSIS: Referring Remote Sensing Image Segmentation | - | 0
MUBen: Benchmarking the Uncertainty of Molecular Representation Models | Code | 0
A Cloud-based Machine Learning Pipeline for the Efficient Extraction of Insights from Customer Reviews | - | 0
detrex: Benchmarking Detection Transformers | - | 0
Contribution à l'Optimisation d'un Comportement Collectif pour un Groupe de Robots Autonomes | - | 0
A Large-Scale Analysis on Self-Supervised Video Representation Learning | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified