SOTAVerified

Benchmarking

Papers

Showing 4901-4925 of 5548 papers

Title | Status | Hype
Fast Benchmarking of Asynchronous Multi-Fidelity Optimization on Zero-Cost Benchmarks | Code | 0
Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection | Code | 0
Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates | Code | 0
Benchmarking Positional Encodings for GNNs and Graph Transformers | Code | 0
Fast and accurate alignment of long bisulfite-seq reads | Code | 0
Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions | Code | 0
False Promises in Medical Imaging AI? Assessing Validity of Outperformance Claims | Code | 0
Benchmarking Perturbation-based Saliency Maps for Explaining Atari Agents | Code | 0
Unsupervised Anomaly Detection in Multivariate Time Series across Heterogeneous Domains | Code | 0
Benchmarking person re-identification datasets and approaches for practical real-world implementations | Code | 0
FALCON: Feature-Label Constrained Graph Net Collapse for Memory Efficient GNNs | Code | 0
FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability | Code | 0
Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News | Code | 0
Benchmarking performance of object detection under image distortions in an uncontrolled environment | Code | 0
GUNNEL: Guided Mixup Augmentation and Multi-View Fusion for Aquatic Animal Segmentation | Code | 0
Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models | Code | 0
Segmenting France Across Four Centuries | Code | 0
Audio Explanation Synthesis with Generative Foundation Models | Code | 0
Benchmarking Tropical Cyclone Rapid Intensification with Satellite Images and Attention-based Deep Models | Code | 0
FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes | Code | 0
Can LLMs perform structured graph reasoning? | Code | 0
Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors | Code | 0
Exploring Model-based Planning with Policy Networks | Code | 0
Exploring Context Generalizability in Citywide Crowd Mobility Prediction: An Analytic Framework and Benchmark | Code | 0
Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test | Code | 0
Page 197 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified