SOTAVerified

Benchmarking

Papers

Showing 926–950 of 5548 papers

Title | Status | Hype
Benchmarking deep inverse models over time, and the neural-adjoint method | Code | 1
A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification | Code | 1
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Code | 1
AnomalyHop: An SSL-based Image Anomaly Localization Method | Code | 1
Evaluating Multimodal Representations on Visual Semantic Textual Similarity | Code | 1
Evaluation of large language models for discovery of gene set function | Code | 1
Benchmarking Natural Language Understanding Services for building Conversational Agents | Code | 1
Evaluating Adversarial Attacks on ImageNet: A Reality Check on Misclassification Classes | Code | 1
Benchmarking Deep Learning Interpretability in Time Series Predictions | Code | 1
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit | Code | 1
Guardians of Image Quality: Benchmarking Defenses Against Adversarial Attacks on Image Quality Metrics | Code | 1
An Open-source Benchmark of Deep Learning Models for Audio-visual Apparent and Self-reported Personality Recognition | Code | 1
Benchmarking Deep Models for Salient Object Detection | Code | 1
Benchmarking Multi-Scene Fire and Smoke Detection | Code | 1
Evaluating Attribution for Graph Neural Networks | Code | 1
Benchmarking Deep Reinforcement Learning for Navigation in Denied Sensor Environments | Code | 1
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Code | 1
Benchmarking Neural Network Generalization for Grammar Induction | Code | 1
Data-Driven Denoising of Stationary Accelerometer Signals | Code | 1
Curious Hierarchical Actor-Critic Reinforcement Learning | Code | 1
Benchmarking emergency department triage prediction models with machine learning and large public electronic health records | Code | 1
Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models | Code | 1
Benchmarking Detection Transfer Learning with Vision Transformers | Code | 1
3DYoga90: A Hierarchical Video Dataset for Yoga Pose Understanding | Code | 1
Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified