SOTAVerified

Benchmarking

Papers

Showing 5201–5225 of 5548 papers

Title | Status | Hype
2017 Robotic Instrument Segmentation Challenge | Code | 0
AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias | Code | 0
Benchmarking Intersectional Biases in NLP | Code | 0
Benchmarking Commercial Intent Detection Services with Practice-Driven Evaluations | Code | 0
Towards Fair and Privacy-Preserving Federated Deep Models | Code | 0
SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs | Code | 0
Deep Neural Network Benchmarks for Selective Classification | Code | 0
Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships | Code | 0
Arabic Speech Recognition by End-to-End, Modular Systems and Human | Code | 0
Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems | Code | 0
Deep Metric Learning Meets Deep Clustering: An Novel Unsupervised Approach for Feature Embedding | Code | 0
Deepened Graph Auto-Encoders Help Stabilize and Enhance Link Prediction | Code | 0
Oral Imaging for Malocclusion Issues Assessments: OMNI Dataset, Deep Learning Baselines and Benchmarking | Code | 0
Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning | Code | 0
ORCHID: A Chinese Debate Corpus for Target-Independent Stance Detection and Argumentative Dialogue Summarization | Code | 0
Benchmarking Human and Automated Prompting in the Segment Anything Model | Code | 0
Speech Self-Supervised Representation Benchmarking: Are We Doing it Right? | Code | 0
Deep Emotion Recognition in Textual Conversations: A Survey | Code | 0
Neural Style Transfer Improves 3D Cardiovascular MR Image Segmentation on Inconsistent Data | Code | 0
OSS-Bench: Benchmark Generator for Coding LLMs | Code | 0
DeepDrug3D: Classification of ligand-binding pockets in proteins with a convolutional neural network | Code | 0
deepCR: Cosmic Ray Rejection with Deep Learning | Code | 0
A quantum-classical reinforcement learning model to play Atari games | Code | 0
Towards Ground-truth-free Evaluation of Any Segmentation in Medical Images | Code | 0
Deep Attention Driven Reinforcement Learning (DAD-RL) for Autonomous Decision-Making in Dynamic Environment | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified