SOTAVerified

Benchmarking

Papers

Showing 851–875 of 5548 papers

Title | Status | Hype
Codabench: Flexible, Easy-to-Use and Reproducible Benchmarking Platform | Code | 1
A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation, and Research Challenges | Code | 1
Addressing Shortcomings in Fair Graph Learning Datasets: Towards a New Benchmark | Code | 1
Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM | Code | 1
Addressing the generalization of 3D registration methods with a featureless baseline and an unbiased benchmark | Code | 1
ICU-Sepsis: A Benchmark MDP Built from Real Medical Data | Code | 1
AIPerf: Automated machine learning as an AI-HPC benchmark | Code | 1
IDToolkit: A Toolkit for Benchmarking and Developing Inverse Design Algorithms in Nanophotonics | Code | 1
Illuminating Darkness: Enhancing Real-world Low-light Scenes with Smartphone Images | Code | 1
Large Scale MRI Collection and Segmentation of Cirrhotic Liver | Code | 1
Image Matching across Wide Baselines: From Paper to Practice | Code | 1
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Code | 1
4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs | Code | 1
ClearPose: Large-scale Transparent Object Dataset and Benchmark | Code | 1
An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction | Code | 1
Benchmarking Batch Deep Reinforcement Learning Algorithms | Code | 1
CIDEr: Consensus-based Image Description Evaluation | Code | 1
Improving and Benchmarking Offline Reinforcement Learning Algorithms | Code | 1
AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets | Code | 1
RGB-D Indiscernible Object Counting in Underwater Scenes | Code | 1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods | Code | 1
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics | Code | 1
A Survey of Pathology Foundation Model: Progress and Future Directions | Code | 1
A Comprehensive Benchmark for RNA 3D Structure-Function Modeling | Code | 1
GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation | Code | 1
Page 35 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified