SOTAVerified

Benchmarking

Papers

Showing 4001-4050 of 5548 papers

Title | Status | Hype
The importance of being constrained: dealing with infeasible solutions in Differential Evolution and beyond | Code | 1
Systematic Comparison of Path Planning Algorithms using PathBench | - | 0
Multi-channel deep convolutional neural networks for multi-classifying thyroid disease | - | 0
Automated Machine Learning: A Case Study on Non-Intrusive Appliance Load Monitoring | - | 0
A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection | Code | 1
Just Rank: Rethinking Evaluation with Word and Sentence Similarities | Code | 1
Benchmarking real-time algorithms for in-phase auditory stimulation of low amplitude slow waves with wearable EEG devices during sleep | - | 0
Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box | Code | 0
Graph clustering with Boltzmann machines | - | 0
Towards Benchmarking and Evaluating Deepfake Detection | - | 0
KamNet: An Integrated Spatiotemporal Deep Neural Network for Rare Event Search in KamLAND-Zen | Code | 0
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction | Code | 1
Mukayese: Turkish NLP Strikes Back | Code | 1
3D Common Corruptions and Data Augmentation | Code | 1
Adaptive Gradient Methods with Local Guarantees | - | 0
Benchmarking Robustness of Deep Learning Classifiers Using Two-Factor Perturbation | - | 0
Reliable validation of Reinforcement Learning Benchmarks | - | 0
A predictive analytics approach for stroke prediction using machine learning and neural networks | Code | 0
Towards IID representation learning and its application on biomedical data | Code | 0
GraphWorld: Fake Graphs Bring Real Insights for GNNs | Code | 1
PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems | Code | 1
Towards Class-agnostic Tracking Using Feature Decorrelation in Point Clouds | - | 0
Prepare for Trouble and Make it Double. Supervised and Unsupervised Stacking for AnomalyBased Intrusion Detection | - | 0
Generalised Gaussian Process Latent Variable Models (GPLVM) with Stochastic Variational Inference | - | 0
Spatio-Temporal Latent Graph Structure Learning for Traffic Forecasting | - | 0
SUTD-PRCM Dataset and Neural Architecture Search Approach for Complex Metasurface Design | - | 0
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models | - | 0
Benchmarking Generative Latent Variable Models for Speech | Code | 0
Evaluating Feature Attribution Methods in the Image Domain | Code | 0
Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch | Code | 0
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective | Code | 0
Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks | - | 0
MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery | Code | 1
Benchmarking missing-values approaches for predictive models on health databases | Code | 0
On loss functions and evaluation metrics for music source separation | - | 0
Benchmarking of DL Libraries and Models on Mobile Devices | Code | 1
Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens | - | 0
Benchmarking Robot Manipulation with the Rubik's Cube | - | 0
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts | Code | 1
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark | Code | 0
Dual Task Framework for Improving Persona-grounded Dialogue Dataset | - | 0
High Fidelity RF Clutter Modeling and Simulation | - | 0
Lightweight Jet Reconstruction and Identification as an Object Detection Task | - | 0
BIQ2021: A Large-Scale Blind Image Quality Assessment Database | - | 0
ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on Multiagent Dual-core Learning | Code | 1
Comparative Study Between Distance Measures On Supervised Optimum-Path Forest Classification | Code | 0
What are the best systems? New perspectives on NLP Benchmarking | Code | 1
RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro | Code | 1
Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration | Code | 0
Benchmarking and Analyzing Point Cloud Classification under Corruptions | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified