| Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark | May 9, 2016 | Benchmarking, Emotion Recognition | Code Available | 0 | 5 |
| Comparative Analysis: Violence Recognition from Videos using Transfer Learning | Aug 26, 2024 | Action Recognition, Benchmarking | Code Available | 0 | 5 |
| Inverse Contextual Bandits: Learning How Behavior Evolves over Time | Jul 13, 2021 | Benchmarking, Decision Making | Code Available | 0 | 5 |
| Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance | Sep 22, 2024 | AutoML, Benchmarking | Code Available | 0 | 5 |
| InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions | Oct 18, 2023 | Benchmarking, Visual Grounding | Code Available | 0 | 5 |
| Bugs in the Data: How ImageNet Misrepresents Biodiversity | Aug 24, 2022 | Benchmarking, Object Detection | Code Available | 0 | 5 |
| CleanPatrick: A Benchmark for Image Data Cleaning | May 16, 2025 | Benchmarking, Label Error Detection | Code Available | 0 | 5 |
| BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images | Sep 7, 2018 | Benchmarking | Code Available | 0 | 5 |
| Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM | Oct 8, 2014 | Benchmarking | Code Available | 0 | 5 |
| bsnsing: A decision tree induction method based on recursive optimal boolean rule composition | May 30, 2022 | Benchmarking | Code Available | 0 | 5 |
| BSBench: will your LLM find the largest prime number? | Jun 5, 2025 | Benchmarking | Code Available | 0 | 5 |
| Adaptive Shrinkage Estimation For Personalized Deep Kernel Regression In Modeling Brain Trajectories | Apr 10, 2025 | Additive models, Benchmarking | Code Available | 0 | 5 |
| MixMAS: A Framework for Sampling-Based Mixer Architecture Search for Multimodal Fusion and Learning | Dec 24, 2024 | Benchmarking | Code Available | 0 | 5 |
| INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition | Jun 10, 2024 | Benchmarking, Emotion Recognition | Code Available | 0 | 5 |
| JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models | May 23, 2025 | Benchmarking, Diversity | Code Available | 0 | 5 |
| Towards Learning Universal, Regional, and Local Hydrological Behaviors via Machine-Learning Applied to Large-Sample Datasets | Jul 19, 2019 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 | 5 |
| Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation | Apr 29, 2025 | Benchmarking, Fairness | Code Available | 0 | 5 |
| Adaptive Power System Emergency Control using Deep Reinforcement Learning | Mar 9, 2019 | Benchmarking, Deep Reinforcement Learning | Code Available | 0 | 5 |
| InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition | Dec 23, 2021 | Benchmarking, Deep Learning | Code Available | 0 | 5 |
| BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception | Feb 7, 2024 | Benchmarking | Code Available | 0 | 5 |
| Benchmarking Abstract and Reasoning Abilities Through A Theoretical Perspective | May 28, 2025 | Benchmarking, Memorization | Code Available | 0 | 5 |
| inMOTIFin: a lightweight end-to-end simulation software for regulatory sequences | Jun 25, 2025 | Benchmarking | Code Available | 0 | 5 |
| Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions | Jul 28, 2017 | Autonomous Vehicles, Benchmarking | Code Available | 0 | 5 |
| BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery | Jan 2, 2025 | Benchmarking, Experimental Design | Code Available | 0 | 5 |
| AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies | Feb 19, 2024 | Benchmarking | Code Available | 0 | 5 |
| LMEMs for post-hoc analysis of HPO Benchmarking | Aug 5, 2024 | Benchmarking, Hyperparameter Optimization | Code Available | 0 | 5 |
| InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion | May 28, 2023 | Benchmarking, Decision Making | Code Available | 0 | 5 |
| Integrating Expert Knowledge into Logical Programs via LLMs | Feb 17, 2025 | Benchmarking, Logical Reasoning | Code Available | 0 | 5 |
| Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples | Feb 6, 2025 | Benchmarking, DeepFake Detection | Code Available | 0 | 5 |
| Benchmark Generation Framework with Customizable Distortions for Image Classifier Robustness | Oct 28, 2023 | Benchmarking, image-classification | Code Available | 0 | 5 |
| Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning | Apr 4, 2021 | Benchmarking, Multi Label Text Classification | Code Available | 0 | 5 |
| IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context | Mar 29, 2024 | Benchmarking, Sentence | Code Available | 0 | 5 |
| BONES: a Benchmark fOr Neural Estimation of Shapley values | Jul 23, 2024 | Benchmarking | Code Available | 0 | 5 |
| BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation | Jan 27, 2021 | Benchmarking, Text Generation | Code Available | 0 | 5 |
| Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box | Mar 4, 2022 | Benchmarking, counterfactual | Code Available | 0 | 5 |
| Using Color To Identify Insider Threats | Nov 25, 2021 | Benchmarking | Code Available | 0 | 5 |
| Conditional diffusions for amortized neural posterior estimation | Oct 24, 2024 | Bayesian Inference, Benchmarking | Code Available | 0 | 5 |
| Benchmarking datasets for Anomaly-based Network Intrusion Detection: KDD CUP 99 alternatives | Nov 13, 2018 | Benchmarking, Intrusion Detection | Code Available | 0 | 5 |
| Improvements & Evaluations on the MLCommons CloudMask Benchmark | Mar 7, 2024 | Benchmarking | Code Available | 0 | 5 |
| Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture | Jun 10, 2024 | Benchmarking, Decoder | Code Available | 0 | 5 |
| Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction | Oct 20, 2021 | Benchmarking, Language Modeling | Code Available | 0 | 5 |
| BN-AuthProf: Benchmarking Machine Learning for Bangla Author Profiling on Social Media Texts | Dec 3, 2024 | Age And Gender Classification, Age and Gender Estimation | Code Available | 0 | 5 |
| Improved Target-specific Stance Detection on Social Media Platforms by Delving into Conversation Threads | Nov 6, 2022 | Benchmarking, Opinion Mining | Code Available | 0 | 5 |
| MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation | Jan 9, 2024 | Benchmarking, Interactive Segmentation | Code Available | 0 | 5 |
| Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Apr 23, 2024 | Benchmarking, Hyperspectral Image Classification | Code Available | 0 | 5 |
| Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification models -- Part I | Sep 12, 2024 | Benchmarking, CPU | Code Available | 0 | 5 |
| Benchmark data and method for real-time people counting in cluttered scenes using depth sensors | Apr 12, 2018 | Benchmarking | Code Available | 0 | 5 |
| ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge | Jun 17, 2025 | Benchmarking, Retrieval | Code Available | 0 | 5 |
| ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges | Dec 6, 2024 | Benchmarking, Retrieval | Code Available | 0 | 5 |
| BLESS: Benchmarking Large Language Models on Sentence Simplification | Oct 24, 2023 | Benchmarking, Diversity | Code Available | 0 | 5 |