| LMEMs for post-hoc analysis of HPO Benchmarking | Aug 5, 2024 | Benchmarking, Hyperparameter Optimization | Code Available | 0 | 5 |
| InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion | May 28, 2023 | Benchmarking, Decision Making | Code Available | 0 | 5 |
| Integrating Expert Knowledge into Logical Programs via LLMs | Feb 17, 2025 | Benchmarking, Logical Reasoning | Code Available | 0 | 5 |
| Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples | Feb 6, 2025 | Benchmarking, DeepFake Detection | Code Available | 0 | 5 |
| Benchmark Generation Framework with Customizable Distortions for Image Classifier Robustness | Oct 28, 2023 | Benchmarking, image-classification | Code Available | 0 | 5 |
| Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning | Apr 4, 2021 | Benchmarking, Multi Label Text Classification | Code Available | 0 | 5 |
| IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context | Mar 29, 2024 | Benchmarking, Sentence | Code Available | 0 | 5 |
| BONES: a Benchmark fOr Neural Estimation of Shapley values | Jul 23, 2024 | Benchmarking | Code Available | 0 | 5 |
| BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation | Jan 27, 2021 | Benchmarking, Text Generation | Code Available | 0 | 5 |
| Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box | Mar 4, 2022 | Benchmarking, counterfactual | Code Available | 0 | 5 |
| Using Color To Identify Insider Threats | Nov 25, 2021 | Benchmarking | Code Available | 0 | 5 |
| Conditional diffusions for amortized neural posterior estimation | Oct 24, 2024 | Bayesian Inference, Benchmarking | Code Available | 0 | 5 |
| Benchmarking datasets for Anomaly-based Network Intrusion Detection: KDD CUP 99 alternatives | Nov 13, 2018 | Benchmarking, Intrusion Detection | Code Available | 0 | 5 |
| Improvements & Evaluations on the MLCommons CloudMask Benchmark | Mar 7, 2024 | Benchmarking | Code Available | 0 | 5 |
| Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture | Jun 10, 2024 | Benchmarking, Decoder | Code Available | 0 | 5 |
| Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction | Oct 20, 2021 | Benchmarking, Language Modeling | Code Available | 0 | 5 |
| BN-AuthProf: Benchmarking Machine Learning for Bangla Author Profiling on Social Media Texts | Dec 3, 2024 | Age And Gender Classification, Age and Gender Estimation | Code Available | 0 | 5 |
| Improved Target-specific Stance Detection on Social Media Platforms by Delving into Conversation Threads | Nov 6, 2022 | Benchmarking, Opinion Mining | Code Available | 0 | 5 |
| MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation | Jan 9, 2024 | Benchmarking, Interactive Segmentation | Code Available | 0 | 5 |
| Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Apr 23, 2024 | Benchmarking, Hyperspectral Image Classification | Code Available | 0 | 5 |
| Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification models -- Part I | Sep 12, 2024 | Benchmarking, CPU | Code Available | 0 | 5 |
| Benchmark data and method for real-time people counting in cluttered scenes using depth sensors | Apr 12, 2018 | Benchmarking | Code Available | 0 | 5 |
| ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge | Jun 17, 2025 | Benchmarking, Retrieval | Code Available | 0 | 5 |
| ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges | Dec 6, 2024 | Benchmarking, Retrieval | Code Available | 0 | 5 |
| BLESS: Benchmarking Large Language Models on Sentence Simplification | Oct 24, 2023 | Benchmarking, Diversity | Code Available | 0 | 5 |