| Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Apr 26, 2025 | Benchmarking, GPU | Code Available | 0 | 5 |
| Benchmarking Intersectional Biases in NLP | Jul 1, 2022 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 | 5 |
| DFEE: Interactive DataFlow Execution and Evaluation Kit | Dec 4, 2022 | Benchmarking, Scheduling | Code Available | 0 | 5 |
| A Manually Annotated Image-Caption Dataset for Detecting Children in the Wild | Jun 11, 2025 | Age Estimation, Benchmarking | Code Available | 0 | 5 |
| Graph-theoretical approach to robust 3D normal extraction of LiDAR data | May 23, 2022 | Benchmarking | Code Available | 0 | 5 |
| Benchmarking Commercial Intent Detection Services with Practice-Driven Evaluations | Dec 7, 2020 | Benchmarking, Goal-Oriented Dialog | Code Available | 0 | 5 |
| GenderBench: Evaluation Suite for Gender Biases in LLMs | May 17, 2025 | Benchmarking | Code Available | 0 | 5 |
| GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations | Jun 17, 2024 | Benchmarking, Dataset Generation | Code Available | 0 | 5 |
| GenCeption: Evaluate Multimodal LLMs with Unlabeled Unimodal Data | Feb 22, 2024 | Benchmarking | Code Available | 0 | 5 |
| Generalization and Regularization in DQN | Sep 29, 2018 | Atari Games, Benchmarking | Code Available | 0 | 5 |
| Arabic Speech Recognition by End-to-End, Modular Systems and Human | Jan 21, 2021 | Arabic Speech Recognition, Automatic Speech Recognition | Code Available | 0 | 5 |
| Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings | Apr 4, 2025 | Benchmarking | Code Available | 0 | 5 |
| Recognizing Object Affordances to Support Scene Reasoning for Manipulation Tasks | Sep 12, 2019 | Affordance Detection, Affordance Recognition | Code Available | 0 | 5 |
| Detecting critical treatment effect bias in small subgroups | Apr 29, 2024 | Benchmarking, Decision Making | Code Available | 0 | 5 |
| From Variability to Stability: Advancing RecSys Benchmarking Practices | Feb 15, 2024 | Benchmarking, Collaborative Filtering | Code Available | 0 | 5 |
| Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems | Jan 21, 2025 | Autonomous Vehicles, Benchmarking | Code Available | 0 | 5 |
| From raw affiliations to organization identifiers | May 12, 2025 | Benchmarking, Metadata quality | Code Available | 0 | 5 |
| Affine Non-negative Collaborative Representation Based Pattern Classification | Jul 10, 2020 | Benchmarking, Classification | Code Available | 0 | 5 |
| DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design | Oct 23, 2023 | Benchmarking, Image Generation | Code Available | 0 | 5 |
| From Past to Present: A Survey of Malicious URL Detection Techniques, Datasets and Code Repositories | Apr 23, 2025 | Benchmarking | Code Available | 0 | 5 |
| Design and implementation of intelligent packet filtering in IoT microcontroller-based devices | May 30, 2023 | Benchmarking | Code Available | 0 | 5 |
| Accurate Peak Detection in Multimodal Optimization via Approximated Landscape Learning | Mar 23, 2025 | Benchmarking | Code Available | 0 | 5 |
| From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation | Apr 14, 2024 | Benchmarking, Diversity | Code Available | 0 | 5 |
| A quantum-classical reinforcement learning model to play Atari games | Dec 11, 2024 | Atari Games, Benchmarking | Code Available | 0 | 5 |
| From Modern CNNs to Vision Transformers: Assessing the Performance, Robustness, and Classification Strategies of Deep Learning Models in Histopathology | Apr 11, 2022 | Benchmarking, Cancer Classification | Code Available | 0 | 5 |
| From Knowledge to Reasoning: Evaluating LLMs for Ionic Liquids Research in Chemical and Biological Engineering | May 11, 2025 | Benchmarking, General Knowledge | Code Available | 0 | 5 |
| Dermatological Diagnosis Explainability Benchmark for Convolutional Neural Networks | Feb 23, 2023 | Benchmarking, Medical Diagnosis | Code Available | 0 | 5 |
| Benchmarking Human and Automated Prompting in the Segment Anything Model | Oct 29, 2024 | Benchmarking, Image Segmentation | Code Available | 0 | 5 |
| Depth Functions for Partial Orders with a Descriptive Analysis of Machine Learning Algorithms | Apr 19, 2023 | Benchmarking, Descriptive | Code Available | 0 | 5 |
| Benchmarking histopathology foundation models in a multi-center dataset for skin cancer subtyping | Jun 23, 2025 | Benchmarking, Diversity | Code Available | 0 | 5 |
| From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning | Mar 16, 2023 | Benchmarking, Continual Learning | Code Available | 0 | 5 |
| Fully Automatic Segmentation of Gross Target Volume and Organs-at-Risk for Radiotherapy Planning of Nasopharyngeal Carcinoma | Oct 4, 2023 | Benchmarking, Segmentation | Code Available | 0 | 5 |
| Benchmarking HillVallEA for the GECCO 2019 Competition on Multimodal Optimization | Jul 25, 2019 | Benchmarking | Code Available | 0 | 5 |
| Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark | Jun 14, 2025 | Benchmarking, Graph Learning | Code Available | 0 | 5 |
| Benchmarking Hierarchical Script Knowledge | Jun 1, 2019 | Benchmarking | Code Available | 0 | 5 |
| FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering | May 27, 2025 | Benchmarking, Question Answering | Code Available | 0 | 5 |
| Delta-Influence: Unlearning Poisons via Influence Functions | Nov 20, 2024 | Attribute, Benchmarking | Code Available | 0 | 5 |
| Forecasting time series with constraints | Feb 14, 2025 | Additive models, Benchmarking | Code Available | 0 | 5 |
| FHBench: Towards Efficient and Personalized Federated Learning for Multimodal Healthcare | Apr 15, 2025 | Benchmarking, Diagnostic | Code Available | 0 | 5 |
| Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming | Jul 17, 2019 | Autonomous Driving, Benchmarking | Code Available | 0 | 5 |
| Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling | Nov 21, 2024 | Articles, Benchmarking | Code Available | 0 | 5 |
| Aesthetic Image Captioning From Weakly-Labelled Photographs | Aug 29, 2019 | Aesthetic Image Captioning, Benchmarking | Code Available | 0 | 5 |
| Defense-friendly Images in Adversarial Attacks: Dataset and Metrics for Perturbation Difficulty | Nov 5, 2020 | Adversarial Attack, Benchmarking | Code Available | 0 | 5 |
| DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Jun 13, 2024 | Benchmarking, Hallucination | Code Available | 0 | 5 |
| Forecasting Across Time Series Databases using Recurrent Neural Networks on Groups of Similar Series: A Clustering Approach | Oct 9, 2017 | Benchmarking, Clustering | Code Available | 0 | 5 |
| FORLORN: A Framework for Comparing Offline Methods and Reinforcement Learning for Optimization of RAN Parameters | Sep 8, 2022 | Benchmarking, continuous-control | Code Available | 0 | 5 |
| Fluorescence Reference Target Quantitative Analysis Library | Apr 22, 2025 | Benchmarking | Code Available | 0 | 5 |
| Finding the Perfect Fit: Applying Regression Models to ClimateBench v1.0 | Aug 23, 2023 | Benchmarking, regression | Code Available | 0 | 5 |
| Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Mar 6, 2024 | Benchmarking, Hallucination | Code Available | 0 | 5 |
| Benchmarking Graph Representations and Graph Neural Networks for Multivariate Time Series Classification | Jan 14, 2025 | Benchmarking, Graph Representation Learning | Code Available | 0 | 5 |