| Benchmarking of Query Strategies: Towards Future Deep Active Learning | Dec 10, 2023 | Active Learning, Benchmarking | Code Available | 0 |
| Semi-Supervised Learning for Anomaly Traffic Detection via Bidirectional Normalizing Flows | Mar 13, 2024 | Anomaly Detection, Benchmarking | Code Available | 0 |
| A Context-Aware Citation Recommendation Model with BERT and Graph Convolutional Networks | Mar 15, 2019 | Benchmarking, Citation Recommendation | Code Available | 0 |
| Named Clinical Entity Recognition Benchmark | Oct 7, 2024 | Benchmarking, Decoder | Code Available | 0 |
| EvalxNLP: A Framework for Benchmarking Post-Hoc Explainability Methods on NLP Models | May 2, 2025 | Benchmarking | Code Available | 0 |
| Evaluating the Transferability of Machine-Learned Force Fields for Material Property Modeling | Jan 10, 2023 | Benchmarking, Graph Neural Network | Code Available | 0 |
| Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring | Feb 10, 2025 | Benchmarking | Code Available | 0 |
| Evaluating the Robustness of Deep Reinforcement Learning for Autonomous Policies in a Multi-agent Urban Driving Environment | Dec 22, 2021 | Autonomous Driving, Benchmarking | Code Available | 0 |
| Watts: Infrastructure for Open-Ended Learning | Apr 28, 2022 | Benchmarking | Code Available | 0 |
| Evaluating the Ability of LLMs to Solve Semantics-Aware Process Mining Tasks | Jul 2, 2024 | Activity Prediction, Anomaly Detection | Code Available | 0 |
| A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems | Jun 25, 2024 | Benchmarking, Collaborative Filtering | Code Available | 0 |
| SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification | May 23, 2025 | Benchmarking, Classification | Code Available | 0 |
| Separating form and meaning: Using self-consistency to quantify task understanding across multiple senses | May 19, 2023 | Benchmarking, Form | Code Available | 0 |
| Unsupervised Novelty Detection Methods Benchmarking with Wavelet Decomposition | Sep 11, 2024 | Benchmarking, Novelty Detection | Code Available | 0 |
| Evaluating Shallow and Deep Neural Networks for Network Intrusion Detection Systems in Cyber Security | Oct 8, 2018 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 |
| Transparent and Scrutable Recommendations Using Natural Language User Profiles | Feb 8, 2024 | Benchmarking, Descriptive | Code Available | 0 |
| SenseShift6D: Multimodal RGB-D Benchmarking for Robust 6D Pose Estimation across Environment and Sensor Variations | Jul 8, 2025 | 6D Pose Estimation, 6D Pose Estimation using RGB | Code Available | 0 |
| SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing | Oct 14, 2024 | Benchmarking, Management | Code Available | 0 |
| A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction | Nov 8, 2023 | Benchmarking, Click-Through Rate Prediction | Code Available | 0 |
| Navigating Out-of-Distribution Electricity Load Forecasting during COVID-19: Benchmarking energy load forecasting models without and with continual learning | Sep 8, 2023 | Benchmarking, Continual Learning | Code Available | 0 |
| Evaluating SAT and SMT Solvers on Large-Scale Sudoku Puzzles | Jan 15, 2025 | Benchmarking | Code Available | 0 |
| NbBench: Benchmarking Language Models for Comprehensive Nanobody Tasks | May 4, 2025 | Benchmarking, Representation Learning | Code Available | 0 |
| NCAdapt: Dynamic adaptation with domain-specific Neural Cellular Automata for continual hippocampus segmentation | Oct 30, 2024 | Benchmarking, Continual Learning | Code Available | 0 |
| A Systematic Review of Green AI | Jan 26, 2023 | Benchmarking | Code Available | 0 |
| Evaluating LLP Methods: Challenges and Approaches | Oct 29, 2023 | Benchmarking, Model Selection | Code Available | 0 |