| Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models | May 5, 2024 | Benchmarking | Code Available | 0 |
| AI-enabled Sound Pattern Recognition on Asthma Medication Adherence: Evaluation with the RDA Benchmark Suite | May 30, 2022 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 |
| BioVFM-21M: Benchmarking and Scaling Self-Supervised Vision Foundation Models for Biomedical Image Analysis | May 14, 2025 | Benchmarking, Computational Efficiency | Code Available | 0 |
| Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization | Aug 29, 2024 | Benchmarking, Diversity | Code Available | 0 |
| Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment | Jun 1, 2023 | Benchmarking, Hate Speech Detection | Code Available | 0 |
| Local manifold learning and its link to domain-based physics knowledge | Jul 1, 2022 | Benchmarking, Dimensionality Reduction | Code Available | 0 |
| LOCO-EPI: Leave-one-chromosome-out (LOCO) as a benchmarking paradigm for deep learning based prediction of enhancer-promoter interactions | Apr 1, 2025 | Benchmarking | Code Available | 0 |
| IJCB 2022 Mobile Behavioral Biometrics Competition (MobileB2C) | Oct 6, 2022 | Benchmarking | Code Available | 0 |
| Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors | Mar 28, 2025 | Benchmarking, Code Generation | Code Available | 0 |
| BioSentVec: creating sentence embeddings for biomedical texts | Oct 22, 2018 | Articles, Benchmarking | Code Available | 0 |
| LogicCat: A Chain-of-Thought Text-to-SQL Benchmark for Multi-Domain Reasoning Challenges | May 24, 2025 | Benchmarking, Mathematical Reasoning | Code Available | 0 |
| IHCV: Discovery of Hidden Time-Dependent Control Variables in Non-Linear Dynamical Systems | Apr 5, 2023 | Benchmarking | Code Available | 0 |
| Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible | Jul 10, 2025 | Adversarial Attack, Benchmarking | Code Available | 0 |
| LogoNet: a fine-grained network for instance-level logo sketch retrieval | Apr 5, 2023 | 2k, Benchmarking | Code Available | 0 |
| Identifying Money Laundering Subgraphs on the Blockchain | Oct 10, 2024 | Benchmarking | Code Available | 0 |
| Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Oct 25, 2021 | Benchmarking | Code Available | 0 |
| Multitask learning and benchmarking with clinical time series data | Jun 17, 2019 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 |
| IdeaBench: Benchmarking Large Language Models for Research Idea Generation | Oct 31, 2024 | Benchmarking, scientific discovery | Code Available | 0 |
| IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification | Mar 22, 2025 | Benchmarking, Classification | Code Available | 0 |
| BioFors: A Large Biomedical Image Forensics Dataset | Aug 30, 2021 | Benchmarking, Image Forensics | Code Available | 0 |
| Benchmarking Attribution Methods with Relative Feature Importance | Jul 23, 2019 | Benchmarking, Feature Importance | Code Available | 0 |
| HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Feb 25, 2024 | Benchmarking, Chatbot | Code Available | 0 |
| Hyperspectral Image Dataset for Benchmarking on Salient Object Detection | Jun 29, 2018 | Benchmarking, Object | Code Available | 0 |
| Long-Term Visitation Value for Deep Exploration in Sparse Reward Reinforcement Learning | Jan 1, 2020 | Benchmarking, reinforcement-learning | Code Available | 0 |
| Look Across Elapse: Disentangled Representation Learning and Photorealistic Cross-Age Face Synthesis for Age-Invariant Face Recognition | Sep 2, 2018 | Age-Invariant Face Recognition, Benchmarking | Code Available | 0 |