| HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns | Jan 28, 2025 | Adversarial Attack, Benchmarking | Code Available | 1 | 5 |
| Hopfield-Enhanced Deep Neural Networks for Artifact-Resilient Brain State Decoding | Nov 6, 2023 | Benchmarking, Data Compression | Code Available | 1 | 5 |
| Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data | Feb 27, 2024 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking Meaning Representations in Neural Semantic Parsing | Nov 1, 2020 | Benchmarking, Semantic Parsing | Code Available | 1 | 5 |
| ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning | Sep 27, 2024 | AutoML, Benchmarking | Code Available | 1 | 5 |
| Benchmarking Meta-embeddings: What Works and What Does Not | Nov 1, 2021 | Benchmarking, Embeddings Evaluation | Code Available | 1 | 5 |
| AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios | Oct 25, 2024 | Benchmarking, Diversity | Code Available | 1 | 5 |
| Benchmarking Micro-action Recognition: Dataset, Methods, and Applications | Mar 8, 2024 | Action Recognition, Benchmarking | Code Available | 1 | 5 |
| Generative Wind Power Curve Modeling Via Machine Vision: A Self-learning Deep Convolutional Network Based Method | Aug 19, 2021 | Benchmarking, Synthetic Data Generation | Code Available | 1 | 5 |
| Benchmarking Large Language Models for News Summarization | Jan 31, 2023 | Benchmarking, News Summarization | Code Available | 1 | 5 |