| Comics Datasets Framework: Mix of Comics datasets for detection benchmarking | Jul 3, 2024 | Benchmarking, Object | Code Available | 1 |
| Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias | Jul 3, 2024 | Benchmarking, Bias Detection | Code Available | 0 |
| CoIR: A Comprehensive Benchmark for Code Information Retrieval Models | Jul 3, 2024 | Benchmarking, Code Search | Code Available | 2 |
| GraCoRe: Benchmarking Graph Comprehension and Complex Reasoning in Large Language Models | Jul 3, 2024 | Benchmarking | Code Available | 1 |
| Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset | Jul 3, 2024 | Benchmarking, Diversity | Code Available | 1 |
| TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations | Jul 2, 2024 | Benchmarking, text-to-speech | Unverified | 0 |
| Evaluating the Ability of LLMs to Solve Semantics-Aware Process Mining Tasks | Jul 2, 2024 | Activity Prediction, Anomaly Detection | Code Available | 0 |
| Open foundation models for Azerbaijani language | Jul 2, 2024 | Benchmarking | Unverified | 0 |
| Occlusion-Aware Seamless Segmentation | Jul 2, 2024 | Benchmarking, Domain Adaptation | Code Available | 1 |
| MIRAI: Evaluating LLM Agents for Event Forecasting | Jul 1, 2024 | Articles, Benchmarking | Unverified | 0 |