| SPINEX_ Symbolic Regression: Similarity-based Symbolic Regression with Explainable Neighbors Exploration | Nov 5, 2024 | Benchmarking, regression | —Unverified | 0 |
| On the Loss of Context-awareness in General Instruction Fine-tuning | Nov 5, 2024 | Benchmarking, Instruction Following | Code Available | 0 |
| Imagining and building wise machines: The centrality of AI metacognition | Nov 4, 2024 | Benchmarking, Navigate | —Unverified | 0 |
| Benchmarking XAI Explanations with Human-Aligned Evaluations | Nov 4, 2024 | Benchmarking | —Unverified | 0 |
| SinaTools: Open Source Toolkit for Arabic Natural Language Processing | Nov 3, 2024 | Benchmarking, Lemmatization | —Unverified | 0 |
| Varco Arena: A Tournament Approach to Reference-Free Benchmarking Large Language Models | Nov 2, 2024 | Benchmarking | —Unverified | 0 |
| FEET: A Framework for Evaluating Embedding Techniques | Nov 2, 2024 | Benchmarking, Representation Learning | Code Available | 0 |
| Artificial Intelligence for Microbiology and Microbiome Research | Nov 2, 2024 | Benchmarking, Deep Learning | —Unverified | 0 |
| Benchmarking Bias in Large Language Models during Role-Playing | Nov 1, 2024 | Benchmarking, Fairness | —Unverified | 0 |
| Cityscape-Adverse: Benchmarking Robustness of Semantic Segmentation with Realistic Scene Modifications via Diffusion-Based Image Editing | Nov 1, 2024 | Benchmarking, Semantic Segmentation | Code Available | 0 |