| QuantBench: Benchmarking AI Methods for Quantitative Investment | Apr 24, 2025 | Benchmarking, Continual Learning | Unverified | 0 |
| From Past to Present: A Survey of Malicious URL Detection Techniques, Datasets and Code Repositories | Apr 23, 2025 | Benchmarking | Code Available | 0 |
| MAYA: Addressing Inconsistencies in Generative Password Guessing through a Unified Benchmark | Apr 23, 2025 | Benchmarking | Code Available | 0 |
| LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement | Apr 22, 2025 | Benchmarking, Language Modeling | Code Available | 1 |
| Enhancing TCR-Peptide Interaction Prediction with Pretrained Language Models and Molecular Representations | Apr 22, 2025 | Benchmarking, Few-Shot Learning | Unverified | 0 |
| Benchmarking machine learning models for predicting aerofoil performance | Apr 22, 2025 | Benchmarking | Unverified | 0 |
| Fluorescence Reference Target Quantitative Analysis Library | Apr 22, 2025 | Benchmarking | Code Available | 0 |
| CLIRudit: Cross-Lingual Information Retrieval of Scientific Documents | Apr 22, 2025 | Benchmarking, Cross-Lingual Information Retrieval | Unverified | 0 |
| Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3 | Apr 22, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs | Apr 22, 2025 | Benchmarking, Class-level Code Generation | Unverified | 0 |