| GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks | Sep 29, 2024 | Benchmarking | Unverified | 0 |
| AstroMLab 2: AstroLLaMA-2-70B Model and Benchmarking Specialised LLMs for Astronomy | Sep 29, 2024 | Astronomy, Benchmarking | Unverified | 0 |
| SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement | Sep 28, 2024 | Benchmarking, Code Generation | Unverified | 0 |
| Data Analysis in the Era of Generative AI | Sep 27, 2024 | Benchmarking | Unverified | 0 |
| Constructing Confidence Intervals for 'the' Generalization Error -- a Comprehensive Benchmark Study | Sep 27, 2024 | Benchmarking, tabular-regression | Code Available | 0 |
| MCUBench: A Benchmark of Tiny Object Detectors on MCUs | Sep 27, 2024 | Benchmarking, Model Selection | Unverified | 0 |
| CLLMate: A Multimodal Benchmark for Weather and Climate Events Forecasting | Sep 27, 2024 | Articles, Benchmarking | Unverified | 0 |
| bnRep: A repository of Bayesian networks from the academic literature | Sep 27, 2024 | Benchmarking | Unverified | 0 |
| EarthquakeNPP: Benchmark Datasets for Earthquake Forecasting with Neural Point Processes | Sep 27, 2024 | Benchmarking, Dataset Generation | Unverified | 0 |
| Conformal Prediction: A Theoretical Note and Benchmarking Transductive Node Classification in Graphs | Sep 26, 2024 | Benchmarking, Conformal Prediction | Code Available | 0 |