| Evaluating LLP Methods: Challenges and Approaches | Oct 29, 2023 | Benchmarking, Model Selection | Code Available | 0 |
| Benchmark Generation Framework with Customizable Distortions for Image Classifier Robustness | Oct 28, 2023 | Benchmarking, image-classification | Code Available | 0 |
| OpenDMC: An Open-Source Library and Performance Evaluation for Deep-learning-based Multi-frame Compression | Oct 27, 2023 | Benchmarking, GPU | Code Available | 0 |
| On General Language Understanding | Oct 27, 2023 | Benchmarking, Ethics | Unverified | 0 |
| OrionBench: Benchmarking Time Series Generative Models in the Service of the End-User | Oct 26, 2023 | Anomaly Detection, Benchmarking | Unverified | 0 |
| Quantum Long Short-Term Memory (QLSTM) vs Classical LSTM in Time Series Forecasting: A Comparative Study in Solar Power Forecasting | Oct 25, 2023 | Benchmarking, Hyperparameter Optimization | Unverified | 0 |
| RDBench: ML Benchmark for Relational Databases | Oct 25, 2023 | Benchmarking | Unverified | 0 |
| ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair | Oct 25, 2023 | Benchmarking, Fault localization | Unverified | 0 |
| XFEVER: Exploring Fact Verification across Languages | Oct 25, 2023 | Benchmarking, Fact Verification | Code Available | 0 |
| MLFMF: Data Sets for Machine Learning for Mathematical Formalization | Oct 24, 2023 | Benchmarking, Recommendation Systems | Code Available | 1 |