| Class-incremental Learning for Time Series: Benchmark and Evaluation | Feb 19, 2024 | Activity Recognition, Benchmarking | Code Available | 2 |
| AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies | Feb 19, 2024 | Benchmarking | Code Available | 0 |
| Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark | Feb 18, 2024 | Benchmarking | Code Available | 2 |
| Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation | Feb 18, 2024 | Benchmarking, Language Modeling | Code Available | 1 |
| PEDANTS: Cheap but Effective and Interpretable Answer Equivalence | Feb 17, 2024 | Benchmarking, Form | Code Available | 2 |
| VATr++: Choose Your Words Wisely for Handwritten Text Generation | Feb 16, 2024 | Benchmarking, Text Generation | —Unverified | 0 |
| Learning Disentangled Audio Representations through Controlled Synthesis | Feb 16, 2024 | Benchmarking, Disentanglement | —Unverified | 0 |
| Benchmarking federated strategies in Peer-to-Peer Federated learning for biomedical data | Feb 15, 2024 | Benchmarking, Federated Learning | —Unverified | 0 |
| Large-scale Benchmarking of Metaphor-based Optimization Heuristics | Feb 15, 2024 | Benchmarking, Experimental Design | —Unverified | 0 |
| The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse | Feb 15, 2024 | Benchmarking, Model Editing | Code Available | 0 |