| Adaptive Visual Scene Understanding: Incremental Scene Graph Generation | Oct 2, 2023 | Benchmarking, Continual Learning | Code Available | 0 |
| Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench | Oct 2, 2023 | Benchmarking, Safety Alignment | Code Available | 1 |
| A New Real-World Video Dataset for the Comparison of Defogging Algorithms | Oct 2, 2023 | Benchmarking, Deblurring | Unverified | 0 |
| NewsRecLib: A PyTorch-Lightning Library for Neural News Recommendation | Oct 2, 2023 | Benchmarking, News Recommendation | Code Available | 1 |
| TRAM: Benchmarking Temporal Reasoning for Large Language Models | Oct 2, 2023 | Benchmarking, Few-Shot Learning | Unverified | 0 |
| CoDBench: A Critical Evaluation of Data-driven Models for Continuous Dynamical Systems | Oct 2, 2023 | Benchmarking, Computational Efficiency | Unverified | 0 |
| FELM: Benchmarking Factuality Evaluation of Large Language Models | Oct 1, 2023 | Benchmarking, Math | Code Available | 1 |
| RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models | Oct 1, 2023 | Benchmarking | Code Available | 2 |
| Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method | Sep 30, 2023 | Benchmarking, Reinforcement Learning (RL) | Unverified | 0 |
| The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks | Sep 30, 2023 | Benchmarking | Unverified | 0 |