| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms | Code Available | 2 |
| HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems | May 17, 2025 | Arithmetic Reasoning, Code Generation | Code Available | 1 |
| CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models | Aug 19, 2024 | Diversity, Language Modeling | Code Available | 1 |
| Evaluating the Moral Beliefs Encoded in LLMs | Jul 26, 2023 | Moral Scenarios, Survey | Code Available | 1 |
| "Oops, Did I Just Say That?" Testing and Repairing Unethical Suggestions of Large Language Models with Suggest-Critique-Reflect Process | May 4, 2023 | Moral Scenarios | Code Available | 1 |
| Measurement of LLM's Philosophies of Human Nature | Apr 3, 2025 | Moral Scenarios | Code Available | 0 |
| Enhancing LLM Reasoning with Multi-Path Collaborative Reactive and Reflection agents | Dec 31, 2024 | Moral Scenarios | Unverified | 0 |
| M^3oralBench: A MultiModal Moral Benchmark for LVLMs | Dec 30, 2024 | Moral Scenarios | Code Available | 0 |
| Fine-Tuning Language Models for Ethical Ambiguity: A Comparative Study of Alignment with Human Responses | Oct 10, 2024 | Moral Scenarios | Unverified | 0 |
| The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making | Oct 9, 2024 | Decision Making, Moral Scenarios | Unverified | 0 |