| GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher | Aug 12, 2023 | Ethics, Red Teaming | Code Available | 2 |
| Getting pwn'd by AI: Penetration Testing with Large Language Models | Jul 24, 2023 | Ethics, Task Planning | Code Available | 2 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms | Code Available | 2 |
| Aligning AI With Shared Human Values | Aug 5, 2020 | Ethics, reinforcement-learning | Code Available | 2 |
| XTRUST: On the Multilingual Trustworthiness of Large Language Models | Sep 24, 2024 | Ethics, Fairness | Code Available | 1 |
| Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey | Aug 23, 2024 | Ethics | Code Available | 1 |
| Language Model Alignment in Multilingual Trolley Problems | Jul 2, 2024 | Decision Making, Ethics | Code Available | 1 |
| MoralBench: Moral Evaluation of LLMs | Jun 6, 2024 | Ethics | Code Available | 1 |
| MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models | Mar 6, 2024 | Ethics, General Knowledge | Code Available | 1 |
| NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism | Feb 29, 2024 | Ethics, Multiple-choice | Code Available | 1 |