| Text-Diffusion Red-Teaming of Large Language Models: Unveiling Harmful Behaviors with Proximity Constraints | Jan 14, 2025 | Large Language Model, Red Teaming | —Unverified | 0 |
| The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing | Jul 10, 2024 | Fairness, Red Teaming | —Unverified | 0 |
| The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm | Jun 26, 2024 | Cross-Lingual Transfer, Red Teaming | —Unverified | 0 |
| The Promise and Peril of Artificial Intelligence -- Violet Teaming Offers a Balanced Path Forward | Aug 28, 2023 | Ethics, Philosophy | —Unverified | 0 |
| Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming | May 21, 2024 | Red Teaming | —Unverified | 0 |
| Towards medical AI misalignment: a preliminary study | May 22, 2025 | Red Teaming | —Unverified | 0 |
| Towards Publicly Accountable Frontier LLMs: Building an External Scrutiny Ecosystem under the ASPIRE Framework | Nov 15, 2023 | Red Teaming | —Unverified | 0 |
| Towards Red Teaming in Multimodal and Multilingual Translation | Jan 29, 2024 | Machine Translation, Red Teaming | —Unverified | 0 |
| Towards Secure MLOps: Surveying Attacks, Mitigation Strategies, and Research Challenges | May 30, 2025 | Red Teaming | —Unverified | 0 |
| Understanding and Mitigating Risks of Generative AI in Financial Services | Apr 25, 2025 | Fairness, Red Teaming | —Unverified | 0 |