| Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment | Feb 21, 2025 | Safety Alignment | —Unverified | 0 |
| Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications | Feb 7, 2024 | Safety Alignment | —Unverified | 0 |
| "Haet Bhasha aur Diskrimineshun": Phonetic Perturbations in Code-Mixed Hinglish to Red-Team LLMs | May 20, 2025 | Image GenerationRed Teaming | —Unverified | 0 |
| Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking | Nov 16, 2023 | Safety Alignment | —Unverified | 0 |
| From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring | Jun 11, 2025 | Safety Alignment | —Unverified | 0 |
| Code-Switching Curriculum Learning for Multilingual Transfer in LLMs | Nov 4, 2024 | Cross-Lingual TransferLanguage Acquisition | —Unverified | 0 |
| Model Merging and Safety Alignment: One Bad Model Spoils the Bunch | Jun 20, 2024 | modelSafety Alignment | —Unverified | 0 |
| Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh | Mar 3, 2025 | Language ModelingLanguage Modelling | —Unverified | 0 |
| NeuRel-Attack: Neuron Relearning for Safety Disalignment in Large Language Models | Apr 29, 2025 | Safety Alignment | —Unverified | 0 |
| From Evaluation to Defense: Advancing Safety in Video Large Language Models | May 22, 2025 | Safety Alignment | —Unverified | 0 |