| How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries | Feb 23, 2024 | Model Editing, Response Generation | Code Available | 0 |
| Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond | Feb 22, 2024 | Meta-Learning, Model Editing | Unverified | 0 |
| Potential and Challenges of Model Editing for Social Debiasing | Feb 21, 2024 | Model Editing | Unverified | 0 |
| Knowledge Graph Enhanced Large Language Model Editing | Feb 21, 2024 | Knowledge Graphs, Language Modeling | Unverified | 0 |
| Dense Passage Retrieval: Is it Retrieving? | Feb 16, 2024 | Model Editing, Passage Retrieval | Unverified | 0 |
| Towards Uncovering How Large Language Model Works: An Explainability Perspective | Feb 16, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse | Feb 15, 2024 | Benchmarking, Model Editing | Code Available | 0 |
| Long-form evaluation of model editing | Feb 14, 2024 | Form, model | Code Available | 0 |
| Rethinking Machine Unlearning for Large Language Models | Feb 13, 2024 | Machine Unlearning, Management | Unverified | 0 |
| On the Robustness of Editing Large Language Models | Feb 8, 2024 | Model Editing, Text Generation | Code Available | 0 |