| Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm | Jun 25, 2025 | Model Editing | Code Available | 0 |
| DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models | Jun 16, 2025 | Model Editing | Unverified | 0 |
| Mitigating Safety Fallback in Editing-based Backdoor Injection on LLMs | Jun 16, 2025 | Diversity, Model Editing | Code Available | 0 |
| SoK: Machine Unlearning for Large Language Models | Jun 10, 2025 | Large Language Model, Machine Unlearning | Unverified | 0 |
| MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs | Jun 9, 2025 | Hallucination, Model Editing | Unverified | 0 |
| The OCR Quest for Generalization: Learning to recognize low-resource alphabets with model editing | Jun 7, 2025 | Meta-Learning, Model Editing | Unverified | 0 |
| Drop Dropout on Single-Epoch Language Model Pretraining | May 30, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| On Fairness of Task Arithmetic: The Role of Task Vectors | May 30, 2025 | Fairness, Hate Speech Detection | Unverified | 0 |
| Model Unlearning via Sparse Autoencoder Subspace Guided Projections | May 30, 2025 | Adversarial Robustness, feature selection | Unverified | 0 |
| DocMEdit: Towards Document-Level Model Editing | May 26, 2025 | model, Model Editing | Unverified | 0 |
| REACT: Representation Extraction And Controllable Tuning to Overcome Overfitting in LLM Knowledge Editing | May 25, 2025 | knowledge editing, Language Modeling | Unverified | 0 |
| Disentangling Knowledge Representations for Large Language Model Editing | May 24, 2025 | Disentanglement, knowledge editing | Unverified | 0 |
| Localizing Knowledge in Diffusion Transformers | May 24, 2025 | Model Editing | Unverified | 0 |
| Model Editing with Graph-Based External Memory | May 23, 2025 | graph construction, model | Unverified | 0 |
| LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models | May 21, 2025 | Machine Unlearning, Model Editing | Code Available | 0 |
| Editing Across Languages: A Survey of Multilingual Knowledge Editing | May 20, 2025 | knowledge editing, Model Editing | Unverified | 0 |
| UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models | May 20, 2025 | GPU, Lifelong learning | Code Available | 2 |
| UniEdit: A Unified Knowledge Editing Benchmark for Large Language Models | May 18, 2025 | Diversity, knowledge editing | Unverified | 0 |
| NAMET: Robust Massive Model Editing via Noise-Aware Memory Optimization | May 17, 2025 | Attribute, Model Editing | Code Available | 0 |
| Cross-Model Transfer of Task Vectors via Few-Shot Orthogonal Alignment | May 17, 2025 | Model Editing, Task Arithmetic | Code Available | 0 |
| BalancEdit: Dynamically Balancing the Generality-Locality Trade-off in Multi-modal Model Editing | May 2, 2025 | knowledge editing, Model Editing | Unverified | 0 |
| A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment | Apr 22, 2025 | Model Editing | Unverified | 0 |
| REDEditing: Relationship-Driven Precise Backdoor Poisoning on Text-to-Image Diffusion Models | Apr 20, 2025 | Attribute, Image Generation | Unverified | 0 |
| When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers | Apr 15, 2025 | Binary Classification, Domain Generalization | Unverified | 0 |
| NAACL2025 Tutorial: Adaptation of Large Language Models | Apr 4, 2025 | Code Generation, Model Editing | Unverified | 0 |
| Localized Definitions and Distributed Reasoning: A Proof-of-Concept Mechanistic Interpretability Study via Activation Patching | Apr 3, 2025 | Answer Generation, EEG | Code Available | 0 |
| Efficient Model Editing with Task-Localized Sparse Fine-tuning | Apr 3, 2025 | Disentanglement, Model Editing | Code Available | 0 |
| Leaking LoRa: An Evaluation of Password Leaks and Knowledge Storage in Large Language Models | Mar 29, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| BiasEdit: Debiasing Stereotyped Language Models via Model Editing | Mar 11, 2025 | counterfactual, Language Modeling | Code Available | 1 |
| SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models | Mar 10, 2025 | Model Editing | Code Available | 1 |
| Exploiting Edited Large Language Models as General Scientific Optimizers | Mar 8, 2025 | Model Editing | Unverified | 0 |
| GeoEdit: Geometric Knowledge Editing for Large Language Models | Feb 27, 2025 | General Knowledge, knowledge editing | Unverified | 0 |
| A Causal Lens for Evaluating Faithfulness Metrics | Feb 26, 2025 | Decision Making, Fact Checking | Unverified | 0 |
| CoME: An Unlearning-based Approach to Conflict-free Model Editing | Feb 20, 2025 | Model Editing | Code Available | 0 |
| DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing | Feb 17, 2025 | Decision Making, Language Modeling | Unverified | 0 |
| The Mirage of Model Editing: Revisiting Evaluation in the Wild | Feb 16, 2025 | Model Editing, Question Answering | Code Available | 1 |
| K-Edit: Language Model Editing with Contextual Knowledge Awareness | Feb 15, 2025 | Knowledge Graphs, Language Modeling | Unverified | 0 |
| Reinforced Lifelong Editing for Language Models | Feb 9, 2025 | Model Editing | Code Available | 1 |
| Injecting Universal Jailbreak Backdoors into LLMs in Minutes | Feb 9, 2025 | Model Editing | Code Available | 1 |
| AnyEdit: Edit Any Knowledge Encoded in Language Models | Feb 8, 2025 | Form, Image Editing | Unverified | 0 |
| Cross-Encoder Rediscovers a Semantic Variant of BM25 | Feb 7, 2025 | Information Retrieval, Model Editing | Unverified | 0 |
| Position: Editing Large Language Models Poses Serious Safety Risks | Feb 5, 2025 | knowledge editing, Model Editing | Unverified | 0 |
| Efficient Model Editing with Task Vector Bases: A Theoretical Framework and Scalable Approach | Feb 3, 2025 | Model Editing, Negation | Code Available | 0 |
| Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution | Jan 31, 2025 | Attribute, Model Editing | Unverified | 0 |
| Enhancing Semantic Consistency of Large Language Models through Model Editing: An Interpretability-Oriented Approach | Jan 19, 2025 | Large Language Model, Model Editing | Unverified | 0 |
| SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation | Jan 1, 2025 | 3D Generation, Data Augmentation | Unverified | 0 |
| Forget Vectors at Play: Universal Input Perturbations Driving Machine Unlearning in Image Classification | Dec 21, 2024 | image-classification, Image Classification | Code Available | 0 |
| Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing | Dec 17, 2024 | Misinformation, Model Editing | Code Available | 0 |
| Model-Editing-Based Jailbreak against Safety-aligned Large Language Models | Dec 11, 2024 | Model Editing, Safety Alignment | Unverified | 0 |