SOTAVerified

Model Editing

Papers

Showing 1-50 of 193 papers

| Title | Status | Hype |
|---|---|---|
| Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm | Code | 0 |
| DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models | | 0 |
| Mitigating Safety Fallback in Editing-based Backdoor Injection on LLMs | Code | 0 |
| SoK: Machine Unlearning for Large Language Models | | 0 |
| MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs | | 0 |
| The OCR Quest for Generalization: Learning to recognize low-resource alphabets with model editing | | 0 |
| Drop Dropout on Single-Epoch Language Model Pretraining | Code | 0 |
| On Fairness of Task Arithmetic: The Role of Task Vectors | | 0 |
| Model Unlearning via Sparse Autoencoder Subspace Guided Projections | | 0 |
| DocMEdit: Towards Document-Level Model Editing | | 0 |
| REACT: Representation Extraction And Controllable Tuning to Overcome Overfitting in LLM Knowledge Editing | | 0 |
| Disentangling Knowledge Representations for Large Language Model Editing | | 0 |
| Localizing Knowledge in Diffusion Transformers | | 0 |
| Model Editing with Graph-Based External Memory | | 0 |
| LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing | Code | 0 |
| UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models | Code | 0 |
| Editing Across Languages: A Survey of Multilingual Knowledge Editing | | 0 |
| UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models | Code | 2 |
| UniEdit: A Unified Knowledge Editing Benchmark for Large Language Models | | 0 |
| NAMET: Robust Massive Model Editing via Noise-Aware Memory Optimization | Code | 0 |
| Cross-Model Transfer of Task Vectors via Few-Shot Orthogonal Alignment | Code | 0 |
| BalancEdit: Dynamically Balancing the Generality-Locality Trade-off in Multi-modal Model Editing | | 0 |
| A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment | | 0 |
| REDEditing: Relationship-Driven Precise Backdoor Poisoning on Text-to-Image Diffusion Models | | 0 |
| When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers | | 0 |
| NAACL2025 Tutorial: Adaptation of Large Language Models | | 0 |
| Localized Definitions and Distributed Reasoning: A Proof-of-Concept Mechanistic Interpretability Study via Activation Patching | Code | 0 |
| Efficient Model Editing with Task-Localized Sparse Fine-tuning | Code | 0 |
| Leaking LoRa: An Evaluation of Password Leaks and Knowledge Storage in Large Language Models | Code | 0 |
| BiasEdit: Debiasing Stereotyped Language Models via Model Editing | Code | 1 |
| SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models | Code | 1 |
| Exploiting Edited Large Language Models as General Scientific Optimizers | | 0 |
| GeoEdit: Geometric Knowledge Editing for Large Language Models | | 0 |
| A Causal Lens for Evaluating Faithfulness Metrics | | 0 |
| CoME: An Unlearning-based Approach to Conflict-free Model Editing | Code | 0 |
| DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing | | 0 |
| The Mirage of Model Editing: Revisiting Evaluation in the Wild | Code | 1 |
| K-Edit: Language Model Editing with Contextual Knowledge Awareness | | 0 |
| Reinforced Lifelong Editing for Language Models | Code | 1 |
| Injecting Universal Jailbreak Backdoors into LLMs in Minutes | Code | 1 |
| AnyEdit: Edit Any Knowledge Encoded in Language Models | | 0 |
| Cross-Encoder Rediscovers a Semantic Variant of BM25 | | 0 |
| Position: Editing Large Language Models Poses Serious Safety Risks | | 0 |
| Efficient Model Editing with Task Vector Bases: A Theoretical Framework and Scalable Approach | Code | 0 |
| Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution | | 0 |
| Enhancing Semantic Consistency of Large Language Models through Model Editing: An Interpretability-Oriented Approach | | 0 |
| SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation | | 0 |
| Forget Vectors at Play: Universal Input Perturbations Driving Machine Unlearning in Image Classification | Code | 0 |
| Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing | Code | 0 |
| Model-Editing-Based Jailbreak against Safety-aligned Large Language Models | | 0 |

No leaderboard results yet.