µKE: Matryoshka Unstructured Knowledge Editing of Large Language Models

COLM 2025
¹Purdue University  ²Johns Hopkins University
* Co-first and corresponding authors

Abstract

Large language models (LLMs) have emerged as powerful knowledge bases yet are limited by static training data, leading to issues such as hallucinations and safety risks. Editing a model's internal knowledge through the locate-and-edit paradigm has proven a cost-effective alternative to retraining, though current unstructured approaches—especially window-based autoregressive methods—often disrupt the causal dependency between early memory updates and later output tokens.

In this work, we first theoretically analyze these limitations and then introduce Matryoshka Unstructured Knowledge Editing (µKE), a novel memory update mechanism that preserves such dependencies via a Matryoshka-style objective and adaptive loss coefficients. Empirical evaluations on two models across five benchmarks demonstrate that µKE improves edit efficacy by up to 12.33% over state-of-the-art methods, and remains robust when applied to diverse formatted edits, underscoring its potential for effective unstructured knowledge editing in LLMs.

Introduction

Figure 1: Comparison between different unstructured editing paradigms. (a) One-for-All updates one memory for the entire edit target. (b) Window-by-Window splits the target into windows but overlooks memory dependencies. (c) Our Matryoshka approach maintains proper dependency while benefiting from multiple memory shifts.

Large language models increasingly power diverse applications but are limited by their static training data. Model editing has emerged as a cost-effective way to update a model's internal knowledge without extensive retraining. While early methods focused on structured knowledge triplets, recent approaches such as AnyEdit extend editing to unstructured text through window-based strategies.

However, these window-based autoregressive methods disrupt causal dependencies between early memory updates and later output tokens. µKE addresses this limitation through a novel Matryoshka-style memory update mechanism that preserves these critical dependencies.

Method

Key Innovation: Matryoshka-Style Memory Update

µKE introduces a Matryoshka-style objective that lets gradients flow from all later tokens back to earlier working memories during optimization. This approach treats each memory update as potentially contributing to all subsequent target tokens, maintaining proper causality.

Unlike window-based methods that optimize each memory shift independently, our Matryoshka objective ensures that early memory updates account for their influence on all downstream tokens. This preserves the natural causal structure of autoregressive generation while still benefiting from multiple localized memory updates.
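
To make the nesting concrete, below is a minimal PyTorch-style sketch of such an objective. All names (`model_nll`, `deltas`, `windows`, `coeffs`) are illustrative placeholders rather than the paper's implementation:

```python
import torch

def matryoshka_loss(model_nll, deltas, windows, coeffs):
    """Matryoshka-style objective (illustrative sketch, not the paper's code).

    deltas[i]   : trainable memory shift introduced for target window i
    windows[j]  : target tokens of window j
    coeffs[i][j]: adaptive weight of window j's NLL inside delta_i's term
    model_nll(j, active_deltas): scalar NLL of window j given the prompt,
        all earlier target windows, and the memory shifts applied so far
    """
    total = torch.zeros(())
    for i in range(len(deltas)):
        # Unlike window-by-window editing, delta_i is optimized against
        # *every* window from i onward, so gradients from later tokens
        # flow back into earlier working memories.
        for j in range(i, len(windows)):
            total = total + coeffs[i][j] * model_nll(j, deltas[: i + 1])
    return total
```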

Figure 2: Matryoshka-style working memory update. For each δᵢ, the objective is a weighted sum of the negative log-likelihoods of all target segments from window i onward, each conditioned on its preceding context.

Figure 3: Affinities between target segments. The heatmap shows the gradient affinity between different target segments, which informs the adaptive coefficients for balanced optimization.

Adaptive Coefficients

We introduce adaptive coefficients informed by gradient affinity between target segments. This balances the contribution of intermediate terms and mitigates variance across different target lengths. The coefficients are dynamically computed based on the alignment between memory updates and their corresponding target segments, ensuring stable optimization across diverse editing scenarios.
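
As one concrete way such coefficients could be computed, the sketch below measures gradient affinity via cosine similarity against the segment a memory directly serves and normalizes the result with a softmax. This is a simple instantiation under our own assumptions; the paper's exact scheme may differ:

```python
import torch
import torch.nn.functional as F

def affinity_coeffs(delta, segment_nlls):
    """Derive weights for one memory shift from gradient affinity (sketch).

    delta        : trainable memory shift (a tensor with requires_grad=True)
    segment_nlls : per-segment scalar NLL losses; segment 0 is the window
                   this memory is directly responsible for
    """
    grads = [
        torch.autograd.grad(nll, delta, retain_graph=True)[0].flatten()
        for nll in segment_nlls
    ]
    anchor = grads[0]
    # Affinity = cosine similarity between each segment's gradient and the
    # anchor segment's gradient with respect to this memory shift.
    affinity = torch.stack(
        [F.cosine_similarity(anchor, g, dim=0) for g in grads]
    )
    # One simple normalization choice; not necessarily the paper's.
    return torch.softmax(affinity, dim=0)
```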

Experimental Results

Experimental Setup

We comprehensively evaluate µKE on long-form knowledge editing across multiple dimensions:

Models Tested

  • Llama3-8B-Instruct - Meta's instruction-tuned model with 8B parameters
  • Qwen2.5-7B-Instruct - Alibaba's instruction-tuned model with 7B parameters

Benchmarks

  • UnKEBench - Unstructured knowledge editing with long-form answers
  • AKEW-CounterFact - Counterfactual editing for fact correction
  • AKEW-MQuAKE - Multi-hop question answering edits
  • EditEverything - Diverse domains (math, code, poetry, news, chemistry)
  • SelfCheckGPT - Hallucination reduction evaluation

Evaluation Metrics

  • BLEU - N-gram precision for lexical similarity
  • BERTScore - Semantic similarity using BERT embeddings
  • ROUGE-L - Longest common subsequence similarity
  • Locality - Preservation of general capabilities (MMLU, IFEval)
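
The three text-similarity metrics above can be computed with standard packages; a minimal scoring sketch, assuming the `sacrebleu`, `bert-score`, and `rouge-score` libraries:

```python
import sacrebleu
from bert_score import score as bert_score
from rouge_score import rouge_scorer

def edit_scores(prediction: str, reference: str) -> dict:
    """Score one post-edit generation against its target answer."""
    bleu = sacrebleu.sentence_bleu(prediction, [reference]).score
    _, _, f1 = bert_score([prediction], [reference], lang="en")
    rouge_l = rouge_scorer.RougeScorer(["rougeL"]).score(
        reference, prediction
    )["rougeL"].fmeasure
    return {"BLEU": bleu, "BERTScore": f1.item(), "ROUGE-L": rouge_l}
```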

Method Variants

  • µKE - Based on MEMIT (edits MLP layers 4-8)
  • µKE* - Based on UnKE (edits the full transformer layer 7)

Hyperparameters

  • Window size: 30 tokens (Llama3), 20 tokens (Qwen2.5)
  • Optimization: 25 steps with the Adam optimizer (a toy driver loop is sketched below)
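
For concreteness, here is a toy driver loop matching the setup above (25 Adam steps over the trainable memory shifts), reusing `matryoshka_loss` from the earlier sketch; the dimensions, learning rate, and `toy_nll` stand-in are all assumptions:

```python
import torch

# Toy stand-ins so the loop runs end to end; in real editing, toy_nll is
# replaced by the model's windowed NLL under the applied memory shifts.
hidden_dim, n_windows = 4096, 4
targets = [torch.randn(hidden_dim) for _ in range(n_windows)]

def toy_nll(j, active_deltas):
    return (sum(active_deltas) - targets[j]).pow(2).mean()

deltas = [torch.zeros(hidden_dim, requires_grad=True) for _ in range(n_windows)]
coeffs = [[1.0] * n_windows for _ in range(n_windows)]
opt = torch.optim.Adam(deltas, lr=0.5)  # learning rate is an assumption

for step in range(25):  # 25 optimization steps, as listed above
    opt.zero_grad()
    loss = matryoshka_loss(toy_nll, deltas, targets, coeffs)
    loss.backward()
    opt.step()
```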

UnKEBench Performance

  • 99.81% BLEU score (µKE*) on Llama3-8B-Instruct
  • 99.97% BERTScore - near-perfect semantic match
  • +12.33% absolute improvement over AnyEdit

Evaluated on 100+ long-form QA pairs with answers averaging 150+ tokens.

AKEW-CounterFact

  • 99.96% BLEU (µKE*) on Llama3, original questions
  • 43.60% BLEU on paraphrased questions (+3.35% vs AnyEdit*)
  • 99.99% BERTScore - near-perfect semantics

Counterfactual editing that maintains consistency across fact updates.

Cross-Domain Robustness

  • EditEverything: outperforms baselines across all 5 domains
  • SelfCheckGPT: 86.77% BLEU, indicating reduced hallucination
  • Locality: MMLU & IFEval performance preserved

Consistent improvements across math, code, poetry, news, and chemistry.

EditEverything Results

Figure 4: Performance on EditEverything benchmark. µKE demonstrates superior performance across diverse domains including mathematics, poetry, news, programming, and chemistry, showing consistent improvements over baseline methods.

Key Findings

  • µKE improves edit efficacy by up to 12.33% in BLEU score over state-of-the-art AnyEdit
  • Achieves near-perfect scores (>99.9%) on multiple benchmarks with the UnKE-based variant (µKE*)
  • Maintains robust performance across diverse editing formats including math, code, poetry, news, and chemistry
  • Preserves the model's general capabilities on the MMLU and IFEval benchmarks
  • Shows superior generalization to paraphrased questions compared to baseline methods
  • Performance remains stable across different target lengths, unlike window-based methods

Performance Analysis

Window Size Analysis

Figure 5: µKE performance with various window sizes. Performance is relatively insensitive to window size, with variations typically within 3% across all metrics. The optimal window size varies by dataset (best at 80 tokens for UnKEBench, 60 for MQuAKE).

Batch Editing Performance

Figure 6: Batch editing performance. µKE consistently outperforms AnyEdit for different batch sizes, highlighting the significance of maintaining memory dependency for unstructured batch editing across UnKEBench, CounterFact, and MQuAKE benchmarks.

Citation

If you find our work useful, please cite:

@inproceedings{su2025muke,
    title={µKE: Matryoshka Unstructured Knowledge Editing of Large Language Models},
    author={Su, Zian and Huang, Ziyang and Zhang, Kaiyuan and Zhang, Xiangyu},
    booktitle={Conference on Language Modeling (COLM)},
    year={2025}
}