Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Existing methods for adapting large language models (LLMs) to new tasks are not suited to multi-task adaptation because they modify all the model weights, causing destructive interference between tasks. The resulting effects, such as catastrophic forgetting of earlier tasks, make it challenging to obtain good performance on multiple tasks at the same time. To mitigate this, we propose Lottery Ticket Adaptation (LoTA), a sparse adaptation method that identifies and optimizes only a sparse subnetwork of the model. We evaluate LoTA on a wide range of challenging tasks such as instruction following, reasoning, math, and summarization. LoTA obtains better performance than full fine-tuning and low-rank adaptation (LoRA), and maintains good performance even after training on other tasks, thus avoiding catastrophic forgetting. By extracting and fine-tuning over lottery tickets (or sparse task vectors), LoTA also enables model merging over highly dissimilar tasks.
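To make the mechanism concrete, below is a minimal PyTorch sketch of the sparse-subnetwork idea: a mask (the lottery ticket) is extracted from the largest-magnitude entries of a task vector, and gradients outside the mask are zeroed so that only that subnetwork is updated. The helper names (`top_k_mask`, `extract_lottery_ticket`, `apply_sparse_gradients`) and the `density` parameter are illustrative conventions, not an exact implementation of LoTA.

```python
import torch

def top_k_mask(delta: torch.Tensor, density: float) -> torch.Tensor:
    """Keep the top-`density` fraction of entries of `delta` by magnitude."""
    k = max(1, int(density * delta.numel()))
    flat = delta.abs().flatten()
    # Threshold at the (n - k + 1)-th smallest value, i.e., the k-th largest.
    threshold = flat.kthvalue(flat.numel() - k + 1).values
    return (delta.abs() >= threshold).to(delta.dtype)

def extract_lottery_ticket(base_params, adapted_params, density=0.01):
    """Build per-parameter masks from the task vector (adapted - base)."""
    return {name: top_k_mask(adapted_params[name] - base_params[name], density)
            for name in base_params}

def apply_sparse_gradients(model, masks):
    """Zero gradients outside the ticket so only the subnetwork is optimized."""
    for name, p in model.named_parameters():
        if p.grad is not None and name in masks:
            p.grad.mul_(masks[name])
```

During sparse fine-tuning, `apply_sparse_gradients` would be called after `loss.backward()` and before `optimizer.step()`; with no weight decay on the frozen entries, parameters outside the ticket remain at their original values.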
As illustrated in the first row of Figure 1, an emerging paradigm for multi-task adaptation is to store an adapter for each desired task and load the appropriate adapter at inference time [21, 7, 22], depending on the task at hand [23]. Notably, Apple stores and loads adapters to enable its on-device models [24]. While this approach avoids any interference between tasks, it increases memory and compute costs, as it requires storing and loading an additional adapter per task.
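As a concrete illustration of this paradigm, the snippet below registers per-task LoRA adapters with the Hugging Face `peft` library and switches between them at inference time. The adapter paths and names (`adapters/summarization`, `adapters/math`) are hypothetical placeholders; the `peft` calls themselves (`PeftModel.from_pretrained`, `load_adapter`, `set_adapter`) are real API.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# One shared base model; per-task adapters are stored separately on disk.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Attach the first adapter, then register additional ones by name.
model = PeftModel.from_pretrained(base, "adapters/summarization",
                                  adapter_name="summarization")
model.load_adapter("adapters/math", adapter_name="math")

# At inference, activate the adapter matching the incoming request.
model.set_adapter("math")
```

Each additional task adds an adapter that must be stored and swapped in, which is precisely the memory and serving overhead noted above.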
One common approach is to fine-tune the model on different tasks sequentially [35], e.g., first fine-tuning on task A, then fine-tuning on task B.
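In code, sequential adaptation is simply two full fine-tuning passes over the same parameters. A minimal PyTorch sketch follows, assuming a Hugging Face-style model whose forward pass returns `.loss`, and hypothetical `task_a_loader`/`task_b_loader` data loaders:

```python
import torch

def fine_tune(model, loader, epochs=1, lr=1e-5):
    """Full-parameter fine-tuning on a single task."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            loss = model(**batch).loss  # assumes labels are in the batch
            loss.backward()
            opt.step()
            opt.zero_grad()
    return model

# Task B updates the very same weights task A just shaped,
# which is where destructive interference enters.
model = fine_tune(model, task_a_loader)
model = fine_tune(model, task_b_loader)
```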
Fine-tuning the LLM for new tasks with full fine-tuning (FFT) or existing parameter-efficient fine-tuning (PEFT) methods leads to catastrophic forgetting of earlier tasks. This is especially problematic for safety alignment: an LLM can be fine-tuned to be safe, only to have that safety erased by subsequent fine-tuning on new tasks [37].
Can model developers allow users to fine-tune their aligned models on custom datasets while retaining safety?