
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning

This repository contains the code and data for the paper:

UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning

Vaidehi Patil, Elias Stengel-Eskin, and Mohit Bansal

Overview

Unlearning is an essential technique for ensuring that machine learning models comply with data privacy regulations (e.g., GDPR, CCPA) and ethical AI practices. However, existing unlearning methods often cause unintended forgetting and degrade model performance. UPCORE (Utility-Preserving Coreset Selection) addresses this by selectively pruning high-variance outliers from the forget set, striking a better balance between deletion effectiveness and retained model utility.

Key Features:

  • Less Collateral Damage: Removes high-variance outlier points from the forget set, reducing unintended forgetting.
  • Positive Transfer: Pruned points are still forgotten via transfer from the retained coreset, so forget performance is maintained while collateral damage is minimized.
  • Method-Agnostic: UPCORE can be applied to any unlearning framework.
  • Evaluating Unlearning: Introduces AUC-based metrics that capture the trade-off between deletion effectiveness and model utility across unlearning checkpoints.
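As a rough illustration of the selection idea only (the repository's actual implementation lives in scripts/upcore.py and computes outlier scores differently), the sketch below scores each forget-set example by its squared distance from the set centroid and keeps the least-outlying fraction. The plain feature vectors and the 90% keep ratio are assumptions made for this example:

```python
from statistics import fmean

def select_coreset(features, keep_ratio=0.9):
    """Hypothetical UPCORE-style pruning: drop high-variance outliers.

    features: list of equal-length numeric vectors, one per forget-set example.
    Returns the sorted indices of the keep_ratio fraction of examples
    closest to the centroid (i.e., the lowest-variance "core").
    """
    dim = len(features[0])
    centroid = [fmean(vec[d] for vec in features) for d in range(dim)]

    def score(vec):
        # Squared Euclidean distance from the centroid as a simple outlier score.
        return sum((x - c) ** 2 for x, c in zip(vec, centroid))

    order = sorted(range(len(features)), key=lambda i: score(features[i]))
    n_keep = max(1, round(keep_ratio * len(features)))
    return sorted(order[:n_keep])
```

With ten examples and keep_ratio=0.9, the single most outlying example is dropped; the pruned indices define the core forget set that unlearning is then run on.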


UPCORE is built on top of TOFU and NPO. It provides methods for effective unlearning in large language models while balancing deletion effectiveness and model utility.


📁 Table of Contents

  • 🛠 Installation
  • 📦 Dependencies
  • 📊 Datasets
  • 🚀 Usage
  • 📈 Results
  • 📜 License
  • 📖 Citation


🛠 Installation

To set up UPCORE, follow these steps:

```shell
git clone https://github.com/Vaidehi99/UPCORE.git
cd UPCORE
conda create -n upcore python=3.10
conda activate upcore
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
```

📦 Dependencies

Ensure that all required dependencies are installed using:

```shell
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
```

📊 Datasets

UPCORE supports unlearning on the following datasets:

  • Counterfact Topics:
    data/Counterfact_topics/
  • TriviaQA Topics:
    data/TriviaQA_topics/

🚀 Usage

Extracting the Core Forget Set

To extract the core forget set from a given complete forget set, run:

```shell
python3 scripts/upcore.py
```

Gradient Ascent & Refusal with Counterfact Topics

Training

To unlearn using Gradient Ascent, set `forget_loss = switch_unlearn` in `tofu/bash_config.conf`.

To unlearn using Refusal, set `forget_loss = switch_idk` in `tofu/bash_config.conf`.
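The two modes differ only in that one key. For reference, an illustrative excerpt of the relevant lines in tofu/bash_config.conf (the surrounding keys and exact syntax depend on the shipped file):

```shell
# tofu/bash_config.conf (illustrative excerpt)
forget_loss=switch_unlearn   # Gradient Ascent
# forget_loss=switch_idk     # Refusal: uncomment to unlearn via refusal responses
```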

Run training:

```shell
cd tofu
./loop_run_split_train.sh
```

Evaluation

```shell
cd tofu
./loop_run_split_eval.sh
```

Compute AUC

```shell
python3 scripts/gather_results.py
python3 scripts/auc.py
```
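The repository's auc.py implements the paper's AUC metric. As a conceptual stand-in only (not the actual implementation), the area under a utility-versus-deletion curve traced across unlearning checkpoints can be computed with the trapezoidal rule; the function name and inputs below are assumptions for illustration:

```python
def trade_off_auc(deletion, utility):
    """Area under a piecewise-linear utility-vs-deletion curve.

    deletion: deletion-effectiveness scores at successive unlearning
              checkpoints (any order; points are sorted internally).
    utility:  model-utility scores measured at the same checkpoints.
    A larger area means more utility is retained as deletion progresses.
    """
    points = sorted(zip(deletion, utility))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between neighbors
    return area
```

A method whose utility stays at 1.0 while deletion effectiveness goes from 0 to 1 scores an AUC of 1.0; a method that sacrifices utility traces a lower curve and a smaller area.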

Negative Preference Optimization (NPO) with Counterfact Topics

Training

To train using NPO, run:

```shell
cd negative-preference-optimization/TOFU
./loop_run_split_train_npo.sh
```

Evaluation

```shell
cd negative-preference-optimization/TOFU
./loop_run_split_eval_npo.sh
```

Compute AUC

```shell
python3 scripts/gather_results.py
python3 scripts/auc.py
```

Gradient Ascent with TriviaQA Topics

Training

```shell
cd tofu
./loop_run_split_train_longform.sh
```

Evaluation

```shell
cd tofu
./loop_run_split_eval_longform.sh
```

Compute AUC

```shell
python3 scripts/gather_results.py
python3 scripts/auc.py
```

📈 Results

Results can be obtained using:

```shell
python3 scripts/gather_results.py
python3 scripts/auc.py
```

📜 License

This project is licensed under the MIT License.


📖 Citation

If you find this work useful, please cite:

@misc{patil2025upcoreutilitypreservingcoresetselection,
      title={UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning}, 
      author={Vaidehi Patil and Elias Stengel-Eskin and Mohit Bansal},
      year={2025},
      eprint={2502.15082},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2502.15082}, 
}
