# UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning

This repository contains the code and data for the paper [UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning](https://arxiv.org/abs/2502.15082) by Vaidehi Patil, Elias Stengel-Eskin, and Mohit Bansal.
Unlearning is an essential technique for ensuring that machine learning models adhere to data privacy regulations (e.g., GDPR, CCPA) and ethical AI practices. However, existing unlearning methods often cause unintended forgetting and degrade model performance. UPCORE (Utility-Preserving Coreset Selection) addresses this by selectively pruning high-variance outliers from the forget set, striking a better balance between deletion effectiveness and model utility.
- Less Collateral Damage: Reduces unintended forgetting by removing outlier points from the forget set.
- Positive Transfer: Points pruned from the forget set are still forgotten via transfer from the retained coreset, so collateral damage shrinks without sacrificing forget performance.
- Method-Agnostic: UPCORE can be applied to any unlearning framework.
- Evaluating Unlearning: Introduces AUC-based metrics that measure the trade-off between deletion effectiveness and model utility across unlearning checkpoints (see the sketch under Results below).
UPCORE is built on top of the TOFU and NPO (Negative Preference Optimization) codebases. It provides methods for effective unlearning in large language models while balancing deletion effectiveness and model utility.
## Setup

To set up UPCORE, follow these steps:

```bash
git clone https://github.com/Vaidehi99/UPCORE.git
cd UPCORE
conda create -n upcore python=3.10
conda activate upcore
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
```
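As a quick sanity check of the environment (this snippet is a convenience check, not part of the repository):

```python
# Verify that PyTorch sees a GPU and that flash-attn built correctly.
import torch
import flash_attn  # raises ImportError if the build failed

assert torch.cuda.is_available(), "the training scripts expect a CUDA GPU"
print(f"torch {torch.__version__}, flash-attn {flash_attn.__version__}, "
      f"device: {torch.cuda.get_device_name(0)}")
```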
## Data

UPCORE supports unlearning on the following datasets:

- Counterfact Topics: `data/Counterfact_topics/`
- TriviaQA Topics: `data/TriviaQA_topics/`
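To peek at a topic directory, something like the following works; the JSON layout here is an assumption, so check the files themselves for the actual schema:

```python
# Hypothetical inspection snippet: assumes the topic files are JSON; the
# actual schema is whatever ships under data/Counterfact_topics/.
import json
from pathlib import Path

for path in sorted(Path("data/Counterfact_topics").glob("*.json"))[:3]:
    with open(path) as f:
        records = json.load(f)
    print(f"{path.name}: {len(records)} records")
```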
## Coreset Selection

To extract the core forget set from a given complete forget set, run:

```bash
python3 scripts/upcore.py
```
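Conceptually, UPCORE prunes high-variance outliers from the forget set before unlearning; the actual selection lives in `scripts/upcore.py`. The sketch below only illustrates this style of outlier pruning with scikit-learn's `IsolationForest`; the `hidden_states` input (per-example model representations) is an assumption, not an interface of this repository:

```python
# Illustrative sketch of outlier pruning, not the repository's implementation.
import numpy as np
from sklearn.ensemble import IsolationForest

def select_coreset(hidden_states: np.ndarray, contamination: float = 0.1) -> np.ndarray:
    """Return indices of forget-set examples kept after pruning outliers.

    hidden_states: (n_examples, hidden_dim) array of model representations
    of the forget-set examples (assumed to be precomputed).
    """
    detector = IsolationForest(contamination=contamination, random_state=0)
    labels = detector.fit_predict(hidden_states)  # +1 = inlier, -1 = outlier
    # Dropping outliers lowers the variance of the retained set, which UPCORE
    # links to less collateral damage on neighboring data.
    return np.where(labels == 1)[0]
```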
## Unlearning

To unlearn using Gradient Ascent, set `forget_loss = switch_unlearn` in `tofu/bash_config.conf`.

To unlearn using Refusal, set `forget_loss = switch_idk` in `tofu/bash_config.conf`.
Run training:

```bash
cd tofu
./loop_run_split_train.sh
```
Run evaluation:

```bash
cd tofu
./loop_run_split_eval.sh
```

Then aggregate results and compute the AUC metrics:

```bash
python3 scripts/gather_results.py
python3 scripts/auc.py
```
To train using NPO, run:

```bash
cd negative-preference-optimization/TOFU
./loop_run_split_train_npo.sh
```

Run evaluation:

```bash
cd negative-preference-optimization/TOFU
./loop_run_split_eval_npo.sh
```

Then aggregate results and compute the AUC metrics:

```bash
python3 scripts/gather_results.py
python3 scripts/auc.py
```
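For reference, NPO is Negative Preference Optimization (Zhang et al., 2024). Below is a minimal sketch of its forget loss, assuming per-sequence log-probabilities under the current and frozen reference models are already computed (the function and tensor names are assumptions, not this repository's API):

```python
# Sketch of the NPO forget loss (Zhang et al., 2024), not this repo's code.
import torch
import torch.nn.functional as F

def npo_forget_loss(logp_theta: torch.Tensor,
                    logp_ref: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    """NPO loss: (2 / beta) * E[log(1 + (pi_theta / pi_ref) ** beta)].

    logp_theta, logp_ref: per-sequence log-probabilities of forget-set
    completions under the current model and the frozen reference model.
    """
    # log(1 + exp(z)) == softplus(z), with z = beta * (logp_theta - logp_ref)
    return (2.0 / beta) * F.softplus(beta * (logp_theta - logp_ref)).mean()
```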
For the long-form setting, run training:

```bash
cd tofu
./loop_run_split_train_longform.sh
```

Run evaluation:

```bash
cd tofu
./loop_run_split_eval_longform.sh
```

Then aggregate results and compute the AUC metrics:

```bash
python3 scripts/gather_results.py
python3 scripts/auc.py
```
## Results

Results can be obtained using:

```bash
python3 scripts/gather_results.py
python3 scripts/auc.py
```
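`scripts/auc.py` computes the paper's AUC-based trade-off metrics. As a rough illustration of the idea only (the numbers and variable names below are placeholders, not the script's interface), the area under a utility-versus-deletion curve traced across unlearning checkpoints can be computed with the trapezoidal rule:

```python
# Illustrative AUC computation over unlearning checkpoints; all values are
# placeholders, not outputs of scripts/auc.py.
import numpy as np

# One point per checkpoint: x = deletion effectiveness on the forget set,
# y = model utility on held-out data (both hypothetical numbers).
deletion = np.array([0.0, 0.3, 0.6, 0.8, 0.95])
utility = np.array([1.0, 0.97, 0.93, 0.90, 0.85])

# Trapezoidal rule over the curve, sorted by the x-axis: a larger area means
# utility is retained as deletion effectiveness grows.
order = np.argsort(deletion)
x, y = deletion[order], utility[order]
auc = float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))
print(f"trade-off AUC: {auc:.3f}")
```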
## License

This project is licensed under the MIT License.
## Citation

If you find this work useful, please cite:

```bibtex
@misc{patil2025upcoreutilitypreservingcoresetselection,
      title={UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning},
      author={Vaidehi Patil and Elias Stengel-Eskin and Mohit Bansal},
      year={2025},
      eprint={2502.15082},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2502.15082},
}
```