Abstract: We introduce PanDerm, a multimodal dermatology foundation model addressing the challenge that current deep learning models excel only at specific tasks rather than meeting the complex, multimodal requirements of clinical dermatology practice. Pretrained through self-supervised learning on over 2 million skin disease images across four imaging modalities from multiple institutions, PanDerm demonstrates state-of-the-art performance across diverse tasks, including skin cancer screening, differential diagnosis, lesion segmentation, longitudinal monitoring, and prognosis prediction, often while requiring less labeled data than existing approaches. Clinical reader studies show that PanDerm outperforms clinicians in early melanoma detection, improves dermatologists' skin cancer diagnostic accuracy, and enhances non-specialists' differential diagnosis capabilities across numerous skin conditions.
- 29/04/2025: The ViT-base version of PanDerm (PanDerm_base) is now available, providing a smaller model for more widespread usage scenarios.
- 26/04/2025: Released the finetuning script for image classification.
What is PanDerm? PanDerm is a vision-centric multimodal foundation model pretrained on 2 million dermatological images. It provides specialized representations across four dermatological imaging modalities (dermoscopy, clinical images, total body photography (TBP), and dermatopathology), delivering superior performance in skin cancer diagnosis, differential diagnosis of hundreds of skin conditions, disease progression monitoring, TBP-based applications, and image segmentation.
Why use PanDerm? PanDerm significantly outperforms clinically popular CNN models like ResNet, especially with limited labeled data. Its strong linear probing results offer a computationally efficient alternative with lower implementation barriers. PanDerm also demonstrates superior performance compared to existing foundation models while minimizing data leakage risk—a common concern with web-scale pretrained models like DINOv2, SwavDerm, and Derm Foundation. These combined advantages make PanDerm the ideal choice for replacing both traditional CNNs and other foundation models in clinical applications, including human-AI collaboration, multimodal image analysis, and various diagnostic and progression tasks.
Note: PanDerm is a general-purpose dermatology foundation model and requires fine-tuning or linear probing before application to specific tasks.
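For context, linear probing keeps the pretrained encoder frozen and trains only a lightweight classifier on its features, which is what makes it a computationally cheap alternative to full fine-tuning. The snippet below is a minimal conceptual sketch of that idea, not the repository's linear_eval.py; the load_panderm_encoder helper and the data loaders are hypothetical placeholders.

# Conceptual sketch of linear probing with a frozen encoder (hypothetical helpers).
# In practice, use classification/linear_eval.py from this repository.
import torch
from sklearn.linear_model import LogisticRegression

@torch.no_grad()
def extract_features(encoder, loader, device="cuda"):
    # Run the frozen encoder over a dataloader and collect features and labels.
    encoder.eval().to(device)
    feats, labels = [], []
    for images, targets in loader:
        feats.append(encoder(images.to(device)).cpu())
        labels.append(targets)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

# encoder = load_panderm_encoder("panderm_ll_data6_checkpoint-499.pth")  # hypothetical loader
# X_train, y_train = extract_features(encoder, train_loader)
# X_test, y_test = extract_features(encoder, test_loader)
# clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# print("Linear-probe accuracy:", clf.score(X_test, y_test))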
First, clone the repo and cd into the directory:
git clone https://github.com/SiyuanYan1/PanDerm
cd PanDerm
Then create a conda env and install the dependencies:
conda create -n PanDerm python=3.10 -y
conda activate PanDerm
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu118
cd classification
pip install -r requirements.txt
| Model Name | Release Date | Model Architecture | Google Drive Link |
|---|---|---|---|
| PanDerm_Base | 04/2025 | ViT-B/16 | Link |
| PanDerm (proposed in our paper) | 10/2024 | ViT-L/16 | Link |
Using Your Own Dataset
If you wish to use our model with your own dataset, the dataset used for linear probing or finetuning should be organized in a CSV file with the following structure:
Required Columns
- `image`: Path to the image file (e.g., ISIC_0034524.jpg)
- `split`: Dataset partition indicator (train, val, or test)
- For multi-class classification, `label`: Numerical class label (e.g., 0, 1, 2, 3, 4)
- For binary classification, `binary_label`: Binary class label (e.g., 0, 1)
Multi-class classification example:
image,label,split
ISIC_0034524.jpg,1,train
ISIC_0034525.jpg,1,train
ISIC_0034526.jpg,4,val
ISIC_0034527.jpg,3,test
Binary classification example:
image,binary_label,split
ISIC_0034524.jpg,1,train
ISIC_0034525.jpg,1,train
ISIC_0034526.jpg,0,val
ISIC_0034527.jpg,0,test
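If your labels are stored elsewhere (for example, in a Python dict keyed by filename), a few lines of pandas can emit a CSV in this format. This is a minimal sketch with hypothetical filenames and labels, not a script from the repository:

# Minimal sketch: build the required CSV from a hypothetical label mapping.
import pandas as pd

labels = {"ISIC_0034524.jpg": 1, "ISIC_0034525.jpg": 1,
          "ISIC_0034526.jpg": 4, "ISIC_0034527.jpg": 3}   # hypothetical class labels
splits = {"ISIC_0034524.jpg": "train", "ISIC_0034525.jpg": "train",
          "ISIC_0034526.jpg": "val", "ISIC_0034527.jpg": "test"}

df = pd.DataFrame({"image": list(labels),
                   "label": [labels[f] for f in labels],
                   "split": [splits[f] for f in labels]})
df.to_csv("my_dataset.csv", index=False)   # pass this path via --csv_path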
Using Pre-processed Public Datasets
We've already pre-processed several public datasets to reproduce the results in our study and prevent data leakage between splits. These datasets are ready to use with our model and require no additional formatting.
| Dataset | Processed Data | Original Data |
|---|---|---|
| HAM10000 | Download | Official Website |
| BCN20000 | Download | Official Website |
| DDI | Download | Official Website |
| Derm7pt | Download | Official Website |
| Dermnet | Download | Official Website |
| HIBA | Download | Official Website |
| MSKCC | Download | Official Website |
| PAD-UFES | Download | Official Website |
| PATCH16 | Download | Official Website |
Note: The processed datasets provided here may differ slightly from those on the official websites. To ensure reproducibility of our paper's results, please use the processed data links above.
Below is an example of training and evaluation using the PAD-UFES dataset. Replace the CSV path and root path with those of your own dataset.
- `batch_size`: Adjust based on the memory size of your GPU.
- `model`: Model size, either "PanDerm_Large_LP" (original paper model) or "PanDerm_Base_LP" (smaller version).
- `nb_classes`: Set this to the number of classes in your evaluation dataset.
- `percent_data`: Controls the percentage of training data used. For example, 0.1 means evaluating models with 10% of the training data. Modify this if you want to conduct label-efficiency generalization experiments.
- `csv_path`: Path to your dataset CSV, organized as described in the "Data Preparation" section.
- `root_path`: Path to the folder containing your saved images.
- `pretrained_checkpoint`: Path to the pretrained checkpoint, "panderm_ll_data6_checkpoint-499.pth" for "PanDerm_Large_LP" and "panderm_bb_data6_checkpoint-499.pth" for "PanDerm_Base_LP".
cd classification
CUDA_VISIBLE_DEVICES=1 python3 linear_eval.py \
--batch_size 1000 \
--model "PanDerm_Large_LP" \
--nb_classes 6 \
--percent_data 1.0 \
--csv_filename "PanDerm_Large_LP_result.csv" \
--output_dir "/path/to/your/PanDerm/output_dir/PanDerm_res/" \
--csv_path "/path/to/your/PanDerm/Evaluation_datasets/pad-ufes/2000.csv" \
--root_path "/path/to/your/PanDerm/Evaluation_datasets/pad-ufes/images/" \
--pretrained_checkpoint "/path/to/your/PanDerm/pretrain_weight/panderm_ll_data6_checkpoint-499.pth"
For additional evaluation datasets, please refer to the bash scripts for detailed usage. We provide scripts to evaluate on 9 public datasets; you can choose either model from the available options.
To run the evaluations:
cd classification
bash script/lp_reproduce.sh
- `model`: Model size, either "PanDerm_Large_FT" (original paper model) or "PanDerm_Base_FT" (smaller version).
- `pretrained_checkpoint`: Path to the pretrained checkpoint, "panderm_ll_data6_checkpoint-499.pth" for "PanDerm_Large_FT" and "panderm_bb_data6_checkpoint-499.pth" for "PanDerm_Base_FT".
- `nb_classes`: Set this to the number of classes in your evaluation dataset.
- `weights`: Enable the weighted random sampler for class-imbalanced datasets (see the sampler sketch after the recommended settings below).
- `monitor`: Select the best checkpoint based on "acc" or "recall".
- `csv_path`: Path to your dataset CSV, organized as described in the "Data Preparation" section.
- `root_path`: Path to the folder containing your saved images.
- `TTA`: Enable Test-Time Augmentation. You can modify the augmentation settings in the `TTAHandler` class in classification/furnace/engine_for_finetuning.py; see the conceptual sketch after this list.
- `--eval`: Run model inference only.
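For readers unfamiliar with Test-Time Augmentation, the idea is to average the model's predictions over several augmented views of each test image. The snippet below is a conceptual sketch only, not the repository's TTAHandler implementation, and the chosen augmentations (horizontal and vertical flips) are assumptions:

# Conceptual sketch of Test-Time Augmentation: average predictions over augmented views.
import torch

@torch.no_grad()
def tta_predict(model, images):
    model.eval()
    views = [images, torch.flip(images, dims=[-1]), torch.flip(images, dims=[-2])]  # assumed flips
    logits = torch.stack([model(v) for v in views])     # (n_views, batch, n_classes)
    return logits.softmax(dim=-1).mean(dim=0)           # average class probabilities across views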
Our experiments show the following hyperparameters deliver optimal performance across various evaluation datasets:
- Batch size: 128
- Learning rate: 5e-4
- Training epochs: 50
- Enable the weighted random sampler
- Enable TTA during testing
We observed that the hyperparameter setting is robust across datasets and typically doesn't require adjustment.
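The weighted random sampler mentioned above oversamples minority classes so that each batch sees a more balanced label distribution. Below is a minimal PyTorch sketch of the idea, not the repository's exact data pipeline; the label array and dataset are hypothetical:

# Minimal sketch of class-balanced sampling with PyTorch's WeightedRandomSampler.
import numpy as np
import torch
from torch.utils.data import WeightedRandomSampler, DataLoader

train_labels = np.array([0, 0, 0, 0, 1, 2])              # hypothetical, imbalanced label array
class_counts = np.bincount(train_labels)
sample_weights = (1.0 / class_counts)[train_labels]       # rarer classes get larger weights
sampler = WeightedRandomSampler(torch.as_tensor(sample_weights, dtype=torch.double),
                                num_samples=len(train_labels), replacement=True)
# loader = DataLoader(train_dataset, batch_size=128, sampler=sampler)  # train_dataset is hypothetical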
You can fine-tune PanDerm on your own dataset. Here is a command-line example for fine-tuning PanDerm_Large on the PAD-UFES dataset; the best epoch's fine-tuned checkpoint and its results on the test set will be saved in the directory specified by --output_dir:
MODEL_NAME="PanDerm_Large_FT"
MODEL_PATH="/path/to/your/PanDerm/pretrain_weight/panderm_ll_data6_checkpoint-499.pth"
seed=0
CUDA_VISIBLE_DEVICES=0 python3 run_class_finetuning.py \
--model $MODEL_NAME \
--pretrained_checkpoint $MODEL_PATH \
--nb_classes 6 \
--batch_size 128 \
--lr 5e-4 \
--update_freq 1 \
--warmup_epochs 10 \
--epochs 50 --layer_decay 0.65 --drop_path 0.2 \
--weight_decay 0.05 --mixup 0.8 --cutmix 1.0 \
--weights \
--sin_pos_emb \
--no_auto_resume \
--exp_name "pad finetune and eval" \
--imagenet_default_mean_and_std \
--wandb_name "Reproduce_PAD_FT_${seed}" \
--output_dir /path/to/your/PanDerm/Evaluation_datasets/PAD_Res/ \
--csv_path "/path/to/your/PanDerm/Evaluation_datasets/pad-ufes/2000.csv" \
--root_path "/path/to/your/PanDerm/Evaluation_datasets/pad-ufes/images/" \
--seed $seed
The script for fine-tuning and evaluating PanDerm:
cd classification
bash script/finetune_train.sh
Note: Remember to adjust the `pretrained_checkpoint` argument to point to the storage location of your pretrained model weights.
cd classification
bash script/finetune_test.sh
Note: Remember to adjust the `resume` argument to point to the storage location of your fine-tuned model weights.
Please refer to details here.
The model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial academic research purposes with proper attribution.
This code is built on CAEv2, UNI, and MAE. We thank the authors for sharing their code.
@misc{yan2025multimodalvisionfoundationmodel,
title={A Multimodal Vision Foundation Model for Clinical Dermatology},
author={Siyuan Yan and Zhen Yu and Clare Primiero and Cristina Vico-Alonso and Zhonghua Wang and Litao Yang and Philipp Tschandl and Ming Hu and Lie Ju and Gin Tan and Vincent Tang and Aik Beng Ng and David Powell and Paul Bonnington and Simon See and Elisabetta Magnaterra and Peter Ferguson and Jennifer Nguyen and Pascale Guitera and Jose Banuls and Monika Janda and Victoria Mar and Harald Kittler and H. Peter Soyer and Zongyuan Ge},
year={2025},
eprint={2410.15038},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.15038},
}