Conversion from pretrained HuggingFace models #115

Open
ealt opened this issue Apr 22, 2025 · 0 comments · May be fixed by #116

Bug Description

When attempting to convert a HuggingFace model to a Penzai model using [llama/mistral/gpt_neox]_from_huggingface_model, the conversion fails with a ValueError when the model configuration contains certain attributes that are not explicitly handled.

Steps to Reproduce

from penzai.models.transformer.variants import llama
import transformers

model_name = "hf-internal-testing/tiny-random-LlamaForCausalLM"
hf_model = transformers.LlamaForCausalLM.from_pretrained(model_name)
pz_model = llama.llama_from_huggingface_model(hf_model)

(similar for mistral and gpt_neox)
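
A sketch of the analogous repro for the other two variants. The converter function names follow the [llama/mistral/gpt_neox]_from_huggingface_model pattern described in this issue; the tiny-random test checkpoints named below are assumptions chosen to mirror the llama example:

from penzai.models.transformer.variants import gpt_neox, mistral
import transformers

# Assumed tiny test checkpoints, analogous to the llama one above.
hf_mistral = transformers.MistralForCausalLM.from_pretrained(
    "hf-internal-testing/tiny-random-MistralForCausalLM")
pz_mistral = mistral.mistral_from_huggingface_model(hf_mistral)

hf_gpt_neox = transformers.GPTNeoXForCausalLM.from_pretrained(
    "hf-internal-testing/tiny-random-GPTNeoXForCausalLM")
pz_gpt_neox = gpt_neox.gpt_neox_from_huggingface_model(hf_gpt_neox)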

Expected Behavior

The conversion should complete successfully: attributes such as _name_or_path are not critical for constructing the Penzai model and can safely be ignored.

Actual Behavior

The conversion fails, raising a ValueError for configuration attributes that the converter does not recognize. For the llama example above:

 ValueError: Conversion of a LlamaForCausalLM does not support these configuration attributes: {'pad_token_id': -1, '_name_or_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM'}

Root Cause

In penzai/models/transformer/variants/[llama/mistral/gpt_neox].py, the [llama/mistral/gpt_neox]_from_huggingface_model functions check for unsupported configuration attributes, but values like _name_or_path are missing from their handled_or_ignored_attributes sets.
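
To illustrate the check (a minimal sketch only; the attribute names in the set and the use of to_diff_dict() to read the config are assumptions, not the actual penzai source):

# Illustrative sketch of the validation step described above, reusing hf_model
# from the repro. The set contents here are a hypothetical subset.
handled_or_ignored_attributes = {
    "vocab_size",
    "hidden_size",
    "num_hidden_layers",
    "num_attention_heads",
}
# to_diff_dict() returns the config attributes that differ from HF defaults.
unsupported = {
    key: value
    for key, value in hf_model.config.to_diff_dict().items()
    if key not in handled_or_ignored_attributes
}
if unsupported:
    raise ValueError(
        f"Conversion of a {type(hf_model).__name__} does not support these"
        f" configuration attributes: {unsupported}"
    )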

Suggested Fix

Add missing attributes to the handled_or_ignored_attributes sets in the [llama/mistral/gpt_neox]_from_huggingface_model functions.
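
For example, a hedged sketch of the kind of change proposed (the existing contents of the set are elided; only the two added entries, taken from the error message above, illustrate the fix):

# Hypothetical excerpt from llama_from_huggingface_model (and the mistral /
# gpt_neox equivalents); only the two entries below are the proposed additions.
handled_or_ignored_attributes = handled_or_ignored_attributes | {
    "_name_or_path",  # informational metadata; not needed to build the model
    "pad_token_id",    # assumed safe to ignore when constructing the architecture
}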
