Bug Description
When attempting to convert a HuggingFace model to a Penzai model using `[llama/mistral/gpt_neox]_from_huggingface_model`, the conversion fails with a `ValueError` when the model configuration contains attributes that are not explicitly handled.
Steps to Reproduce
```python
from penzai.models.transformer.variants import llama
import transformers

model_name = "hf-internal-testing/tiny-random-LlamaForCausalLM"
hf_model = transformers.LlamaForCausalLM.from_pretrained(model_name)
pz_model = llama.llama_from_huggingface_model(hf_model)
```
(The same failure occurs with `mistral` and `gpt_neox`.)
Expected Behavior
The conversion should complete successfully, since attributes such as `_name_or_path` are not critical for constructing the Penzai model and can safely be ignored.
Actual Behavior
The conversion fails, raising a `ValueError` for configuration attributes that are not in the handled set. For the Llama example above:

```
ValueError: Conversion of a LlamaForCausalLM does not support these configuration attributes: {'pad_token_id': -1, '_name_or_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM'}
```
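For reference, both attributes reported in the error are present on the HuggingFace config object and can be inspected directly via the standard `transformers` config API:

```python
# Inspect the two attributes flagged by the conversion error.
print(hf_model.config._name_or_path)  # 'hf-internal-testing/tiny-random-LlamaForCausalLM'
print(hf_model.config.pad_token_id)   # -1 for this tiny test checkpoint
```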
Root Cause
In `penzai/models/transformer/variants/[llama/mistral/gpt_neox].py`, the `[llama/mistral/gpt_neox]_from_huggingface_model` functions check for unsupported configuration attributes, but attributes such as `_name_or_path` are missing from their `handled_or_ignored_attributes` sets.
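For illustration, here is a minimal sketch of the kind of check that fails; aside from `handled_or_ignored_attributes`, the names and structure are assumptions, not the actual Penzai source:

```python
# Hypothetical sketch of the failing validation; the real Penzai code
# differs in detail.
def check_unsupported_attributes(
    config_attrs: dict, handled_or_ignored_attributes: set
):
    # Collect every config attribute that is neither handled nor ignored.
    unsupported = {
        key: value
        for key, value in config_attrs.items()
        if key not in handled_or_ignored_attributes
    }
    if unsupported:
        # This produces the ValueError reported above: any attribute missing
        # from the handled_or_ignored_attributes set ends up here, even if it
        # is irrelevant to model construction (e.g. `_name_or_path`).
        raise ValueError(
            "Conversion of a LlamaForCausalLM does not support these"
            f" configuration attributes: {unsupported}"
        )
```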
Suggested Fix
Add the missing attributes to the `handled_or_ignored_attributes` sets in the `[llama/mistral/gpt_neox]_from_huggingface_model` functions.
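A sketch of the shape of the fix, e.g. in `llama.py`; the elided entries stand in for the real set contents, which are not reproduced here:

```python
# Hypothetical excerpt from llama_from_huggingface_model; the real set
# already contains many handled attributes (elided below).
handled_or_ignored_attributes = {
    # ... existing handled/ignored attributes ...
    # Metadata attributes that do not affect model construction:
    "_name_or_path",
    "pad_token_id",
}
```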