Conversion from pretrained HuggingFace models #115

Closed
@ealt

Description

Bug Description

When attempting to convert a HuggingFace model to a Penzai model using [llama/mistral/gpt_neox]_from_huggingface_model, the conversion fails with a ValueError when the model configuration contains attributes that the converter does not explicitly handle.

Steps to Reproduce

from penzai.models.transformer.variants import llama
import transformers

# Tiny test checkpoint used to reproduce the failure.
model_name = "hf-internal-testing/tiny-random-LlamaForCausalLM"
hf_model = transformers.LlamaForCausalLM.from_pretrained(model_name)
pz_model = llama.llama_from_huggingface_model(hf_model)  # raises ValueError

(The same failure occurs with the mistral and gpt_neox variants.)

Expected Behavior

The conversion should complete successfully: attributes such as _name_or_path are not critical for constructing the Penzai model and can safely be ignored.

Actual Behavior

The conversion fails, raising a ValueError for configuration attributes the converter does not recognize. For the llama example above:

 ValueError: Conversion of a LlamaForCausalLM does not support these configuration attributes: {'pad_token_id': -1, '_name_or_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM'}
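
Until the attribute sets are updated, one possible stopgap, assuming the converter compares the config against the LlamaConfig defaults as the error message suggests, is to reset the flagged attributes to their default values before converting (continuing the reproduction snippet above):

# Hypothetical workaround: reset the flagged attributes to their
# LlamaConfig defaults so the unsupported-attribute check passes.
hf_model.config.pad_token_id = None  # LlamaConfig default
hf_model.config._name_or_path = ""   # LlamaConfig default
pz_model = llama.llama_from_huggingface_model(hf_model)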

Root Cause

In penzai/models/transformer/variants/[llama/mistral/gpt_neox].py, the [llama/mistral/gpt_neox]_from_huggingface_model functions check the model configuration for unsupported attributes, but their handled_or_ignored_attributes sets are missing entries such as _name_or_path and pad_token_id.
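
For illustration, here is a minimal sketch of the kind of guard described here, assuming a comparison against the LlamaConfig defaults; the real set and check live in penzai's llama.py, and the set contents below are placeholders:

import transformers

# Attributes the converter either consumes or deliberately skips.
# (Illustrative subset only.)
handled_or_ignored_attributes = {
    "vocab_size", "hidden_size", "intermediate_size",
    "num_hidden_layers", "num_attention_heads", "torch_dtype",
}

def check_config(config: transformers.LlamaConfig) -> None:
  defaults = transformers.LlamaConfig().to_dict()
  # Any attribute that differs from its default and is not in the
  # handled set is treated as unsupported.
  unsupported = {
      key: value
      for key, value in config.to_dict().items()
      if key not in handled_or_ignored_attributes
      and value != defaults.get(key)
  }
  if unsupported:
    raise ValueError(
        "Conversion of a LlamaForCausalLM does not support these"
        f" configuration attributes: {unsupported}"
    )

Under this logic, any checkpoint loaded with from_pretrained carries a non-default _name_or_path, so the check fails even though the attribute is irrelevant to the model structure.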

Suggested Fix

Add the missing attributes to the handled_or_ignored_attributes sets in the [llama/mistral/gpt_neox]_from_huggingface_model functions.
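
A sketch of the proposed change, with the new attribute names taken from the error above (the existing set contents are elided):

handled_or_ignored_attributes = {
    # ... existing handled attributes ...
    "_name_or_path",  # informational only; does not affect model structure
    "pad_token_id",   # padding id is not needed to build the Penzai model
}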
