Discovered a magic bug in keras, which is caused by incoming data #20070

Closed
@pass-lin

Description

Here is example code that reproduces the problem. bert4keras3 is my own LLM library, which can be installed via pip.

import json
config = {
    "type_vocab_size": 2,
    "use_bias": 0,
    "o_bias": 0,
    "vocab_size": 64000,
    "num_hidden_layers": 1,
    "query_head": 32,
    "num_attention_heads": 4,
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "attention_head_size": 128,
    "dropout_rate": 0.0,
    "hidden_act": "silu",
    "max_wavelength": 5000000.0
}

with open('config.json', 'w') as f:
    json.dump(config, f, indent=4, ensure_ascii=False)
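import os
# The backend must be selected before keras is imported.
os.environ["KERAS_BACKEND"] = "torch"
# Hide all GPUs so the example runs on CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
maxlen = 16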
import tensorflow as tf
import tensorflow.experimental.numpy as tnp
import keras
from bert4keras3.models import build_transformer_model
keras.config.set_dtype_policy("bfloat16")
model=build_transformer_model(
    'config.json',
    model='llama',
    sequence_length=maxlen,
)
tokens = tnp.random.randint(100, 60000, size=[3, maxlen], dtype='int32')
segments = tnp.ones([3, maxlen], dtype='int32')
print(model.inputs)
# Passing a list in the order of model.inputs runs successfully.
print(model([tokens, segments]))
# Passing a dict with the two arrays swapped also runs successfully.
try:
    print(model({'Input-Token': segments,
                 'Input-Segment': tokens}))
except Exception as e:
    print('error', e)
# Passing a dict with the correct key-to-array mapping fails.
try:
    model({'Input-Token': tokens,
           'Input-Segment': segments})
except Exception as e:
    print('error', e)

The cause of the failure can be seen in the call function of the bert4keras.Layers.Add.Embedding class. If you print inputs and self.name there, you will find that the layer that should receive 'Input-Token' has received 'Input-Segment' instead. However, if you print self.name and inputs.name while build_transformer_model is running, you can see that they are matched correctly at construction time.

Furthermore, if we do not use a dictionary as input and instead pass a list in the order of model.inputs, the code runs without error.
Additionally, this bug only exists in Keras 3.4.1 and not in Keras 3.3.3.
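
Since passing a list in the order of model.inputs works, one possible workaround is to convert the dict into such a list before calling the model. The following is only a minimal sketch: ordered_inputs is a hypothetical helper, not part of Keras or bert4keras3, and it assumes each entry of model.inputs exposes a .name attribute equal to the dict key. The stand-in objects let it run without Keras installed.

```python
from types import SimpleNamespace


def ordered_inputs(model_inputs, feed):
    """Return the values of `feed` as a list ordered like `model_inputs`.

    Hypothetical helper: assumes every item in `model_inputs` has a
    `.name` attribute that matches a key of `feed`.
    """
    names = [t.name for t in model_inputs]
    missing = [n for n in names if n not in feed]
    extra = [k for k in feed if k not in names]
    if missing or extra:
        # Fail loudly instead of silently feeding the wrong tensor.
        raise KeyError(f"missing={missing}, unexpected={extra}")
    return [feed[n] for n in names]


# Stand-ins for model.inputs (assumption: real entries carry the same names).
fake_inputs = [
    SimpleNamespace(name="Input-Token"),
    SimpleNamespace(name="Input-Segment"),
]
# Dict order is deliberately different from the model's input order.
feed = {"Input-Segment": "segments", "Input-Token": "tokens"}
print(ordered_inputs(fake_inputs, feed))  # ['tokens', 'segments']
```

With the real model, the call would then look like model(ordered_inputs(model.inputs, {'Input-Token': tokens, 'Input-Segment': segments})), which takes the list path that the example above shows running successfully.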
