Commit b8a80af

Rename keras-nlp -> keras-hub

1 parent: 290091e

623 files changed: +4711 additions, -4711 deletions
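
The symmetric +4711/-4711 counts indicate a purely mechanical find-and-replace. Below is a minimal sketch of how such a rename might be scripted, assuming GNU grep/sed; this is hypothetical, as the actual tooling behind the commit is not shown. Note that the diffs below keep `github.com/keras-team/keras-nlp` repository URLs unchanged, so the blanket hyphenated substitution is reverted for those:

```shell
# Hypothetical reconstruction of the rename (GNU grep/sed assumed).
git mv keras_nlp keras_hub

# Replace the underscore and CamelCase forms everywhere.
grep -rl -e 'keras_nlp' -e 'KerasNLP' --exclude-dir=.git . | xargs sed -i \
  -e 's/keras_nlp/keras_hub/g' \
  -e 's/KerasNLP/KerasHub/g'

# Replace the hyphenated form, then restore repository URLs, which this
# commit intentionally leaves pointing at keras-team/keras-nlp.
grep -rl 'keras-nlp' --exclude-dir=.git . | xargs sed -i \
  -e 's/keras-nlp/keras-hub/g' \
  -e 's|keras-team/keras-hub|keras-team/keras-nlp|g'
```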


.github/workflows/actions.yml

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ jobs:
           pip install keras-nightly --progress-bar off
       - name: Test with pytest
         run: |
-          pytest keras_nlp/
+          pytest keras_hub/
       - name: Run integration tests
         run: |
           python pip_build.py --install

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ __pycache__/
 *.swp
 *.swo
 
-keras_nlp.egg-info/
+keras_hub.egg-info/
 dist/
 
 .coverage

.kokoro/github/ubuntu/gpu/build.sh

Lines changed: 4 additions & 4 deletions
@@ -62,9 +62,9 @@ pip install huggingface_hub
 # Run Extra Large Tests for Continuous builds
 if [ "${RUN_XLARGE:-0}" == "1" ]
 then
-   pytest keras_nlp --check_gpu --run_large --run_extra_large \
-      --cov=keras-nlp
+   pytest keras_hub --check_gpu --run_large --run_extra_large \
+      --cov=keras-hub
 else
-   pytest keras_nlp --check_gpu --run_large \
-      --cov=keras-nlp
+   pytest keras_hub --check_gpu --run_large \
+      --cov=keras-hub
 fi
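
After this change, the CI coverage run can be reproduced locally; a minimal sketch, mirroring the invocation above (the `--check_gpu`, `--run_large`, and `--run_extra_large` flags are custom options defined in the repository's pytest setup, not standard pytest flags):

```shell
# Local equivalent of the CI "large" run; drop --check_gpu on CPU-only hosts.
pip install -r requirements.txt
pytest keras_hub --run_large --cov=keras-hub
```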

API_DESIGN_GUIDE.md

Lines changed: 6 additions & 6 deletions
@@ -3,7 +3,7 @@
 Before reading this document, please read the
 [Keras API design guidelines](https://github.com/keras-team/governance/blob/master/keras_api_design_guidelines.md).
 
-Below are some design considerations specific to KerasNLP.
+Below are some design considerations specific to KerasHub.
 
 ## Philosophy
 
@@ -18,16 +18,16 @@ Below are some design considerations specific to KerasNLP.
   arbitrarily advanced use cases should be possible. There should always be a
   "we need to go deeper" path available to our most expert users.
 
-- **Grow as a platform and as a community.** KerasNLP development should be
+- **Grow as a platform and as a community.** KerasHub development should be
   driven by the community, with feature and release planning happening in
   the open on GitHub.
 
 ## Avoid new dependencies
 
-The core dependencies of KerasNLP are Keras, NumPy, TensorFlow, and
+The core dependencies of KerasHub are Keras, NumPy, TensorFlow, and
 [Tensorflow Text](https://www.tensorflow.org/text).
 
-We strive to keep KerasNLP as self-contained as possible, and avoid adding
+We strive to keep KerasHub as self-contained as possible, and avoid adding
 dependencies to projects (for example NLTK or spaCy) for text preprocessing.
 
 In rare cases, particularly with tokenizers and metrics, we may need to add
@@ -65,7 +65,7 @@ calling a layer, metric or loss with `@tf.function` without running into issues.
 [tf.text](https://www.tensorflow.org/text/api_docs/python/text) provides a large
 surface on TensorFlow operations that manipulate strings. If an low-level (c++)
 operation we need is missing, we should add it in collaboration with core
-TensorFlow or TensorFlow Text. KerasNLP is a python-only library.
+TensorFlow or TensorFlow Text. KerasHub is a python-only library.
 
 We should also strive to keep computation XLA compilable wherever possible (e.g.
 `tf.function(jit_compile=True)`). For trainable modeling components this is
@@ -84,7 +84,7 @@ both batched and unbatched data as input to preprocessing layers.
 
 ## Prioritize multi-lingual support
 
-We strive to keep KerasNLP a friendly and useful library for speakers of all
+We strive to keep KerasHub a friendly and useful library for speakers of all
 languages. In general, prefer designing workflows that are language agnostic,
 and do not involve logic (e.g. stemming) that need to be rewritten
 per-language.

CONTRIBUTING.md

Lines changed: 19 additions & 19 deletions
@@ -1,7 +1,7 @@
 # Contribution guide
 
-KerasNLP is an actively growing project and community! We would love for you
-to get involved. Below are instructions for how to plug into KerasNLP
+KerasHub is an actively growing project and community! We would love for you
+to get involved. Below are instructions for how to plug into KerasHub
 development.
 
 ## Background reading
@@ -83,21 +83,21 @@ Once the pull request is approved, a team member will take care of merging.
 
 Python 3.9 or later is required.
 
-Setting up your KerasNLP development environment requires you to fork the
-KerasNLP repository and clone it locally. With the
+Setting up your KerasHub development environment requires you to fork the
+KerasHub repository and clone it locally. With the
 [GitHub CLI](https://github.com/cli/cli) installed, you can do this as follows:
 
 ```shell
 gh repo fork keras-team/keras-nlp --clone --remote
-cd keras-nlp
+cd keras-hub
 ```
 
 Next we must setup a python environment with the correct dependencies. We
 recommend using `conda` to set up a base environment, and `pip` to install
 python packages from PyPI. The exact method will depend on your OS.
 
 **Note**: Be careful not to use mix pre-packaged tensorflow and jax libraries in
-`conda` with PyPI packages from `pip`. We recommend pulling *all* KerasNLP
+`conda` with PyPI packages from `pip`. We recommend pulling *all* KerasHub
 dependencies via `pip` as described below.
 
 ### Linux (recommended)
@@ -108,29 +108,29 @@ want accelerator support. The easiest way to get GPU support across all of our
 backends is to set up a few different python environements and pull in all cuda
 dependencies via `pip`.
 
-The shell snippet below will install four conda environments: `keras-nlp-cpu`,
-`keras-nlp-jax`, `keras-nlp-torch`, and `keras-nlp-tensorflow`. The cpu
+The shell snippet below will install four conda environments: `keras-hub-cpu`,
+`keras-hub-jax`, `keras-hub-torch`, and `keras-hub-tensorflow`. The cpu
 environement supports all backends without cuda, and each backend environement
 has cuda support.
 
 ```shell
-conda create -y -n keras-nlp-cpu python=3.10
-conda activate keras-nlp-cpu
+conda create -y -n keras-hub-cpu python=3.10
+conda activate keras-hub-cpu
 pip install -r requirements.txt  # install deps
-pip install -e .  # install keras-nlp
+pip install -e .  # install keras-hub
 
 for backend in "jax" "torch" "tensorflow"; do
-    conda create -y -n keras-nlp-${backend} python=3.10
-    conda activate keras-nlp-${backend}
+    conda create -y -n keras-hub-${backend} python=3.10
+    conda activate keras-hub-${backend}
     pip install -r requirements-${backend}-cuda.txt  # install deps
-    pip install -e .  # install keras-nlp
+    pip install -e .  # install keras-hub
 done
 ```
 
 To activate the jax environment and set keras to use jax, run:
 
 ```shell
-conda activate keras-nlp-jax && export KERAS_BACKEND=jax
+conda activate keras-hub-jax && export KERAS_BACKEND=jax
 ```
 
 ### MacOS
@@ -160,16 +160,16 @@ repository.
 
 ## Update Public API
 
-Run API generation script when creating PRs that update `keras_nlp_export`
-public APIs. Add the files changed in `keras_nlp/api` to the same PR.
+Run API generation script when creating PRs that update `keras_hub_export`
+public APIs. Add the files changed in `keras_hub/api` to the same PR.
 
 ```
 ./shell/api_gen.sh
 ```
 
 ## Testing changes
 
-KerasNLP is tested using [PyTest](https://docs.pytest.org/en/6.2.x/).
+KerasHub is tested using [PyTest](https://docs.pytest.org/en/6.2.x/).
 
 ### Run a test file
 
@@ -184,7 +184,7 @@ can use the following command to run all the tests in `import_test.py`
 whose names contain `import`:
 
 ```shell
-pytest keras_nlp/keras_nlp/integration_tests/import_test.py -k="import"
+pytest keras_hub/integration_tests/import_test.py -k="import"
 ```
 
 ### Run the full test suite

CONTRIBUTING_MODELS.md

Lines changed: 28 additions & 28 deletions
@@ -1,13 +1,13 @@
 # Model Contribution Guide
 
-KerasNLP has a plethora of pre-trained large language models
+KerasHub has a plethora of pre-trained large language models
 ranging from BERT to OPT. We are always looking for more models and are always
 open to contributions!
 
 In this guide, we will walk you through the steps one needs to take in order to
-contribute a new pre-trained model to KerasNLP. For illustration purposes, let's
+contribute a new pre-trained model to KerasHub. For illustration purposes, let's
 assume that you want to contribute the DistilBERT model. Before we dive in, we encourage you to go through
-[our getting started guide](https://keras.io/guides/keras_nlp/getting_started/)
+[our getting started guide](https://keras.io/guides/keras_hub/getting_started/)
 for an introduction to the library, and our
 [contribution guide](https://github.com/keras-team/keras-nlp/blob/master/CONTRIBUTING.md).
 
@@ -22,29 +22,29 @@ Keep this checklist handy!
 
 ### Step 2: PR #1 - Add XXBackbone
 
-- [ ] An `xx/xx_backbone.py` file which has the model graph \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)\].
-- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone_test.py)\].
+- [ ] An `xx/xx_backbone.py` file which has the model graph \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py)\].
+- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone_test.py)\].
 - [ ] A Colab notebook link in the PR description which matches the outputs of the implemented backbone model with the original source \[[Example](https://colab.research.google.com/drive/1SeZWJorKWmwWJax8ORSdxKrxE25BfhHa?usp=sharing)\].
 
 ### Step 3: PR #2 - Add XXTokenizer
 
-- [ ] An `xx/xx_tokenizer.py` file which has the tokenizer for the model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer.py)\].
-- [ ] An `xx/xx_tokenizer_test.py` file which has unit tests for the model tokenizer \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer_test.py)\].
+- [ ] An `xx/xx_tokenizer.py` file which has the tokenizer for the model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py)\].
+- [ ] An `xx/xx_tokenizer_test.py` file which has unit tests for the model tokenizer \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer_test.py)\].
 - [ ] A Colab notebook link in the PR description, demonstrating that the output of the tokenizer matches the original tokenizer \[[Example](https://colab.research.google.com/drive/1MH_rpuFB1Nz_NkKIAvVtVae2HFLjXZDA?usp=sharing)].
 
 ### Step 4: PR #3 - Add XX Presets
 
-- [ ] An `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_presets.py)\].
+- [ ] An `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_presets.py)\].
 - [ ] A `tools/checkpoint_conversion/convert_xx_checkpoints.py` which is reusable script for converting checkpoints \[[Example](https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_distilbert_checkpoints.py)\].
 - [ ] A Colab notebook link in the PR description, showing an end-to-end task such as text classification, etc. The task model can be built using the backbone model, with the task head on top \[[Example](https://gist.github.com/mattdangerw/bf0ca07fb66b6738150c8b56ee5bab4e)\].
 
 ### Step 5: PR #4 and Beyond - Add XX Tasks and Preprocessors
 
 This PR is optional.
 
-- [ ] An `xx/xx_<task>.py` file for adding a task model like classifier, masked LM, etc. \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier.py)\]
-- [ ] An `xx/xx_<task>_preprocessor.py` file which has the preprocessor and can be used to get inputs suitable for the task model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_preprocessor.py)\].
-- [ ] `xx/xx_<task>_test.py` file and `xx/xx_<task>_preprocessor_test.py` files which have unit tests for the above two modules \[[Example 1](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier_test.py) and [Example 2](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_preprocessor_test.py)\].
+- [ ] An `xx/xx_<task>.py` file for adding a task model like classifier, masked LM, etc. \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier.py)\]
+- [ ] An `xx/xx_<task>_preprocessor.py` file which has the preprocessor and can be used to get inputs suitable for the task model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor.py)\].
+- [ ] `xx/xx_<task>_test.py` file and `xx/xx_<task>_preprocessor_test.py` files which have unit tests for the above two modules \[[Example 1](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier_test.py) and [Example 2](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor_test.py)\].
 - [ ] A Colab notebook link in the PR description, demonstrating that the output of the preprocessor matches the output of the original preprocessor \[[Example](https://colab.research.google.com/drive/1GFFC7Y1I_2PtYlWDToqKvzYhHWv1b3nC?usp=sharing)].
 
 ## Detailed Instructions
@@ -81,7 +81,7 @@ around by a class to implement our models.
 
 A model is typically split into three/four sections. We would recommend you to
 compare this side-by-side with the
-[`keras_nlp.layers.DistilBertBackbone` source code](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)!
+[`keras_hub.layers.DistilBertBackbone` source code](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py)!
 
 **Inputs to the model**
 
@@ -92,32 +92,32 @@ Generally, the standard inputs to any text model are:
 **Embedding layer(s)**
 
 Standard layers used: `keras.layers.Embedding`,
-`keras_nlp.layers.PositionEmbedding`, `keras_nlp.layers.TokenAndPositionEmbedding`.
+`keras_hub.layers.PositionEmbedding`, `keras_hub.layers.TokenAndPositionEmbedding`.
 
 **Encoder layers**
 
-Standard layers used: `keras_nlp.layers.TransformerEncoder`, `keras_nlp.layers.FNetEncoder`.
+Standard layers used: `keras_hub.layers.TransformerEncoder`, `keras_hub.layers.FNetEncoder`.
 
 **Decoder layers (possibly)**
 
-Standard layers used: `keras_nlp.layers.TransformerDecoder`.
+Standard layers used: `keras_hub.layers.TransformerDecoder`.
 
 **Other layers which might be used**
 
 `keras.layers.LayerNorm`, `keras.layers.Dropout`, `keras.layers.Conv1D`, etc.
 
 <br/>
 
-The standard layers provided in Keras and KerasNLP are generally enough for
+The standard layers provided in Keras and KerasHub are generally enough for
 most of the usecases and it is recommended to do a thorough search
-[here](https://keras.io/api/layers/) and [here](https://keras.io/api/keras_nlp/layers/).
+[here](https://keras.io/api/layers/) and [here](https://keras.io/api/keras_hub/layers/).
 However, sometimes, models have small tweaks/paradigm changes in their architecture.
 This is when things might slightly get complicated.
 
 If the model introduces a paradigm shift, such as using relative attention instead
 of vanilla attention, the contributor will have to implement complete custom layers. A case
-in point is `keras_nlp.models.DebertaV3Backbone` where we had to [implement layers
-from scratch](https://github.com/keras-team/keras-nlp/tree/master/keras_nlp/models/deberta_v3).
+in point is `keras_hub.models.DebertaV3Backbone` where we had to [implement layers
+from scratch](https://github.com/keras-team/keras-nlp/tree/master/keras_hub/models/deberta_v3).
 
 On the other hand, if the model has a small tweak, something simpler can be done.
 For instance, in the Whisper model, the self-attention and cross-attention mechanism
@@ -154,23 +154,23 @@ and loaded correctly, etc.
 #### Tokenizer
 
 Most text models nowadays use subword tokenizers such as WordPiece, SentencePiece
-and BPE Tokenizer. Since KerasNLP has implementations of most of the popular
+and BPE Tokenizer. Since KerasHub has implementations of most of the popular
 subword tokenizers, the model tokenizer layer typically inherits from a base
 tokenizer class.
 
 For example, DistilBERT uses the WordPiece tokenizer. So, we can introduce a new
-class, `DistilBertTokenizer`, which inherits from `keras_nlp.tokenizers.WordPieceTokenizer`.
+class, `DistilBertTokenizer`, which inherits from `keras_hub.tokenizers.WordPieceTokenizer`.
 All the underlying actual tokenization will be taken care of by the superclass.
 
 The important thing here is adding "special tokens". Most models have
 special tokens such as beginning-of-sequence token, end-of-sequence token,
 mask token, pad token, etc. These have to be
-[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
+[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
 to the tokenizer class. These member attributes are then accessed by the
 preprocessor layers.
 
-For a full list of the tokenizers KerasNLP offers, please visit
-[this link](https://keras.io/api/keras_nlp/tokenizers/) and make use of the
+For a full list of the tokenizers KerasHub offers, please visit
+[this link](https://keras.io/api/keras_hub/tokenizers/) and make use of the
 tokenizer your model uses!
 
 #### Unit Tests
@@ -193,7 +193,7 @@ files. These files will then be uploaded to GCP by us!
 After wrapping up the preset configuration file, you need to
 add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
 and `DistilBertTokenizer`. Here is an
-[example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py#L187-L189).
+[example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py#L187-L189).
 
 The testing for presets is divided into two: "large" and "extra large".
 For "large" tests, we pick the smallest preset (in terms of number of parameters)
@@ -228,12 +228,12 @@ and return the dictionary in the form expected by the model.
 
 The preprocessor class might have a few intricacies depending on the model. For example,
 the DeBERTaV3 tokenizer does not have the `[MASK]` in the provided sentencepiece
-proto file, and we had to make some modifications [here](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/deberta_v3/deberta_v3_preprocessor.py). Secondly, we have
+proto file, and we had to make some modifications [here](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/deberta_v3/deberta_v3_preprocessor.py). Secondly, we have
 a separate preprocessor class for every task. This is because different tasks
-might require different input formats. For instance, we have a [separate preprocessor](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_masked_lm_preprocessor.py)
+might require different input formats. For instance, we have a [separate preprocessor](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_masked_lm_preprocessor.py)
 for masked language modeling (MLM) for DistilBERT.
 
 ## Conclusion
 
 Once all three PRs (and optionally, the fourth PR) have been merged, you have
-successfully contributed a model to KerasNLP. Congratulations! 🔥
+successfully contributed a model to KerasHub. Congratulations! 🔥

LICENSE

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-Copyright 2024, KerasNLP authors. All rights reserved.
+Copyright 2024, KerasHub authors. All rights reserved.
 
                                  Apache License
                            Version 2.0, January 2004
@@ -188,7 +188,7 @@ Copyright 2024, KerasNLP authors. All rights reserved.
       same "printed page" as the copyright notice for easier
       identification within third-party archives.
 
-   Copyright 2024, KerasNLP authors.
+   Copyright 2024, KerasHub authors.
 
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
