Welcome! This repository is all about understanding and building a GPT-2 model from scratch using PyTorch. Inspired by Andrej Karpathy's brilliant tutorials, this project breaks down the process into simple steps while keeping it engaging and accessible.
- Customizable Settings: Tweak the model's vocabulary size, embedding dimensions, layers, and more with the `GPTConfig` class.
- Attention Mechanisms: Implements causal self-attention so predictions only consider past tokens; think of it as the model's memory (see the sketch after this list).
- Streamlined Layers: Efficient MLP layers make computation a breeze.
- Pretrained Model Support: Plug in Hugging Face models for quick results.
- Text Creation: Use the `generate` method to produce coherent and creative text.
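To make the causal-attention idea concrete, here is a minimal sketch of how the mask works. It is illustrative only: the function name `causal_attention` and the tensor shapes are assumptions for the example, not code taken from this repository's `model.py`.

```python
import math
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim)
    T = q.size(-2)
    att = (q @ k.transpose(-2, -1)) / math.sqrt(q.size(-1))  # scaled dot-product scores
    mask = torch.tril(torch.ones(T, T, device=q.device))     # lower-triangular causal mask
    att = att.masked_fill(mask == 0, float("-inf"))          # hide future positions
    att = F.softmax(att, dim=-1)
    return att @ v                                            # weighted sum of values

# Tiny usage example with random tensors
q = k = v = torch.randn(1, 2, 4, 8)
print(causal_attention(q, k, v).shape)  # torch.Size([1, 2, 4, 8])
```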
First, make sure you have Python 3.7 or later. Then, install the required libraries:
```bash
pip install torch tiktoken
```
```python
from model import GPT, GPTConfig

# Configure your model
config = GPTConfig(
    block_size=1024,   # maximum context length
    vocab_size=50257,  # GPT-2 BPE vocabulary size
    n_layer=12,
    n_head=12,
    n_embd=768,
    dropout=0.1,
    bias=True,
)

# Initialize your GPT model
model = GPT(config)
```
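As a quick sanity check, you can count the trainable parameters of the freshly built model; if the implementation mirrors the GPT-2 small architecture, the total should come out to roughly 124M (the exact figure depends on weight tying and bias terms):

```python
# Count parameters of the model built above
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.1f}M parameters")
```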
```python
from transformers import GPT2LMHeadModel  # requires `pip install transformers`

# Load pretrained GPT-2 weights from Hugging Face
pretrained_model = GPT.from_pretrained("gpt2")
```
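Before sampling, it is worth switching the model to evaluation mode so dropout is disabled. If the loader follows the common nanoGPT-style convention (an assumption, check `model.py`), other checkpoint names such as "gpt2-medium" may also be accepted:

```python
pretrained_model.eval()  # disable dropout for inference

# Assumption: larger checkpoints may be supported by from_pretrained, e.g.
# pretrained_model = GPT.from_pretrained("gpt2-medium")
```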
prompt = "The universe is vast and"
tokens = enc.encode(prompt)
inputs = torch.tensor([tokens], dtype=torch.long)
# Generate the next part of the text
generated = model.generate(inputs, max_new_tokens=15)
print(enc.decode(generated[0].tolist()))
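If the `generate` method follows the common nanoGPT-style signature (an assumption, verify in `model.py`), it may also accept `temperature` and `top_k` arguments to control how adventurous the sampling is:

```python
# Assumed nanoGPT-style sampling options; confirm the signature in model.py
generated = pretrained_model.generate(inputs, max_new_tokens=15, temperature=0.8, top_k=50)
print(enc.decode(generated[0].tolist()))
```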
prompt = "Once upon a time in a land far, far away,"
tokens = enc.encode(prompt)
inputs = torch.tensor([tokens], dtype=torch.long)
# Let the model continue the story
generated = model.generate(inputs, max_new_tokens=20)
print(enc.decode(generated[0].tolist()))
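On a machine with a GPU, moving the model and inputs onto it speeds up generation considerably; this is standard PyTorch usage rather than anything specific to this project:

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
inputs = inputs.to(device)

generated = model.generate(inputs, max_new_tokens=20)
print(enc.decode(generated[0].tolist()))
```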
Big thanks to Andrej Karpathy for his incredible tutorials and code snippets that made this project possible. His work continues to inspire the AI community.
This project is licensed under the MIT License.