⚡ Transformer From Scratch

A complete Transformer language model built from scratch in PyTorch

📊 Model Overview
Parameters
Vocab Size
d_model
Heads
Layers
Device
✍️ Generate Text
0.8
10
100
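The three values above (0.8, 10, 100) are unlabeled in this page. The sketch below assumes they are a sampling temperature, a top-k cutoff, and a maximum number of new tokens, which is a common set of generation controls but is not confirmed by the source. The names `sample_next_token` and `generate` are hypothetical, as is `model`: any callable that maps a `(batch, seq)` tensor of token ids to `(batch, seq, vocab)` logits.

```python
import torch


def sample_next_token(logits, temperature=0.8, top_k=10):
    """Sample one token id from a 1-D logits vector of size vocab_size.

    Assumed meaning of the controls (not confirmed by the source):
    temperature < 1 sharpens the distribution, top_k keeps only the
    k highest-scoring tokens before sampling.
    """
    scaled = logits / temperature
    k = min(top_k, scaled.size(-1))
    top_vals, top_idx = torch.topk(scaled, k)
    probs = torch.softmax(top_vals, dim=-1)       # renormalize over the k survivors
    choice = torch.multinomial(probs, num_samples=1)
    return top_idx[choice].item()


@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=100, temperature=0.8, top_k=10):
    """Autoregressive loop: feed the sequence, sample, append, repeat."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = model(torch.tensor([ids]))[0, -1]  # logits at the last position
        ids.append(sample_next_token(logits, temperature, top_k))
    return ids
```

This loop re-runs the model on the full sequence at every step and does not truncate to a maximum context length; a real implementation would likely handle both.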
📜 Generated Output
🏋️ Train Model
🏗️ Architecture
Input Tokens
Embedding × √d_model
+ Positional Encoding
Decoder Layer × N
Self-Attn → Add&Norm → FFN → Add&Norm
Layer Norm
Linear → Logits
Softmax → Next Token
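The stages listed above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the page's actual implementation: the class names `DecoderLayer` and `TinyTransformerLM`, the default sizes, the sinusoidal form of the positional encoding, and the ReLU feed-forward block are all assumptions. It also uses the built-in `nn.MultiheadAttention` for brevity, whereas a from-scratch model would implement attention by hand.

```python
import math

import torch
import torch.nn as nn


class DecoderLayer(nn.Module):
    """Self-Attn -> Add&Norm -> FFN -> Add&Norm (post-norm ordering)."""

    def __init__(self, d_model, n_heads, d_ff):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, causal_mask):
        attn_out, _ = self.attn(x, x, x, attn_mask=causal_mask, need_weights=False)
        x = self.norm1(x + attn_out)        # residual add, then layer norm
        x = self.norm2(x + self.ffn(x))     # residual add, then layer norm
        return x


class TinyTransformerLM(nn.Module):
    """Hypothetical decoder-only language model following the listed stages."""

    def __init__(self, vocab_size, d_model=64, n_heads=4, n_layers=2,
                 d_ff=256, max_len=512):
        super().__init__()
        self.d_model = d_model
        self.embed = nn.Embedding(vocab_size, d_model)

        # Sinusoidal positional encoding (assumed variant; d_model must be even).
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)

        self.layers = nn.ModuleList(
            [DecoderLayer(d_model, n_heads, d_ff) for _ in range(n_layers)]
        )
        self.final_norm = nn.LayerNorm(d_model)
        self.to_logits = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        seq_len = tokens.size(1)
        # Embedding x sqrt(d_model), plus positional encoding.
        x = self.embed(tokens) * math.sqrt(self.d_model) + self.pe[:seq_len]
        # True above the diagonal = position may not attend to later tokens.
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=tokens.device),
            diagonal=1,
        )
        for layer in self.layers:           # Decoder Layer x N
            x = layer(x, mask)
        return self.to_logits(self.final_norm(x))  # Layer Norm, Linear -> Logits
```

The last stage, Softmax → Next Token, is applied to the logits at the final position, for example `torch.softmax(logits[:, -1], dim=-1)`. The "Parameters" figure in the model overview can be computed for such a model with `sum(p.numel() for p in model.parameters())`.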