
PyTorch • Multi30k • Full encoder–decoder architecture
Implemented a full Transformer model from scratch in PyTorch, including multi-head attention, positional encodings, custom tokenizers, and a training loop on the Multi30k dataset. Focused on reproducing the architectural details of "Attention Is All You Need" and understanding every component end-to-end.
Transformer Project GitHub Link
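
Below is a minimal, illustrative sketch of two of the components mentioned above, sinusoidal positional encoding and multi-head attention, following the formulation in the paper. Class names and hyperparameters here are placeholders for illustration, not taken from the project's repository.

```python
# Illustrative sketch only; names and defaults are assumptions, not the project's code.
import math
import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Adds fixed sinusoidal position information to token embeddings."""

    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        return x + self.pe[: x.size(1)]


class MultiHeadAttention(nn.Module):
    """Scaled dot-product attention computed over several heads in parallel."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_head = d_model // n_heads
        self.n_heads = n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, query, key, value, mask=None):
        b = query.size(0)
        # Project, then split into heads: (batch, heads, seq_len, d_head)
        q = self.w_q(query).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        k = self.w_k(key).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_v(value).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)

        # Attention(Q, K, V) = softmax(QK^T / sqrt(d_head)) V
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = scores.softmax(dim=-1)

        # Merge heads back to (batch, seq_len, d_model) and apply output projection
        out = (attn @ v).transpose(1, 2).contiguous().view(b, -1, self.n_heads * self.d_head)
        return self.w_o(out)
```

In the full encoder–decoder model these blocks are stacked with residual connections, layer normalization, and position-wise feed-forward layers, as described in the paper.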