I’ve decided to go back and rewrite a transformer architecture from scratch. You can find it here: gpt-from-scratch