In the world of open-source, I believe nothing is as compelling as a well-contextualized compiler. Here is mine, doing my part to convert the planet to memory-safe Rust, in GPL form.
A memory-safe Rust port of tiny-gpt2, with GPT-2 weight download, tokenizer support, inference, and attention visualization.
Original project credit: Stephen Diehl
Source: https://github.com/sdiehl/tiny-gpt2
- Download GPT-2 config and weights from Hugging Face (openai-community/gpt2)
- Download GPT-2 tokenizer assets (encoder.json, vocab.bpe)
- Run greedy text generation from a prompt
- Visualize per-head attention weights as PNG heatmaps
- Keep implementation fully in safe Rust using ndarray
- Core logic is functional and idiomatic, with minimal mutable state
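To make "greedy text generation" concrete, here is a minimal sketch of the decoding loop, written in a fold style with little mutable state. It is not this crate's actual code: `argmax`, `greedy_generate`, and the `next_logits` closure are hypothetical names, the toy model in `main` stands in for the real GPT-2 forward pass, and plain slices are used instead of ndarray so the example needs only the standard library.

```rust
/// Index of the largest logit, or `None` for an empty slice.
fn argmax(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.total_cmp(b))
        .map(|(i, _)| i)
}

/// Greedy decoding: at each step, append the single most likely next token.
/// `next_logits` is a hypothetical stand-in for the model's forward pass.
fn greedy_generate<F>(prompt: &[usize], max_tokens: usize, next_logits: F) -> Vec<usize>
where
    F: Fn(&[usize]) -> Vec<f32>,
{
    (0..max_tokens).fold(prompt.to_vec(), |mut tokens, _| {
        if let Some(next) = argmax(&next_logits(&tokens)) {
            tokens.push(next);
        }
        tokens
    })
}

fn main() {
    // Toy "model" over a 5-token vocabulary: always favors (last + 1) % 5.
    let toy_model = |tokens: &[usize]| -> Vec<f32> {
        let last = tokens.last().copied().unwrap_or(0);
        (0..5)
            .map(|i| if i == (last + 1) % 5 { 1.0 } else { 0.0 })
            .collect()
    };
    let out = greedy_generate(&[3], 4, toy_model);
    println!("{:?}", out); // prints [3, 4, 0, 1, 2]
}
```

Greedy decoding is deterministic: the same prompt and weights always produce the same continuation, which makes output easy to compare against the original project.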
cd /Users/jrule/git/rust/tiny-gpt
cargo run -- generate --prompt "The quick brown fox" --max-tokens 20

cd /Users/jrule/git/rust/tiny-gpt
cargo run -- viz --prompt "The quick brown fox jumps over the lazy dog" --block 0 --head 0 --out attention.png

Generated with:
cd /Users/jrule/git/rust/tiny-gpt
cargo run -- viz --prompt "The quick brown fox jumped over the lazy dog." --block 0 --head 0 --out attention.png

- generate: Generate continuation text from a prompt
- tokens: Show token ids and token text pieces
- viz: Save attention heatmap for a specific block/head
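For readers wondering what a per-head heatmap contains, the sketch below computes causal scaled dot-product attention weights for a single head, the kind of matrix that viz would render for one --block and --head. It is an illustration under assumptions, not this crate's code: `softmax` and `attention_weights` are hypothetical names, the query/key rows in `main` are made-up toy values, and nested `Vec`s stand in for ndarray arrays so only the standard library is needed.

```rust
/// Numerically stable softmax over one row of scores.
fn softmax(row: &[f32]) -> Vec<f32> {
    let max = row.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = row.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Causal attention weights for one head.
/// Row i is a probability distribution over key positions 0..=i.
fn attention_weights(q: &[Vec<f32>], k: &[Vec<f32>]) -> Vec<Vec<f32>> {
    // Scale by sqrt(head dimension), as in scaled dot-product attention.
    let scale = (q.first().map_or(1, |r| r.len()) as f32).sqrt();
    q.iter()
        .enumerate()
        .map(|(i, qi)| {
            let scores: Vec<f32> = k
                .iter()
                .enumerate()
                .map(|(j, kj)| {
                    if j > i {
                        // Causal mask: a token cannot attend to later tokens.
                        f32::NEG_INFINITY
                    } else {
                        qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f32>() / scale
                    }
                })
                .collect();
            softmax(&scores)
        })
        .collect()
}

fn main() {
    // Toy queries and keys for 3 tokens with head dimension 2 (made-up values).
    let x: Vec<Vec<f32>> = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]];
    for row in attention_weights(&x, &x) {
        println!("{:?}", row);
    }
}
```

Each printed row sums to 1 and is zero above the diagonal, which is why attention heatmaps for GPT-2 style models look lower-triangular.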
Example (tokens command):
cd /Users/jrule/git/rust/tiny-gpt
cargo run -- tokens --text "hello rust"

This repository is licensed under GPL-3.0-or-later. See LICENSE.
