rust-matrixmicrogpt

A small GPT-style language model training prototype in Rust, using a hand-written matrix backend (column-major storage, no external tensor framework).

Origin

This project is based on:

  • Andrej Karpathy's microgpt concept and reference implementation.

How this version is different

  • Uses an explicit Matrix module (src/algebra.rs) instead of a tape/autodiff node graph.
  • Keeps computations in matrix form for forward and backward passes (src/model.rs).
  • Uses column-major matrix storage for performance.
  • Preserves reference training/inference behavior and sampling outputs as closely as possible while using matrix code paths.
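The column-major idea can be sketched as follows. This is a minimal illustration, not the actual API of src/algebra.rs: the field and method names (`data`, `get`, `set`, `matmul`) are assumptions, and the real module has many more ops.

```rust
// Sketch of a column-major matrix type (illustrative; the real type
// in src/algebra.rs may differ in names and details).
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f32>, // column-major: element (r, c) lives at c * rows + r
}

impl Matrix {
    fn zeros(rows: usize, cols: usize) -> Self {
        Matrix { rows, cols, data: vec![0.0; rows * cols] }
    }

    fn get(&self, r: usize, c: usize) -> f32 {
        self.data[c * self.rows + r]
    }

    fn set(&mut self, r: usize, c: usize, v: f32) {
        self.data[c * self.rows + r] = v;
    }

    // Naive matmul; iterating the output column by column keeps writes
    // sequential in column-major storage, which is the point of the layout.
    fn matmul(&self, other: &Matrix) -> Matrix {
        assert_eq!(self.cols, other.rows);
        let mut out = Matrix::zeros(self.rows, other.cols);
        for c in 0..other.cols {
            for k in 0..self.cols {
                let x = other.get(k, c);
                for r in 0..self.rows {
                    let v = out.get(r, c) + self.get(r, k) * x;
                    out.set(r, c, v);
                }
            }
        }
        out
    }
}
```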

What this program does

  • Loads a text dataset from input.txt (line-based documents).
  • Builds a character-level vocabulary from the dataset.
  • Trains a tiny decoder-only transformer on next-token prediction.
  • Prints periodic training loss.
  • Runs autoregressive sampling after training to generate text.
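The vocabulary step can be sketched like this. It is an illustration of the general approach, assuming (as is common for character-level models) that unique characters get sorted ids and BOS takes the next free id; the actual assignment in src/main.rs may differ.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch of character-level vocabulary construction.
// Returns a char -> id map plus the id reserved for the BOS token.
fn build_vocab(docs: &[String]) -> (BTreeMap<char, usize>, usize) {
    let mut chars: Vec<char> = docs.iter().flat_map(|d| d.chars()).collect();
    chars.sort_unstable();
    chars.dedup();
    let stoi: BTreeMap<char, usize> =
        chars.iter().enumerate().map(|(i, &c)| (c, i)).collect();
    let bos = stoi.len(); // BOS gets the next free id
    (stoi, bos)
}
```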

How it works

  1. Data preparation (src/main.rs)
  • Reads non-empty lines from input.txt.
  • Shuffles documents with a Python-compatible MT19937 RNG.
  • Builds charset and adds a special BOS token id.
  2. Model definition (src/model.rs)
  • Token embedding (wte) + positional embedding (wpe).
  • One attention block:
    • RMSNorm
    • Multi-head causal self-attention (wq, wk, wv, wo)
    • Residual connection
  • One MLP block:
    • RMSNorm
    • fc1 -> ReLU -> fc2
    • Residual connection
  • Final projection to vocabulary logits (lm_head).
  3. Math primitives (src/algebra.rs)
  • Column-major Matrix type.
  • Core ops: matmul, transpose, ReLU, softmax-by-column, scaling.
  • RMSNorm forward/backward utilities.
  4. Training loop (src/main.rs)
  • For each step, creates a token sequence:
    • [BOS] + encoded_document + [BOS]
    • truncated to the model block size before the final BOS.
  • Forward pass -> logits.
  • Cross-entropy loss and logits gradient.
  • Backward pass to compute gradients for all parameters.
  • Parameter update with Adam-like optimizer (step_all).
  5. Inference (src/main.rs)
  • Starts from BOS.
  • Repeats:
    • forward pass on current prefix
    • temperature scaling
    • softmax + weighted sampling
    • stop on BOS or max length.
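The inner sampling step (temperature scaling, softmax, weighted draw) can be sketched as below. This is a hedged illustration, not the code from src/main.rs: `rand01` stands in for a uniform draw from whatever RNG the project uses.

```rust
// Numerically stable softmax: subtract the max before exponentiating.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

// One sampling step: scale logits by temperature, normalize to a
// distribution, then walk the CDF until it passes the uniform draw.
fn sample(logits: &[f32], temperature: f32, rand01: f32) -> usize {
    let scaled: Vec<f32> = logits.iter().map(|&x| x / temperature).collect();
    let probs = softmax(&scaled);
    let mut acc = 0.0;
    for (i, &p) in probs.iter().enumerate() {
        acc += p;
        if rand01 < acc {
            return i;
        }
    }
    probs.len() - 1 // guard against floating-point rounding at the tail
}
```

Lower temperatures sharpen the distribution toward the argmax token; higher temperatures flatten it toward uniform.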

Run

cargo run --release
