A Java port of Andrej Karpathy's llm.c.
Clone llm.c and follow the instructions in its README, section "quick start (CPU)". This gives you the dataset, the tokens, the small GPT-2 model (124M) released by OpenAI, and two executables for testing and training.
Clone this repository, open it in VS Code, then build and run the executables for testing and training.
The samples.md file contains the output of llm.java, captured from the first working version with Java Stream parallelization on a Lenovo T15p notebook. There is also a blog post on parallelization with TornadoVM.
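To illustrate what Java Stream parallelization can look like for this kind of workload, here is a minimal sketch of a matrix multiplication whose output rows are spread across worker threads with a parallel `IntStream`. The class name `ParallelMatmulSketch`, the method `matmulForward`, and the flat row-major array layout are assumptions made for this example; they are not taken from the llm.java sources.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical sketch, not the code used in this repository.
final class ParallelMatmulSketch {

    // Computes out[r*OC + o] = bias[o] + sum_i inp[r*C + i] * weight[o*C + i].
    // All arrays are flat and row-major (an assumption for this sketch).
    static void matmulForward(float[] out, float[] inp, float[] weight, float[] bias,
                              int rows, int C, int OC) {
        // Each row index r writes only to its own slice of out,
        // so the rows can run in parallel without synchronization.
        IntStream.range(0, rows).parallel().forEach(r -> {
            for (int o = 0; o < OC; o++) {
                float val = (bias != null) ? bias[o] : 0.0f;
                for (int i = 0; i < C; i++) {
                    val += inp[r * C + i] * weight[o * C + i];
                }
                out[r * OC + o] = val;
            }
        });
    }

    public static void main(String[] args) {
        float[] inp = {1, 2, 3, 4};             // 2 rows, C = 2
        float[] weight = {1, 0, 0, 1, 1, 1};    // OC = 3, C = 2
        float[] bias = {0.5f, 0.5f, 0.5f};
        float[] out = new float[2 * 3];
        matmulForward(out, inp, weight, bias, 2, 2, 3);
        System.out.println(Arrays.toString(out));
    }
}
```

Parallelizing over the outer row index keeps each task coarse enough to amortize the scheduling overhead of the common fork/join pool, while the inner loops stay sequential.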
Andrej Karpathy - (llm.c)
Copyright (c) 2024 Andrej Karpathy - MIT License
Java implementation
Harry Jackson - (Java implementation)
Copyright (c) 2024 Harry Jackson - MIT License
Adopted file and endian handling from llm.java shared by @harryjackson.
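Endian handling matters because Java's `ByteBuffer` reads multi-byte values as big-endian by default, whereas binary files written on common x86 or ARM machines are usually little-endian. The following is a minimal sketch of decoding such data, assuming the input is little-endian; the class `LittleEndianSketch` and its methods are hypothetical names for this example and are not the code adopted from @harryjackson.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch, not the code used in this repository.
final class LittleEndianSketch {

    // Decodes count 32-bit ints starting at byte offset, assuming little-endian data.
    static int[] readInts(byte[] raw, int offset, int count) {
        ByteBuffer buf = ByteBuffer.wrap(raw, offset, count * Integer.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        int[] out = new int[count];
        buf.asIntBuffer().get(out);   // the view inherits the byte order set above
        return out;
    }

    // Decodes count 32-bit floats starting at byte offset, assuming little-endian data.
    static float[] readFloats(byte[] raw, int offset, int count) {
        ByteBuffer buf = ByteBuffer.wrap(raw, offset, count * Float.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        float[] out = new float[count];
        buf.asFloatBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        // The int 1 followed by the float 1.0f, both stored little-endian.
        byte[] raw = {1, 0, 0, 0, 0, 0, (byte) 0x80, 0x3F};
        System.out.println(readInts(raw, 0, 1)[0]);
        System.out.println(readFloats(raw, 4, 1)[0]);
    }
}
```

Without the `order(ByteOrder.LITTLE_ENDIAN)` call, the first four bytes above would decode as 16777216 instead of 1.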