CypherpunkArmory/momblish

momblish

Momblish is a small library and CLI for generating fake-but-pronounceable words from a source corpus.

It is named after a "fake" word that was put into the OED by accident: http://mentalfloss.com/article/69880/7-fake-words-ended-dictionary

Momblish uses trigram analysis to generate mostly pronounceable gibberish, so it can be used for any language that can be n-gram analyzed.
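To make the idea concrete, here is a minimal sketch of trigram-based word generation. It is not momblish's actual implementation; the function names `analyze` and `make_word` are hypothetical, and the sketch assumes a plain list of alphabetic words as the corpus. It records which letter follows each two-letter sequence, then grows a word by sampling from those counts.

```python
import random
from collections import Counter, defaultdict


def analyze(words):
    """Count, for every two-letter sequence, which letters follow it."""
    starts = Counter()                # how often each bigram begins a word
    following = defaultdict(Counter)  # bigram -> Counter of next letters
    for word in words:
        word = word.strip().upper()
        if len(word) < 3 or not word.isalpha():
            continue  # too short to contain a trigram, or not purely letters
        starts[word[:2]] += 1
        for i in range(len(word) - 2):
            following[word[i:i + 2]][word[i + 2]] += 1
    return starts, following


def make_word(starts, following, length, rng=random):
    """Grow a word one letter at a time, weighted by trigram counts."""
    bigrams = list(starts)
    word = rng.choices(bigrams, weights=[starts[b] for b in bigrams])[0]
    while len(word) < length:
        options = following.get(word[-2:])
        if not options:
            break  # dead end: this bigram was only ever seen at a word's end
        letters = list(options)
        weights = [options[letter] for letter in letters]
        word += rng.choices(letters, weights=weights)[0]
    return word[:length]
```

Because the sketch stops at a dead end, the result can be shorter than the requested length. A real generator would need some way to back up or retry in that case.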

Description

To use momblish, import the Momblish class and create an instance from the built-in English corpus.

from momblish import Momblish

m = Momblish.english()

The built-in English loader analyzes the system dictionary once and caches the result in the XDG cache directory. An analyzed corpus can also be saved to a file explicitly and loaded back later:

from momblish import Momblish
from momblish.corpus import Corpus

m = Momblish.english()
m.corpus.save("/tmp/corpus.json")

c = Corpus.load("/tmp/corpus.json")
n = Momblish(c)
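The analyze-once-then-cache behaviour can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper names `cache_path` and `load_or_build`, the `momblish` directory name, and the `english.json` file name are all hypothetical. The only part taken from a specification is the XDG rule that the cache base is `$XDG_CACHE_HOME`, falling back to `~/.cache` when the variable is unset or empty.

```python
import json
import os
from pathlib import Path


def cache_path(app="momblish", name="english.json"):
    """Resolve a cache file under $XDG_CACHE_HOME, defaulting to ~/.cache."""
    # Directory and file names here are assumptions made for illustration.
    base = os.environ.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / app / name


def load_or_build(path, build, rebuild=False):
    """Return cached JSON data, calling build() only on a miss or when forced."""
    if path.exists() and not rebuild:
        return json.loads(path.read_text(encoding="utf-8"))
    data = build()  # the expensive analysis step
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Passing `rebuild=True` skips the cached file and writes a fresh one, which is the same idea as forcing re-analysis.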

To generate a single word, call word() on a Momblish instance, optionally passing a length and a prefix. sentence() returns a generator of words of varying length.

m.word()                     # => "PONESSAL"
m.word(10)                   # => "MIDONIHYLA"
m.word(6, prefix="d")        # => "D..."
m.word(7, prefix="dabc")     # => "DABCADC"
w = m.sentence()
next(w)                      # => "TICK"
next(w)                      # => "DRIXY"
next(w)                      # => "UNREA"
m.sentence(3, word_length=5) # => ["LEDGE", "DEAKA", "HONGI"]

You can also analyze your own corpus file.

custom = Momblish.from_file("/tmp/words.txt")
custom.word(8, prefix="tr")

There is also a command-line interface for quick generation without writing code.

$ momble 6
$ momble 7 dabc
$ momble --rebuild-cache 7 dabc
$ momble --corpus /tmp/words.txt 7 dabc

The CLI uses the cached analyzed corpus when available. Pass --rebuild-cache to force re-analysis of either the default English corpus or the file supplied with --corpus.
