GitHub - x-tabdeveloping/noloox: Unsupervised machine learning methods that you will need, and not find elsewhere.

noloox is a Python library containing reference implementations of a bunch of very useful unsupervised learning algorithms that you probably won't find elsewhere.

What noloox is:

A collection of unsupervised machine learning algorithms
A scikit-learn compatible library
An educational resource containing worked examples and reference implementations

What noloox isn't:

The most feature-complete or efficient implementation of these algorithms
A replacement for scikit-learn
An all-in-one machine learning framework
A library for complete Bayesian inference. Use a PPL like NumPyro, PyMC or Stan.

Basic usage

Install noloox from PyPI:

pip install noloox

Then you can load models from the library and use them the same way you would use scikit-learn.

from noloox.mixture import StudentsTMixture

# X is your data, e.g. an array of shape (n_samples, n_features)
model = StudentsTMixture(n_components=10)
cluster_labels = model.fit_predict(X)

Models

Model	What do I use it for?	JAX or NumPy?	What algorithm?
Peax	Cluster 2D data where the number of clusters is unknown.	NumPy	Expectation-Maximization
SNMF	Factorize data where you expect the factors to be non-negative but the data itself is unbounded.	JAX	Iterative updates
WNMF	NMF, but you don't want to weight all observations equally.	NumPy	Iterative updates
StudentsTMixture and CauchyMixture	Cluster continuous data in a way that is robust to outliers.	JAX	Expectation-Maximization
DirichletMultinomialMixture	Cluster count data/Short-text topic modelling	JAX	Collapsed Gibbs Sampling
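
To illustrate why a Student's t model is "robust to outliers", here is a minimal NumPy sketch of the per-point weights that appear in the textbook EM algorithm for a single multivariate Student's t component. This is not taken from noloox, and the library's own implementation may differ; the function name `student_t_weights` and the parameter name `df` are hypothetical and chosen only for this example.

```python
import numpy as np


def student_t_weights(X, mean, cov, df):
    """E-step weights u_i = (df + d) / (df + delta_i) for one Student's t component.

    delta_i is the squared Mahalanobis distance of row i from the mean.
    Hypothetical helper for illustration; not part of the noloox API.
    """
    d = X.shape[1]
    diff = X - mean
    # Squared Mahalanobis distance of every row to the component mean.
    delta = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return (df + d) / (df + delta)


# Three points near the origin and one gross outlier.
X = np.array([[0.0, 0.0], [1.0, -1.0], [-1.0, 1.0], [50.0, 50.0]])
weights = student_t_weights(X, mean=np.zeros(2), cov=np.eye(2), df=3.0)

# M-step style update of the mean: a weighted average of the data.
robust_mean = (weights[:, None] * X).sum(axis=0) / weights.sum()
plain_mean = X.mean(axis=0)

print("weights:", weights)
print("weighted mean:", robust_mean)
print("unweighted mean:", plain_mean)
```

Points far from the mean receive weights close to zero, so the outlier barely moves the weighted mean, whereas the unweighted mean is dragged towards it. Setting `df=1.0` gives the Cauchy case.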

Tutorials

Here are some things that you can do more easily in noloox than in scikit-learn:

Our philosophy and goals

Keep implementations simple and minimal, with minimal dependencies.
Everything should be implemented in either NumPy or JAX, with as much as possible in JAX.
Library structure should match sklearn standards, and all algorithms should be drop-in replacements for scikit-learn equivalents.
Under these restrictions, algorithms should be as fast as humanly possible.
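
As a rough illustration of what "matching sklearn standards" usually means in practice, the sketch below shows a toy estimator that follows the common scikit-learn conventions: the constructor only stores hyperparameters, `fit` returns `self`, learned state lives in attributes with a trailing underscore, and `fit_predict` is built on top of `fit`. The class `TinyKMeans` is hypothetical and not part of noloox; it is a plain k-means written only to demonstrate the interface.

```python
import numpy as np


class TinyKMeans:
    """Hypothetical toy estimator illustrating scikit-learn conventions."""

    def __init__(self, n_clusters=2, n_iter=10, random_state=0):
        # The constructor only stores hyperparameters; no computation here.
        self.n_clusters = n_clusters
        self.n_iter = n_iter
        self.random_state = random_state

    def get_params(self, deep=True):
        return {
            "n_clusters": self.n_clusters,
            "n_iter": self.n_iter,
            "random_state": self.random_state,
        }

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(self.random_state)
        # Initialise the centres from distinct data points.
        centers = X[rng.choice(len(X), size=self.n_clusters, replace=False)]
        for _ in range(self.n_iter):
            # Assign every point to its nearest centre.
            dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
            labels = dists.argmin(axis=1)
            # Move each non-empty centre to the mean of its points.
            for k in range(self.n_clusters):
                if np.any(labels == k):
                    centers[k] = X[labels == k].mean(axis=0)
        # Learned state is stored in attributes with a trailing underscore.
        self.cluster_centers_ = centers
        self.labels_ = labels
        return self

    def fit_predict(self, X, y=None):
        return self.fit(X).labels_


X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [10.0, 10.0], [10.1, 9.9], [9.9, 10.2]])
model = TinyKMeans(n_clusters=2)
cluster_labels = model.fit_predict(X)
print(cluster_labels)
```

An estimator shaped like this can be swapped in wherever code only relies on `fit`, `fit_predict`, `get_params` and `set_params`, which is the sense in which a model can act as a drop-in replacement.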

The wishlist:

There are a number of algorithms that would be nice to implement in the library. Contributions are very welcome.

ProdLDA, and amortized ProdLDA (CTMs) (without Flax)
Parametric-TSNE, possibly also Multi-scale Parametric-TSNE
DiRE
Infinite NMF
Latent Dirichlet Allocation with Gibbs Sampling
Gaussian LDA
