This repository hosts the codebase and instrumentation for an application that pushes bundles given a notification stream.
The code bundles notifications around a ranker algorithm, which predicts the readiness of a bundle given the observed notifications.
Below is an overview of the codebase:
- /data contains all data, including:
  - notifications.csv: the input file of observed notifications we're processing,
  - bundles.csv: the output file containing the bundles.
- /src contains all entrypoints, including:
  - bundlify.py: the main entrypoint, which outputs bundles to stdout given a filepath to notifications.
- /lib contains the rest of the codebase, including:
  - /lib/models/: class models for our domain (e.g. notification, bundle),
  - /lib/config.py: configuration for our entrypoints,
  - /lib/utils.py: utility methods, e.g. the argument parser,
  - /lib/errors.py: bundlify-specific errors,
  - /lib/hypervisor.py: the bulk of the logic for coming up with bundles from notifications, defined as an abstract class with different implementations.
- /notebooks contains Jupyter notebooks which help with visualisation and interpretation. These include valuable insights into this exercise.
- /tests contains all unit tests, written in pytest.
- /doc contains all docs, currently ADRs.
- Makefile defines targets to ease development, along with utility targets.
- Dockerfile defines our application Docker image, based on the python:3.8-alpine image; its CMD is currently set to python src/bundlifier.py data/notifications.csv > data/bundles.csv
- docker-entrypoint.sh provides a convenient entrypoint to our application, running the main Python entrypoint and redirecting output to data/bundles.csv.
- docker-compose.yml provides a convenient entrypoint to our application, based on our Dockerfile.
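To make the "abstract class with different implementations" idea concrete, here is a minimal sketch of how such a hypervisor could be laid out. Everything in it is an assumption made for illustration: the Notification and Bundle fields, the update and bundles method names, and the PerUserHypervisor strategy are hypothetical and are not taken from /lib/hypervisor.py or /lib/models/.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Notification:
    # Hypothetical fields; the real model lives in /lib/models/.
    user_id: str
    timestamp: float
    message: str


@dataclass(frozen=True)
class Bundle:
    # Hypothetical shape: a user and the notifications grouped for them.
    user_id: str
    notifications: Tuple[Notification, ...]


class Hypervisor(ABC):
    """Collects notifications; subclasses decide how to bundle them."""

    def __init__(self) -> None:
        self._pending: List[Notification] = []

    def update(self, notification: Notification) -> None:
        """Record one observed notification."""
        self._pending.append(notification)

    @abstractmethod
    def bundles(self) -> List[Bundle]:
        """Turn the pending notifications into bundles and clear them."""


class PerUserHypervisor(Hypervisor):
    """Naive strategy (assumption): one bundle per user with all pending items."""

    def bundles(self) -> List[Bundle]:
        by_user: Dict[str, List[Notification]] = {}
        for notification in self._pending:
            by_user.setdefault(notification.user_id, []).append(notification)
        self._pending.clear()
        return [Bundle(user, tuple(items)) for user, items in by_user.items()]
```

Keeping the accumulation logic in the base class and the bundling decision in subclasses would let a ranker-based strategy be swapped in without touching the entrypoint.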
Below is an illustrated flow of our main bundlify.py entrypoint.
- The bundlifier updates the hypervisor with notifications as it parses the file,
- When a day's worth of observations has elapsed, the Hypervisor computes bundles from the notifications and sends them to stdout.
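The two steps above can be sketched as a generator that walks the parsed rows and emits a batch each time a day has elapsed. This is only an illustration under stated assumptions: the timestamp and user_id column names, the ISO-like timestamp format, the time-sorted input, and the daily_batches helper are all hypothetical and do not come from the actual entrypoint.

```python
import csv
import io
from datetime import datetime, timedelta
from typing import Dict, Iterable, Iterator, List

DAY = timedelta(days=1)


def daily_batches(rows: Iterable[Dict[str, str]]) -> Iterator[List[Dict[str, str]]]:
    """Yield rows grouped into one-day windows, assuming time-sorted input."""
    window_start = None
    batch: List[Dict[str, str]] = []
    for row in rows:
        # "timestamp" is an assumed column name.
        seen_at = datetime.fromisoformat(row["timestamp"])
        if window_start is None:
            window_start = seen_at
        if seen_at - window_start >= DAY:
            # A day's worth of observations has elapsed: hand over the batch.
            yield batch
            batch = []
            window_start = seen_at
        batch.append(row)
    if batch:
        # Flush whatever is left once the file is exhausted.
        yield batch


# Made-up sample data standing in for data/notifications.csv.
sample = io.StringIO(
    "timestamp,user_id\n"
    "2017-08-01 09:00:00,u1\n"
    "2017-08-01 18:30:00,u2\n"
    "2017-08-02 10:00:00,u1\n"
)
for batch in daily_batches(csv.DictReader(sample)):
    print(len(batch), "notification(s) ready for bundling")
```

Because the rows are consumed lazily, the file never has to be held in memory in full; only the current day's notifications are kept.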
Binaries for git, make and python3 are needed. To use, follow the steps below:
- Git clone this repo anywhere, then cd <cloned_repo>
- Install dependencies in a virtual environment, e.g.:
  python3 -m venv beautiful && source $(pwd)/beautiful/bin/activate && pip install -r dev-requirements.txt
To run linting, type checking and unit tests with coverage:
make check
To print all bundles to standard output, with optional redirection, run:
PYTHONPATH=$(pwd) python src/bundlifier.py <path-to-notifications-csv> [ > <path-to-bundles.csv> ]
Running with Docker needs docker-compose as an extra dependency. To create the volume ./data:/bundlifier/data/ and run the main entrypoint using data/notifications.csv:
make run
Find your results at: /data/bundles.csv
For compute-intensive algorithms, performance can be improved.
Multiprocessing seems a better fit for our use case, as it leverages a multi-core CPU; multithreading and async programming are usually better suited to I/O-bound applications.
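As a rough illustration of the multiprocessing option, the sketch below spreads a CPU-bound scoring function over worker processes with the standard library's multiprocessing.Pool. The score function is a made-up stand-in for a ranker, and score_all and its processes parameter are hypothetical names, not part of this codebase.

```python
from multiprocessing import Pool
from typing import List, Sequence


def score(candidate: Sequence[float]) -> float:
    """Stand-in for a CPU-heavy ranker (assumption): a sum of squares."""
    return sum(x * x for x in candidate)


def score_all(candidates: List[Sequence[float]], processes: int = 1) -> List[float]:
    """Score every candidate, optionally spreading the work over processes."""
    if processes <= 1:
        # Sequential path: handy for debugging and for small inputs.
        return [score(candidate) for candidate in candidates]
    with Pool(processes=processes) as pool:
        # Pool.map returns results in the same order as the inputs.
        return pool.map(score, candidates)


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0], [0.5, 0.5, 0.5]]
    print(score_all(data, processes=2))
```

Each worker is a separate process, so the work is not serialised by the interpreter lock the way threads would be; the trade-off is the cost of pickling inputs and results, which only pays off when the per-item computation is heavy enough.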