botheredbybees/memory_jukebox

memory-jukebox

A personal photo-and-music reminiscence tool: pick a song you love, and watch a reel of photos from your own archive drift past, matched to the song's era, mood, and imagery. Intended for a family audience (one household, not public distribution) and designed to draw on the well-documented phenomenon of music-evoked autobiographical memory.

This repository currently holds the music metadata ingest pipeline — the first of several components. The photo pipeline lives separately in photo-dedupe; the jukebox player itself is not yet built.


What this component does

music_ingest.py builds a standalone SQLite database (music_index.db) that describes your music collection in enough detail to drive the jukebox's photo-matching engine. It reads from four sources and progressively enriches what it knows:

  1. Jellyfin album.nfo files — one per album folder, with inline track listings, genre, year, and MusicBrainz album IDs.
  2. Jellyfin artist.nfo files — per artist, with biography and MusicBrainz artist IDs.
  3. LRC lyric files — plain-text lyrics alongside audio files.
  4. MediaMonkey MM5.DB — play counts, ratings, date added, and last-played timestamps. This is where personal significance comes from: a song played 100 times and rated five stars matters differently from one that's merely present in the library.
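
As an illustration of source 3, the sketch below shows one way LRC text could be reduced to plain lyrics. It is not taken from music_ingest.py: the function name and the two regular expressions are assumptions, and real LRC files may contain variants this does not cover.

```python
import re

# Timestamps such as [00:12.34] or [01:02:03] (assumed formats).
TIMESTAMP = re.compile(r"\[\d{1,3}:\d{2}(?:[.:]\d{1,3})?\]")
# Whole-line metadata tags such as [ar:Artist], [ti:Title], [offset:+100].
METADATA = re.compile(r"^\[[A-Za-z#]+:.*\]$")


def lrc_to_plain_text(lrc: str) -> str:
    """Return the lyric lines of an LRC document without timestamps or tags."""
    lines = []
    for raw in lrc.lstrip("\ufeff").splitlines():
        line = raw.strip()
        if METADATA.match(line):
            continue  # drop metadata-only lines
        line = TIMESTAMP.sub("", line).strip()
        if line:  # skip lines that held only timestamps or whitespace
            lines.append(line)
    return "\n".join(lines)
```

A line carrying several timestamps (a repeated chorus, for example) comes out once, since every timestamp on the line is removed before the text is kept.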

Three optional enrichment passes then generate the semantic themes that will drive photo-matching:

  1. Genre dictionary — a small built-in map (folk → acoustic, reflective, earthy, intimate, storytelling; celtic → sea, rain, diaspora, pastoral, old, windswept; and so on).
  2. MusicBrainz artist-tag API — community-contributed mood and style tags, queried once per artist, applied to every song by that artist.
  3. Local LLM via Ollama — themes grounded in the artist biography and song lyrics, producing concrete imagery vocabulary (coast, dusk, kitchen, road) that matches the vocabulary the photo tagger uses.

Each phase is an independent, resumable subcommand. You can run them in any order, stop and restart freely, and re-run any one without affecting the others.


Why a separate database?

The music ingest is independent of the photo ingest. Writing to its own SQLite file (music_index.db) means:

  • The music pipeline can run on one machine while the photo pipeline runs on another.
  • The music database is a self-contained artefact — easy to back up, move, or share.
  • The jukebox player joins the two databases at query time via SQLite's ATTACH DATABASE — no schema coupling until the last possible moment.
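
A minimal sketch of the query-time join, assuming a photo database named ingest_index.db. The table and column names (songs.title, songs.year, photos.path, photos.year) are hypothetical stand-ins, since neither schema is spelled out here; only the ATTACH DATABASE mechanism is the point.

```python
import os
import sqlite3


def _build(path: str, statements: list[str]) -> None:
    """Create a small demo database at path."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success
            for statement in statements:
                conn.execute(statement)
    finally:
        conn.close()


def demo_attach(workdir: str) -> list[tuple[str, str]]:
    music_path = os.path.join(workdir, "music_index.db")
    photo_path = os.path.join(workdir, "ingest_index.db")
    _build(music_path, [
        "CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)",
        "INSERT INTO songs (title, year) VALUES ('Example Song', 1984)",
    ])
    _build(photo_path, [
        "CREATE TABLE photos (id INTEGER PRIMARY KEY, path TEXT, year INTEGER)",
        "INSERT INTO photos (path, year) VALUES ('1984/beach.jpg', 1984)",
        "INSERT INTO photos (path, year) VALUES ('1999/party.jpg', 1999)",
    ])

    conn = sqlite3.connect(music_path)
    try:
        # The second file becomes visible under the schema name "photos".
        conn.execute("ATTACH DATABASE ? AS photos", (photo_path,))
        return conn.execute(
            "SELECT s.title, p.path FROM songs AS s "
            "JOIN photos.photos AS p ON p.year = s.year ORDER BY p.path"
        ).fetchall()
    finally:
        conn.close()
```

Because the two files only meet inside this one connection, either pipeline can change its own tables without touching the other until the join query itself needs updating.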

Quick start

pip install -r requirements.txt

# Create the database (or just run any other phase — the schema is auto-created)
python music_ingest.py init --music-db D:\music_archive\music_index.db

# Ingest Jellyfin album and artist NFOs
python music_ingest.py jellyfin --music-db D:\music_archive\music_index.db --nfo-root D:\music

# Attach any LRC lyric files to their songs
python music_ingest.py lyrics --music-db D:\music_archive\music_index.db --music-root D:\music

# Merge MediaMonkey play counts, ratings, and dates
python music_ingest.py mediamonkey --music-db D:\music_archive\music_index.db --mm-db "%APPDATA%\MediaMonkey5\MM5.DB"

# Generate themes (each phase is independent — run as many as you want)
python music_ingest.py themes-genre --music-db D:\music_archive\music_index.db
python music_ingest.py themes-musicbrainz --music-db D:\music_archive\music_index.db
python music_ingest.py themes-llm --music-db D:\music_archive\music_index.db --year-from 1971 --year-to 1991

# See what you've got
python music_ingest.py report --music-db D:\music_archive\music_index.db

Requirements

requests>=2.31    (MusicBrainz API calls; optional)
ollama>=0.4       (LLM theme generation; optional)
tqdm>=4.0         (progress bars; optional)

requests, ollama, and tqdm are all optional — the script degrades gracefully if any are missing, refusing only the phases that need them. Only Python 3.10+ is required; everything else is standard library.

An NFO-saving Jellyfin music library is required for Phases 1 and 2 (the album.nfo and artist.nfo sources read by the jellyfin subcommand). To enable NFO saving in Jellyfin, go to Dashboard → Libraries → (your music library) → Manage Library → turn on "Nfo" in the metadata savers list. The ingest handles UTF-8 BOMs, mangled ampersands, and stray whitespace without complaint.

A running Ollama server with a text-only LLM (llama3.1:8b, qwen2.5:7b, or similar) is required for Phase 7 (the themes-llm subcommand) only. The default host http://192.168.1.20:11434 can be overridden with --ollama-host.
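
For orientation, the sketch below shows how a theme request to Ollama's /api/generate endpoint might be assembled with the standard library alone. The prompt wording, the function names, and the comma-separated reply format are all assumptions; the script's real prompt and parsing are not shown in this README.

```python
import json
import urllib.request

DEFAULT_HOST = "http://192.168.1.20:11434"  # default host quoted above


def build_generate_request(host: str, model: str, biography: str,
                           lyrics: str) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    prompt = (  # hypothetical prompt wording
        "List concrete imagery themes for this song as comma-separated "
        f"single words.\n\nArtist biography:\n{biography}\n\nLyrics:\n{lyrics}"
    )
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_themes(response_text: str) -> list[str]:
    """Split a comma-separated reply into unique lowercase themes."""
    themes: list[str] = []
    for part in response_text.split(","):
        theme = part.strip().lower()
        if theme and theme not in themes:
            themes.append(theme)
    return themes


def generate_themes(host: str, model: str, biography: str,
                    lyrics: str) -> list[str]:
    """Send the request; needs a reachable Ollama server."""
    request = build_generate_request(host, model, biography, lyrics)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return parse_themes(body["response"])
```

Keeping request building and reply parsing separate from the network call means both can be checked without a running server.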


What's in the database

The database has three primary tables and one operational log:

| Table | What it holds |
| --- | --- |
| artists | One row per artist, with biography, MusicBrainz ID, and genre. |
| songs | One row per song (or remix variant). Bibliographic fields, file path, lyrics, MediaMonkey play-stats, derived significance score. |
| song_themes | Many-to-many of songs to themes, tagged with source so you can tell genre-dictionary themes from MusicBrainz from LLM-generated. |
| music_ingest_log | Every run appends a row with a JSON stats blob, useful for debugging long runs. |

The significance_score on each song is a derived value: log(play_count + 1) × (rating / 5.0), with unrated songs using 0.5 as a neutral multiplier. This single number captures "how much does this song seem to have mattered to me" and will weight the jukebox's song selection later.


Subcommand reference

| Subcommand | Required flags | What it does |
| --- | --- | --- |
| init | --music-db | Create or verify the database schema. |
| jellyfin | --music-db, --nfo-root | Parse all album.nfo and artist.nfo files under --nfo-root. Match each track to an audio file in its album folder. |
| lyrics | --music-db, --music-root | Find all *.lrc files under --music-root. Strip timestamps and metadata tags; attach to the matching song. |
| mediamonkey | --music-db, --mm-db | Open MM5.DB read-only, merge play counts and ratings into existing songs or insert new MM-only rows. Close MediaMonkey first. |
| themes-genre | --music-db | Apply the built-in genre-to-themes dictionary to every song with a genre. |
| themes-musicbrainz | --music-db | Query MusicBrainz for every artist with an MBID; apply returned tags to every song by that artist. Rate-limited to 1 request/second. |
| themes-llm | --music-db | Generate themes via Ollama. Supports --year-from, --year-to, --min-significance, --require-lyrics, --limit. |
| report | --music-db | Print coverage summary, top 20 songs by significance, and theme breakdown. |
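
The 1 request/second limit in themes-musicbrainz can be enforced with a very small helper. The class below is a sketch under that assumption, not the script's code; the clock and sleep parameters are injectable purely so the behaviour can be checked without real waiting.

```python
import time


class RateLimiter:
    """Ensure at least min_interval seconds between successive calls."""

    def __init__(self, min_interval: float = 1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self) -> None:
        """Block until a new request is allowed, then record the time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

In use, wait() would be called immediately before each artist lookup, for example a GET to https://musicbrainz.org/ws/2/artist/<mbid>?inc=tags&fmt=json sent with a descriptive User-Agent header. Because tags are fetched once per artist rather than once per song, the total run time scales with the number of artists.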

Known limitations

  • MediaMonkey schema drift. The MM4 and MM5 schemas share the same core tables (Songs, Artists, Albums, Genres) but MM6 or later versions may change this. If the query fails, the script prints a diagnostic and you can inspect the schema with sqlite3 MM5.DB .schema.
  • Windows path case sensitivity. The MM-to-Jellyfin path matching lowercases both sides, but if your libraries reference the same files via different mount points or drive letters, path-match will fail and fuzzy artist+title matching takes over. Check the matched_path vs matched_fuzzy numbers in the mediamonkey phase report.
  • Artist duplicates. Jellyfin may write an artist name differently on different albums ("The Beatles" vs "Beatles"). Each distinct spelling becomes a separate artists row. A one-off SQL UPDATE can merge these when they matter.
  • Music copyright. Designed for household/personal use only. Distribution of the resulting reels with copyrighted music would need separate licensing or substitution with Creative Commons material.
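
The path-match-then-fuzzy fallback described in the second limitation can be sketched as two dictionary lookups. The key functions and the "unmatched" label are hypothetical; only the matched_path and matched_fuzzy names come from the mediamonkey phase report.

```python
import re


def path_key(path: str) -> str:
    """Lowercase a path and unify separators so both libraries compare equal."""
    return path.replace("\\", "/").lower()


def fuzzy_key(artist: str, title: str) -> tuple[str, str]:
    """Reduce artist and title to lowercase ASCII letters and digits."""
    def squash(text: str) -> str:
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return squash(artist), squash(title)


def match_song(mm_row: dict, by_path: dict, by_fuzzy: dict):
    """Return (song_id, how): exact path first, then artist+title."""
    song_id = by_path.get(path_key(mm_row["path"]))
    if song_id is not None:
        return song_id, "matched_path"
    song_id = by_fuzzy.get(fuzzy_key(mm_row["artist"], mm_row["title"]))
    if song_id is not None:
        return song_id, "matched_fuzzy"
    return None, "unmatched"
```

A different drive letter or mount point changes the path key, which is why such libraries fall through to the artist+title lookup. Note that this simple squash drops non-ASCII characters, so titles written in other scripts would need a gentler normalisation.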

Related components

  • photo-dedupe — ingest, dedup, AI tagging, blur culling, and similar-scene culling for the photo archive. Independent development, same household, shared design philosophy.
  • match.py (planned) — the jukebox matching engine. Uses SQLite ATTACH DATABASE to join music_index.db and ingest_index.db at query time.
  • player/ (planned) — the jukebox itself. Likely a single-file HTML app using sql.js for local database access, or a Raspberry Pi embedded variant with a physical interface.

References

Istvandity, L. (2017). Combining music and reminiscence therapy interventions for wellbeing in elderly populations: A systematic review. Complementary Therapies in Clinical Practice, 28, 18–25. https://doi.org/10.1016/j.ctcp.2017.03.003

Janata, P. (2009). The neural architecture of music-evoked autobiographical memories. Cerebral Cortex, 19(11), 2579–2594. https://doi.org/10.1093/cercor/bhp008

Krumhansl, C. L., & Zupnick, J. A. (2013). Cascading reminiscence bumps in popular music. Psychological Science, 24(10), 2057–2068. https://doi.org/10.1177/0956797613486486

Lazar, A., Thompson, H., & Demiris, G. (2014). A systematic review of the use of technology for reminiscence therapy. Health Education & Behavior, 41(1_suppl), 51S–61S. https://doi.org/10.1177/1090198114537067

Westerhof, G. J., & Bohlmeijer, E. T. (2014). Celebrating fifty years of research and applications in reminiscence and life review: State of the art and new directions. Journal of Aging Studies, 29, 107–114. https://doi.org/10.1016/j.jaging.2014.02.003
