Skip to content

Box sets#1513

Draft
audiomuze wants to merge 10 commits intoLMS-Community:public/9.2from
audiomuze:box-sets
Draft

Box sets#1513
audiomuze wants to merge 10 commits intoLMS-Community:public/9.2from
audiomuze:box-sets

Conversation

@audiomuze
Copy link
Author

damned signature again I presume.

@mikeysas
Copy link

Just adding that this PR will also address Issue #1430

Copy link
Member

@michaelherger michaelherger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't be shocked! Despite my many comments I believe this really looks good overall and will be a welcome enhancement!

The one big question I'd have was about the need for the schema change. In what case would an album have a cover and artwork? Is the former rather something like boxset_artwork?

One other question would be: this doesn't cover albums (compilations) with an album artwork, but per track cover art, does it?

A lot of the rest is what I would consider "cleanup", because presumably the AI left some cruft behind after a few iterations.

Thanks a lot for your effort!

sub updateStandaloneArtwork {
my $class = shift;
my $cb = shift; # optional callback when done (main process async mode)
my ($cb, $opts) = _extractArtworkTaskArgs(@_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no changed call of updateStandaloneArtwork() - why would we need to change how we're reading parameters? I didn't find any caller passing additional options.

Comment on lines +251 to +256
my @bind;
if ( my @albumFilter = _sanitizeAlbumIds( $opts->{albums} ) ) {
my $placeholders = join(',', ('?') x @albumFilter);
$where .= " AND tracks.album IN ($placeholders)";
push @bind, @albumFilter;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something we'd never get an albums list. Therefore this block would never be executed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s been removed.

Comment on lines 29 to +30
artwork
cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between artwork and cover? Wouldn't an album always have one single artwork, namely the cover artwork? When would we display the artwork, when the cover?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No schema change would have been needed if albums were strictly limited to a single cover per album, but the box-sets flow requires two different concepts:

albums.artwork already stores the 8-char coverid token that UI/APIs pass around. That column must stay a compact identifier; it isn’t a path and can’t point directly to a filesystem image. It also gets overwritten during normal scans any time track-level art changes, because the scanner updates albums.artwork with whichever coverid it last saw for the album.
(See box_sets_artwork.md:21-58.)

The new albums.cover column—added via schema_27_up.sql and schema_27_up.sql—persists the actual filesystem path to the “parent” artwork that should represent the entire box set. Without that extra column we have no durable way to remember “use this shared folder image for the album, even if each disc has its own coverid,” because the moment a track with different artwork is scanned, albums.artwork would flip to that disc’s ID and the parent image would be lost.

In short: albums.artwork is still populated and used everywhere, but box-set handling needs a second field to pin the authoritative source image so we can keep disc-specific art and album-level art simultaneously.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@audiomuze .

albums.artwork already stores the 8-char coverid token that UI/APIs pass around. That column must stay a compact identifier; it isn’t a path and can’t point directly to a filesystem image. It also gets overwritten during normal scans any time track-level art changes, because the scanner updates albums.artwork with whichever coverid it last saw for the album. (See box_sets_artwork.md:21-58.)

Yes albums.artwork gets set every time a track is processed in the scanner, but the post-scan process resets it to (at the moment) the image associated with the first track of the album. #1517 changes this so that a stand-alone image will be preferred over an embedded image, even if track 1 has a different embedded image. I believe it is fairly simple to enhance this further so that an image further up the directory tree (or otherwise meeting your criteria) is preferred if it exists.

BTW, the "compact identifier" is actually the first 8 characters of an md5 sum which is used as the cache key.

The new albums.cover column—added via schema_27_up.sql and schema_27_up.sql—persists the actual filesystem path to the “parent” artwork that should represent the entire box set. Without that extra column we have no durable way to remember “use this shared folder image for the album, even if each disc has its own coverid,” because the moment a track with different artwork is scanned, albums.artwork would flip to that disc’s ID and the parent image would be lost.

In short: albums.artwork is still populated and used everywhere, but box-set handling needs a second field to pin the authoritative source image so we can keep disc-specific art and album-level art simultaneously.

As long as we ensure the logic I describe above in all scan events, I don't think we need to store the full path. I pulled your branch earlier and gave it a go: you are updating the albums.artwork column to match proper image so I don't think we need the new column.

Unless I'm missing something? Always a possibility!

I'm working on some ideas along these lines in order to simplify things a bit and should have something for you to look at tomorrow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve not mapped out the full logic required for using covers for GROUPING yet and avoided making changes to scanner logic as I figured it had a high chance of rejection. Now that’s not the case I’d agree the schema change would not be required so long as the scanner gives precedence to the external as album cover in the root folder when it exists, and does the same for in-subfolder cover art where it exists. I’m inclined to deal with GROUPING cover art handling as a separate PR,but perhaps that’s naive because it may mean having to make changes to the logic implemented via this PR if accepted. Thoughts appreciated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path might still be needed when a client requests a size which hasn't been pre-cached. Eg. Default would NOT pre-cache the size used in the Now Playing area, as it's only used once every few minutes. It only pre-caches sizes which are requested in large numbers, eg. when browsing.

Can we make sure the terminology used for tracks and albums is either identical, or more self-explanatory?

(and now I'm wondering what I did for the artist portraits...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold on, the coverid is not the cache key itself, but an identifier which is a hash of different attributes (path, size, timestamp), like a checksum.

Comment on lines +511 to +513
my $dir_blob = $album->{directories} || '';
my @dirs = $dir_blob ? split(/,(?=file:\/\/)/, $dir_blob) : ();
my @paths = map { Slim::Utils::Misc::pathFromFileURL($_) } @dirs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented above to leave the parsing to the Perl code. So if you received a list of file://path/to/music/track01.flac,file://path/to/music/track02.flac,... the following might do the job:

my @paths = Slim::Utils::Misc::uniq(map {
	Path::Class::dir( Slim::Utils::Misc::pathFromFileURL($_) )->parent;
} split ',', $album->{directories} || '');

I didn't test this. But what it's supposed to do is create a list @paths from the query's comma separated directories. Start by splitting it into a list. Then take each elemnt of the list, transform form URL to path, and use Path::Class to get its parent. uniq will remove duplicates. I hope you get the idea and can fix what I've broken there 😁.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored Artwork.pm so updateDiscSetArtwork() now queries one row per album, using GROUP_CONCAT(DISTINCT tracks.url) to build a directories blob, and parse it inside Perl. Now processes each album with a clean list of unique directories before resolving the common parent and generating the coverid.
Added _albumDirectoriesToPaths() in Artwork.pm:625-666 to convert the aggregated blob into deduplicated filesystem paths (via pathFromFileURL + Path::Class::file(...)->dir) to let Perl handle the parsing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelherger, think I've addressed all your comments locally on my desktop and updated my box-sets branch online also. Guessing that means all changes reflected here also?

@michaelherger
Copy link
Member

Oh, one more thing: please rebase on top of 9.2. I'm sorry about that...

@michaelherger
Copy link
Member

One more thing... in the tracks table we store a cover ID which uses file path and some more metadata to make sure we don't miss file updates. Wouldn't this be needed here, too?

@audiomuze
Copy link
Author

I'll work through these over the weekend. Anywhere you see additional logging is temporary and just me wanting to see evidence of certain activities to understand whether something's working as intended.

@mikeysas
Copy link

One other question would be: this doesn't cover albums (compilations) with an album artwork, but per track cover art, does it?

From my testing it does if the per track cover art is embedded in a tag. I tested varied embedded artwork in tracks with a cover.jpg file that was picked up for the Album Artwork.

I don’t think you need to add support for a track_title.jpg personally. IMHO, supporting that with embedded artwork is sufficient.

@audiomuze
Copy link
Author

Oh, one more thing: please rebase on top of 9.2. I'm sorry about that...

done.

@darrell-k
Copy link
Contributor

Can we hold off on merging this for now, I need to get to the bottom of a bug in a related area, see https://forums.lyrion.org/forum/user-forums/logitech-media-server/1812320-issue-with-displaying-classical-works-in-lms-9-1-0

I'll have a proper look at this PR tomorrow to ascertain whether there is crossover or not.

@audiomuze
Copy link
Author

There are still some of @michaelherger's comments I need to address. Don't think there'll be anything to merge until next weekend of so, and as I'm expecting further feedback, even that is optimistic. Also, I've not yet addressed handling covers by grouping rather than discnumber. I'll likely tack that when my tokens reset and there's not a 3x multiplier on Claude.

@audiomuze
Copy link
Author

I don’t think you need to add support for a track_title.jpg personally. IMHO, supporting that with embedded artwork is sufficient.

I'd agree with that - anyone adding individual track covers to compilations has worse OCD than I do, and is likely embedding them. As someone recently said, "Fix your tags" would be my response.

@mikeysas
Copy link

I'd agree with that - anyone adding individual track covers to compilations has worse OCD than I do, and is likely embedding them. As someone recently said, "Fix your tags" would be my response.

FYI - The better use case I have seen for varying artwork by tracks is with some artists who are providing digital downloads of albums with embedded artwork unique to each track, but again it is embedded so this will be covered. I agree that that tagging a compilation like is a bit over the top, but embedding the artwork will allow for support of that also.

@darrell-k
Copy link
Contributor

There are still some of @michaelherger's comments I need to address. Don't think there'll be anything to merge until next weekend of so, and as I'm expecting further feedback, even that is optimistic. Also, I've not yet addressed handling covers by grouping rather than discnumber. I'll likely tack that when my tokens reset and there's not a 3x multiplier on Claude.

I've had a look through this PR and also scanned the forum thread. I don't think there's any crossover with #1517 that I submitted earlier today, as that problem was with track level images which I don't think you are changing here, and also the solution was an amendment to Slim/Control/Queries.pm, which again is not part of this PR.

One thing I did find in my analysis for #1517 though: the post-scan process Slim::Music::Artwork::updateStandaloneArtwork (which does affect embedded artwork despite its name) will mostly but not always consolidate tracks.trackid for an album to a single value for embedded images of the same size (the scanner initially creates a different coverid for each track because it's based on an md5 sum which includes the track path). I don't know if this is something that needs to be considered for this PR?


- `tracks.coverid` / `tracks.cover` represent per-track artwork.
- `tracks.coverid` is an 8-hex ID.
- `tracks.cover` is either a filesystem path (standalone image) or a special value indicating embedded artwork.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually the size of the embedded artwork.

@darrell-k
Copy link
Contributor

One thing I did find in my analysis for #1517 though: the post-scan process Slim::Music::Artwork::updateStandaloneArtwork...

Also Slim::Music::Artwork::precacheAllArtwork

Copy link
Member

@michaelherger michaelherger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate your effort looking into this, really. But honestly: I'm struggling to even finish the review... I've already spent more than an hour going through the latest iteration, and I'm far from done. There's so much noise in this code. Half of it seems to be validations for whatever could be wrong. Nothing wrong with defensive programming. But it's just too much. Sometimes we'd check a value before passing it to a sub, then first thing in the sub we would check it again. Or we remove whitespace from a potential number in a string using one regex, just to follow it by another regex which would extract the number only. Check whether the value was a ref, just to then ignore the result. And so on. Some helper routines would basically be a one-liner without the validation - which was already done before the function was called. So just more noise in the form of a separate sub which is being used exactly once.

I really don't know how to continue with this. Whether I should just merge it and clean it all up myself. Or invest the time hoping you can pick some up for the future. Or whether I should try to re-implement the crucial parts myself. Or whether I should wait for @darrell-k to pick it up 😂. But I believe this is not going to be a success story for vibe coding... I believe the change might be doing what it's supposed to do. But it's not maintainable the way it is.

Comment on lines +190 to +193
my ($class, %args) = @_;
my $cb = delete $args{cb};
$cb = undef unless ref $cb eq 'CODE';
my $albumFilterArg = delete $args{albums};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not change the signature of an existing function. It's unlikely anyone is using it, but still. If you need an additional parameter - so be it, add another parameter to the list. But don't touch existing behaviour as you might break existing code from third parties.

my $isInfo = main::INFOLOG && $log->is_info;

my @names = qw(cover Cover thumb Thumb album Album folder Folder);
my $discNumber = _normalizedDiscNumber( eval { $track->disc } );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for an eval: if you have a track object, that attribute must be defined. It might potentially hide other issues.

my $isInfo = main::INFOLOG && $log->is_info;

my $art = $findArtCache{$dirurl};
my $discNumber = _normalizedDiscNumber( $trackAttributes->{disc} // $trackAttributes->{'tracks.disc'} );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances would we have ->{disc} and when ->{'tracks.disc'}? Do we really need both?

$candidate = Slim::Utils::Misc::pathFromFileURL($candidate) || $candidate;
}

my $base = eval { basename($candidate) } || $candidate;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No eval please. basename would already return the full string if can't be parsed. And then you could return directly instead of checking for $base repeatedly below:

my $base = basename($candidate) || return;

Comment on lines +705 to +706
return _normalizedDiscNumber($1) if defined $base && $base =~ /^\s*(\d+)[-_]/;
return _normalizedDiscNumber($1) if defined $base && $base =~ /^(?:disc|cd)\s*(\d+)/i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's making a bunch of too simplistic assumptions. What about 01. First Track.mp3? Or German DIsk_01? French Disque No. 2? (I don't speak more languages, but you get the point 😁). And then there are albums like Adele's "18" and "21".

We certainly can't cover them all, but need to think about this some more.

Comment on lines +75 to +83
my $discCacheKey = defined $discNumber ? $dirurl . '|disc:' . $discNumber : undef;
my $art;

if ( defined $discCacheKey && exists $findArtCache{$discCacheKey} ) {
$art = $findArtCache{$discCacheKey};
}
elsif ( exists $findArtCache{$dirurl} ) {
$art = $findArtCache{$dirurl};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's again way too noisy for my taste: the exists $findArtCache($arg) and followed assignment basically do the work twice. First to check whether a value exists, then to read it.

my $art = $findArtCache{"$dirurl|disc:$discNumber"} || $findArtCache{$dirurl};

The above would do the same (assuming an empty string or 0 are invalid path names).

Comment on lines +402 to +405
my ($class, %args) = @_;
my $cb = delete $args{cb};
$cb = undef unless ref $cb eq 'CODE';
my $albumFilterArg = delete $args{albums};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please as before: just use positional parameters, even if only for consistency.

Copy link
Author

@audiomuze audiomuze Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly vibe coding this isn't working, and I've zero Perl skills. I think the lesson to be learned for me here is that I should at least take a peek at the code and try to follow the logic of what's being done to catch stupid things. If this were Python it'd be a different story, but I was hoping to be able to implement changes without having to invest in Perl. If you decide to abandon it, no offence taken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is a good one. I've been concentrating on simplifying the higher level logic, rather than @michaelherger 's more technical review. Later today I'll create a new draft PR to compare, and let's take it from there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I had it compare the branch against the main upstream slimserver and then look at box-sets critically looking for inefficiency, redundancy etc.

This what it conjured:
Box Sets Functionality

  • Artwork.pm:23-239 now derives disc numbers from tags and path heuristics, seeds disc/subtitle-specific filename templates, and keeps a disc-aware cache so cover-disc2.jpg in shared folders can coexist with the standard names that upstream prefers.
  • Artwork.pm:392-530 adds updateDiscSetArtwork(), which gathers every local multi-disc album, computes the common parent directory, and scans it for cover|folder|album|thumb (plus prefs-defined names) to populate the new albums.cover column introduced in schema_27.
  • When artwork is served, Graphics.pm:260-314 now checks whether a requested coverid is tied to albums.cover and, if so, serves the parent-folder asset instead of a track-level image; the accompanying behavior is documented for users in box_sets_artwork.md:1-210.

Issues & Redundancies

  • Artwork.pm:193-244 accepts an albums => [...] filter for updateStandaloneArtwork, but the parameter is never consumed in the SQL. Targeted album rescans (e.g., Commands.pm:2768-2811) still force a full-table walk. Either wire the IDs into the WHERE clause or drop the arg to avoid dead API surface.
  • When the scanner runs, every track now logs candidate filenames with $log->error (Artwork.pm:64-110). That produces error-level noise for normal operation and slows scans substantially on large libraries. Switch this to DEBUG (or guard it behind an explicit pref) to keep logs and IO sane.
  • _findPreferredArtworkFile() scans every image in a directory, builds an array, sorts preferred names, and only then returns the first match (Artwork.pm:556-626). Because findStandaloneArtwork() calls this per track directory, the box-sets branch now performs an O(N) walk even when the first preferred filename exists. Stream the iterator (return as soon as a preferred basename is seen) or cache directory listings so multi-disc albums don’t repeatedly traverse the same folder.
  • updateDiscSetArtwork() unconditionally writes artwork and cover for every multi-disc album on every scan (Artwork.pm:492-517), even when the stored values already match. That thrashes the database, invalidates artwork caches, and causes unnecessary schema commits. Compare against the current albums.cover path/coverid before issuing the UPDATE.
  • The SQL feeding updateDiscSetArtwork() builds GROUP_CONCAT(DISTINCT tracks.url) per album (Artwork.pm:428-462). On MySQL the default group_concat_max_len (1024) truncates long path lists, so large box sets stop finding a common parent. Instead, select DISTINCT dirname(tracks.url) via a subquery or temp table so you never hit string limits and you ship fewer bytes across Perl.
  • The new post-processing pass is invoked twice: once in the dedicated scanner (Import.pm:458-495) and again when in-process rescans finish (Local.pm:1206-1234). Running updateStandaloneArtwork → updateDiscSetArtwork → precacheAllArtwork twice back-to-back doubles the work for online rescans. Gate one of the call sites (e.g., only run the Local.pm chain when main::SCANNER is false) or refactor to a single entry point.
  • Every artwork HTTP request now hits the DB with SELECT cover FROM albums WHERE artwork = ? regardless of whether the coverid is even album-scoped (Graphics.pm:260-314). For standard albums this adds an extra prepare/execute/fetch per request with no benefit. Cache the handful of coverids that actually map to albums.cover, or only perform the lookup after a track-level query fails.
  • GROUP_CONCAT aside, there is no code to clear albums.cover when a parent image disappears; once set, the path remains even if the file is deleted, and neither updateStandaloneArtwork() nor updateDiscSetArtwork() ever removes it. Add a verification step (stat the file and UPDATE cover = NULL when missing) to prevent serving stale filesystem paths.
  • Addressing these items will trim the cruft, cut redundant scanning work, and keep the new box-set behavior maintainable. Next steps I’d suggest are (1) deciding whether updateStandaloneArtwork should actually honor the caller’s album list and (2) profiling the directory walk helper so you can choose between caching and early exit strategies.

Comment on lines +539 to +553
my @urls = split /,(?=file:\/\/)/, $blob;
my @paths;

for my $url (@urls) {
next unless $url =~ /^file:\/\//;
my $path = Slim::Utils::Misc::pathFromFileURL($url);
next unless $path;

my $dir = eval { Path::Class::file($path)->dir->stringify };
next unless $dir;

push @paths, $dir;
}

return Slim::Utils::Misc::uniq(@paths);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sub _albumDirectoriesToPaths {
	my $blob = shift || '';

	return Slim::Utils::Misc::uniq(
		grep { $_ }
		map { 
			my $path = Slim::Utils::Misc::pathFromFileURL($_);
			eval { Path::Class::file($path)->dir->stringify };
		}
		grep { /^file:\/\// } split(/,(?=file:\/\/)/, $blob)
	);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my suggestion is untested only wanted to highlight a less noisy way to do the same.

Comment on lines +683 to +685
if ( !ref $value ) {
$value =~ s/^\s+|\s+$//g;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we return immediately if the value was a ref?

Comment on lines +687 to +690
if ( defined $value && $value =~ /(\d+)/ ) {
my $disc = int($1);
return $disc if $disc > 0;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we're extracting the number from $value. We wouldn't have needed to remove trailing/leading spaces from it above.

@darrell-k
Copy link
Contributor

Just an update as to where I am with this.

I found there are bugs in the existing code when doing a new & changed scan (in the Slim::Music::Artwork->updateStandaloneArtwork() routine) when an album is in multiple folders with multiple images.

This is something else the LLM approach isn't addressing. I'm currently working through the problems as well as trying to refactor the changes in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants