ETT-1346: testing for populate_rights_data.pl #79
aelkiss commented May 12, 2026
- wrap populate_rights_data.pl in "main" function so we can unit test
- use Test2::Tools::Spec (similar to Test::Spec but maintained)
- remove unused --source option from populate_rights
- don't set a default source (from old HT-1733)
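A minimal sketch of the "main" wrapping described in the first item, not the actual script: the file-scope logic moves into `main`, which runs only when the file is executed directly, so a test can load the file and call individual subs. The sub `parse_rights_line` and the field names are hypothetical; the field order is inferred from the sample line in the tests.

```perl
use strict;
use warnings;

# Hypothetical helper: split one tab-separated rights line into named fields.
sub parse_rights_line {
    my ($line) = @_;
    chomp $line;
    my %record;
    @record{qw(id attr reason user source)} = split /\t/, $line;
    return \%record;
}

# Everything that used to run at file scope lives here.
sub main {
    my @files = @_;
    my $count = 0;
    for my $file (@files) {
        open(my $fh, '<', $file) or die "Cannot open $file: $!\n";
        while (my $line = <$fh>) {
            my $record = parse_rights_line($line);
            $count++ if defined $record->{id};
        }
        close($fh);
    }
    return $count;
}

# caller() is false when the file is run as a script and true when a test
# loads it with require or do, so loading it from a test runs nothing.
main(@ARGV) unless caller;

1;
```

A test file can then `require` the script and exercise `parse_rights_line` directly without triggering a full run.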
    'archive=s' => \$archive,
    'rights_dir=s' => \$rights_dir,
    'note=s' => \$note,
    'source=s' => \$new_source_cmdline,
As far as I know this is not an option we would ever use.
    }

    # Structure to keep track of results - which records were created.
    my %results;
As written now, these need to be global, which makes this a bit more difficult to test. We could consider turning this whole thing into an object to have some place to store those results.
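One possible shape for that object, as a sketch only: a small class that owns the results so each test can create a fresh instance instead of sharing a global hash. The class name `RightsLoadResults` and its methods are hypothetical and do not appear in the script.

```perl
use strict;
use warnings;

package RightsLoadResults {    # hypothetical class name
    sub new {
        my ($class) = @_;
        return bless { created => [], errors => [] }, $class;
    }

    # Remember an identifier whose rights record was created.
    sub record_created {
        my ($self, $id) = @_;
        push @{ $self->{created} }, $id;
        return $self;
    }

    # Remember a line that could not be loaded, and why.
    sub record_error {
        my ($self, $id, $message) = @_;
        push @{ $self->{errors} }, { id => $id, message => $message };
        return $self;
    }

    sub created_count { return scalar @{ $_[0]{created} } }
    sub error_count   { return scalar @{ $_[0]{errors} } }
}

my $results = RightsLoadResults->new;
$results->record_created('prtest.goodline1');
$results->record_error('badline', 'Invalid namespace/barcode');
printf "%d created, %d errors\n", $results->created_count, $results->error_count;
```

Because the state lives in an instance, nothing leaks from one test to the next.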
    print $rights "prtest.goodline2\tpd\tbib\ttestuser\tgoogle\n";
    close($rights);

    my $res = qx(perl -w bin/populate_rights_data.pl --data=$tempdir/testfile3.rights --archive=$tempdir/archive 2>&1);
I wanted to avoid too much refactoring without tests. Adding main got it to the point where we could unit test anything, but running main itself leaves things in a weird state and potentially calls exit and ends the tests early. I guess the question is if it's worth it right now to do more refactoring on populate_rights_data.pl to make this more amenable to integration tests as well without needing to shell out.
I would put a ticket in to refactor the logic in a more OO way, but would be perfectly satisfied with the new level of testing represented here.
7feb98f to cacc03f
@moseshll questions to consider for review:
    use File::Temp qw(tempdir);
    use POSIX qw(strftime);
    use Test2::Bundle::Extended;
    use Test2::Tools::Spec;
This is similar to Test::Spec in syntax, but appears to be maintained.
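For readers who have not used it, a small sketch of the `describe`/`it` style follows. It assumes Test2::Suite is installed; `split_id` is a hypothetical function under test, not one from populate_rights_data.pl.

```perl
use Test2::Bundle::Extended;
use Test2::Tools::Spec;

# Hypothetical function under test: split "namespace.barcode" at the first dot.
sub split_id {
    my ($id) = @_;
    my ($namespace, $barcode) = split /\./, $id, 2;
    return ($namespace, $barcode);
}

describe "split_id" => sub {
    my $id;

    # Runs before every "it" block inside this describe.
    before_each "set identifier" => sub { $id = 'prtest.goodline1' };

    it "returns the namespace" => sub {
        my ($namespace) = split_id($id);
        is($namespace, 'prtest');
    };

    it "returns the barcode" => sub {
        my (undef, $barcode) = split_id($id);
        is($barcode, 'goodline1');
    };
};

done_testing;
```

The blocks are collected when the file loads and are run when `done_testing` is called.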
    like $@, qr(Invalid source);
    };

    it "requires source if not previously loaded" => sub {
This is new behavior. Bib rights is generally responsible for getting the "source" (digitizer) from the Zephir metadata and setting it. In practice, this prevents us from accidentally loading rights for things that aren't in the repository.
    is([2,1,4],$rights);
    };

    it "new source updates access profile (as specified in sources)" => sub {
The access profile stuff is kind of a mess; it's specified in multiple tables -- ht_collection_digitizers as well as sources. Ideally we'd probably be using ht_collection_digitizers in preference to sources. (There are also multiple keys for the digitization agent, and part of this stems from work around sources and access profiles that was started in 2014 and never fully completed after some staff departures.)
    ok($?);
    ok($res =~ /Invalid namespace\/barcode/);

    # Should have loaded goodline1, but not goodline2 (since it bailed out after badline)
I'm not sure this is really the behavior that we want, but it is the current behavior.
We should probably consider what we want to happen in this case -- I suspect ideally it would load the rest of the file and alert us in some way to the problem line, or maybe load nothing (but still alert.) Loading part of the file seems like a recipe for problems.
Personally I do not like the idea of loading part of a file. Bad lines should be skipped and reported, and since they're independent of each other a bad one need not bring everything to a halt. (I suspect the most likely issue we will see is Unicode or other text weirdness in the note field.)
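A sketch of the skip-and-report behavior suggested here, under stated assumptions: it is not the current behavior of populate_rights_data.pl, and `parse_line` and `load_lines` are hypothetical names with deliberately simple validation rules. Each line is parsed inside `eval`, so one bad line is recorded with its line number and the loop carries on.

```perl
use strict;
use warnings;

# Hypothetical validator: five tab-separated fields, identifier shaped like
# namespace.barcode. Dies with a reason when the line is unusable.
sub parse_line {
    my ($line) = @_;
    my @fields = split /\t/, $line, -1;
    die "expected 5 fields, got " . scalar(@fields) . "\n" unless @fields == 5;
    die "Invalid namespace/barcode\n" unless $fields[0] =~ /^[^.\s]+\.\S+$/;
    return \@fields;
}

# Load every good line; collect the bad ones instead of stopping.
sub load_lines {
    my @lines = @_;
    my (@loaded, @skipped);
    my $line_number = 0;
    for my $line (@lines) {
        $line_number++;
        chomp(my $copy = $line);
        my $fields = eval { parse_line($copy) };
        if ($fields) {
            push @loaded, $fields->[0];
        }
        else {
            (my $why = $@) =~ s/\n\z//;
            push @skipped, "line $line_number: $why";
        }
    }
    return (\@loaded, \@skipped);
}

my ($loaded, $skipped) = load_lines(
    join("\t", qw(prtest.goodline1 pd bib testuser google)) . "\n",
    "badline\n",
    join("\t", qw(prtest.goodline2 pd bib testuser google)) . "\n",
);
print "loaded: @$loaded\n";
print "skipped: $_\n" for @$skipped;
```

With this shape both good lines load and the bad line is reported once, which a caller could turn into a nonzero exit status or an alert.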
I have a couple of suggestions. is a repeated trope that can be stuck in something like Another thing I would add to increase coverage (and in the gfv overrides tests. More test coverage we want in place for OO refactor. In the main the changes make sense and seem well exercised. The coverage achieved here is respectable for a monolith with so much edge case resilience baked in.
moseshll left a comment
I would definitely add the suggested additional gfv tests, provided they are actually correct (I think they are). All else passes muster.
* add constants for numerical values
* make process_rights_line tests more readable by using join("\t",...)
* stub out tests for harvard access profile (skipped; currently failing)
* use selectrow_array when only fetching a single value
* use placeholders for values selectrow_arrayref (more to do)
* add additional gfv tests
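As an illustration of the first two items in this list, a short sketch: the constant names and their values are placeholders chosen for the example, not the actual IDs used in the rights database, and `rights_line` is a hypothetical helper.

```perl
use strict;
use warnings;

# Placeholder names and values, for illustration only.
use constant {
    EXAMPLE_ATTR_ID   => 2,
    EXAMPLE_REASON_ID => 1,
};

# Build a tab-separated input line from a list of fields, so a test reads
# as a list of values rather than a string full of \t escapes.
sub rights_line {
    my @fields = @_;
    return join("\t", @fields) . "\n";
}

my $line = rights_line(qw(prtest.goodline2 pd bib testuser google));

# Stand-in for values a test would fetch back from the database.
my @row = (2, 1);
my $matches = $row[0] == EXAMPLE_ATTR_ID && $row[1] == EXAMPLE_REASON_ID;
print $matches ? "row matches\n" : "row differs\n";
```

Named constants make an assertion such as a comparison against a bare list of numbers easier to read and to update in one place.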
I don't think I went ahead and added the gfv tests. I also did some other refactoring for readability and work on the way towards the desired behavior w/ access profile for harvard material.
Verified the constant checks for database error don't do anything -- if there is a database error, DBI raises an exception itself rather than returning, at least in the way we're using it.
* Avoid explicit error checking -- if there is an error, the DBI functions themselves raise an exception.
* Prepare statements for things previously using selectrow_arrayref / selectcol_arrayref in populate_rights_data.pl
See listed TODOs and FIXMEs.
https://metacpan.org/pod/DBI#finish says: "When all the data has been fetched from a SELECT statement, the driver will automatically call finish for you. So you should not call it explicitly except when you know that you've not fetched all the data from a statement handle and the handle won't be destroyed soon." Generally we're fetching one row from things that should return one or zero rows, so there isn't a need to use it.
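A sketch of the resulting query style, assuming a handle opened with `RaiseError => 1`: one value is fetched with `selectrow_array` and placeholders, with no explicit error check and no `finish`. The table and column names are assumptions, and `FakeHandle` is a made-up stand-in so the example runs without a database; it mimics only the `selectrow_array($sql, \%attr, @bind)` call.

```perl
use strict;
use warnings;

# With a real connection the handle would come from something like:
#   my $dbh = DBI->connect($dsn, $user, $password, { RaiseError => 1 });

# Fetch a single value; returns undef when no row matches.
sub current_source {
    my ($dbh, $namespace, $id) = @_;
    my ($source) = $dbh->selectrow_array(
        'SELECT source FROM rights_current WHERE namespace = ? AND id = ?',
        undef, $namespace, $id,
    );
    return $source;
}

# Made-up stand-in for a database handle, for illustration only.
package FakeHandle {
    sub new {
        my ($class, %rows) = @_;
        return bless { rows => {%rows}, broken => 0 }, $class;
    }

    sub selectrow_array {
        my ($self, $sql, $attr, @bind) = @_;
        # Simulates what RaiseError does on a failed query.
        die "simulated database error\n" if $self->{broken};
        my $row = $self->{rows}{ join '.', @bind };
        return $row ? @$row : ();
    }
}

my $dbh     = FakeHandle->new('prtest.goodline1' => [1]);
my $found   = current_source($dbh, 'prtest', 'goodline1');
my $missing = current_source($dbh, 'prtest', 'nosuchitem');

$dbh->{broken} = 1;
my $ok = eval { current_source($dbh, 'prtest', 'goodline1'); 1 };
print "query failed: $@" unless $ok;
```

The caller decides where to catch the exception; the query code itself stays free of return-value checks.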
What's here now passes the tests and I believe implements the requirement around using the 'open' access profile for Harvard, but there are several things we should fix while we have the opportunity; having the tests enables refactoring. I went ahead and removed the explicit error checking for database operations, because I don't think it was doing anything. I set this to draft for now; there are a few things I'd like to do:
If it makes sense to extract some objects in the course of doing that I think there's no reason to hold off.
This also has the risk of becoming a bigger/long running PR, and it might make sense to draw a line at adding the tests and doing a little bit of the cleanup/refactoring before moving on to the support for the Harvard access profiles. @moseshll let me know what you think; again, I can take care of that when I'm back.