Skip to content

DO NOT MERGE! No more deadlocks#424

Closed
wolfsage wants to merge 5 commits intoandk:masterfrom
wolfsage:no-more-deadlocks
Closed

DO NOT MERGE! No more deadlocks#424
wolfsage wants to merge 5 commits intoandk:masterfrom
wolfsage:no-more-deadlocks

Conversation

@wolfsage
Copy link
Collaborator

Fixes #406

This cannot be done until the database is cleaned up of duplicates however. (Soon, I hope!)

To use this, we'll first have to do:

ALTER TABLE perms MODIFY `package` char(245) NOT NULL DEFAULT '';
ALTER TABLE packages MODIFY `package` varchar(128) NOT NULL DEFAULT '';
ALTER TABLE primeur MODIFY `package` char(245) NOT NULL DEFAULT ''; 

This avoids deadlocks that are very easy to trigger right now.

The problem is that to update or delete a row related to a
package, we must do:

    where LOWER(package) = LOWER(?)

And what this means is that mysql *cannot* use an index to resolve
the query, so it scans the *entire table*.

This is slow, and also causes lock waits in simple situations:

  1. You have two processes, A1 and B1
  2. A1 begins a transaction and does:

    INSERT INTO primeur VALUES('pac::kage', 'testuser');

  3. B1 does, *outside of a transaction*:

    DELETE FROM primeur WHERE LOWER(package) = LOWER('foo::bar');

At this point, B1 is stuck waiting on a lock!

By switching to using case insensitive keys, we can do index lookups:

    DELETE FROM primeur WHERE package = 'foo::BaR';

and no weird lock waits occur, and we avoid the deadlocks.

Case sensitive collation on the 'package' column made sense back
when we imagined a world where different case files had different
meanings, but we decided against that more recently, and this will
help further enforce that.
Otherwise, we could (historically) deadlock when trying to
delete from primeur *after* we've already deleted from perms,
and folks would lose comaint permissions, or other weird cases
could happen

By bringing this logic into the transaction, we should avoid
weird broken indexing.

Surely not doing this in a transaction was a mistake when this
code was introduced.
The new definition of package in primeur, perms, and packages
is case insensitive in mysql, so it needs to be case insensitive
in sqlite too, for testing.

This is ugly, but what can you do?
Now that the package column is case-insensitive and unique,
you can't *have* duplicate mixed case entries, so remove a
test that relies on that behaviour.
Since collation has changed from latin1_bin to latin1_swedish_ci
(the default - case insensitive), the order of things out of
the database is slightly different, so we have to adjust our
tests to account for that.
@wolfsage wolfsage changed the title No more deadlocks DO NOT MERGE! No more deadlocks Apr 30, 2023
@andk
Copy link
Owner

andk commented May 20, 2023

Hi Matthew,
could you please consider the scenario that people change case within their releases over time? Will the code in the MR properly deal with this? I believe proper handling of this would be a rejection. From reading the idea to make the package column case-insensitive I get the impression that mysql might lose control of the correct case sensitive package names. The idea came up while talking with @neilb about RLAUGHLIN/UDPmsg-0.10.tar.gz vs RLAUGHLIN/UDPmsg-0.11.tar.gz and about SHERZODR/CGI-Session-3.95 vs MARKSTOS/CGI-Session-4.48. There are more such examples in the history of the CPAN, where individual files within a package changed case. Breaks code on case-sensitive file systems. It follows quite naturally from working on a case insensitive filesystem, that developers make such changes, but it is something a repository like CPAN should politely not support.

I hope I'm not opening a can of worms with this question.

@neilb
Copy link
Collaborator

neilb commented May 20, 2023

I think we should allow / support people changing the case of their modules. Most of the examples I've seen have been internal modules, not really intended for direct public use. But I'm sure there will have been cases where people have changed the case of public-facing modules.

We already support changing case of modules in the indexer – the canonicaliser makes sure that all permissions are switched to the new casing. This works right now, I'm pretty sure, it's just historical cases that we're clearing up. The main thing is to ensure @wolfsage's change doesn't break that (I've not looked at it).

Yes this can cause problems, but it's a natural thing for people to do. For example, where there's a large project with multiple contributors, and someone adds a module that doesn't following their naming convention, so at some later date someone wants to bring it in line.

It's worth a blog post talking about the problems this can cause, so people are aware, and it would be even cooler if PAUSE could spot when this happens and direct people to the blog post (though most people don't read the emails from PAUSE, so shrug).

@wolfsage
Copy link
Collaborator Author

I can't see how this change would cause that problem but I suggest we (I) write a test case if one doesn't already exist to account for this before merging.

@andk
Copy link
Owner

andk commented Apr 26, 2024

Thanks, we let this be superceded by #440 but we cherry-picked the commit "mldistwatch: Rewrite perms/primeur in the transaction" from here. Deployed a few minutes ago.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pause Deadlocks/database timeouts

3 participants