Skip to content

HABTM to HMT for prototype taxons#327

Merged
jordan-brough merged 7 commits into
solidusio:masterfrom
BenMorganIO:remove-habtm-for-taxons-prototypes
Nov 2, 2015
Merged

HABTM to HMT for prototype taxons#327
jordan-brough merged 7 commits into
solidusio:masterfrom
BenMorganIO:remove-habtm-for-taxons-prototypes

Conversation

@BenMorganIO

Copy link
Copy Markdown
Contributor

No description provided.

@BenMorganIO BenMorganIO mentioned this pull request Aug 26, 2015
7 tasks
@BenMorganIO BenMorganIO changed the title promotion rule taxons HABTM to HMT for promotion rule taxons Aug 26, 2015

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you want spree_taxons_promotion_rules here right? also, you'll add a primary key as well yeah?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lols whoops...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I cherry-picked the wrong commit. If you check the branch name, this was actually correct. Through an interactive rebase, I've removed the commits that aren't specific to this PR. Apologies.

@gmacdougall

Copy link
Copy Markdown
Member

This is looking good. Could you please add a note to the Changelog, so we can let people know about this one?

@jordan-brough jordan-brough changed the title HABTM to HMT for promotion rule taxons HABTM to HMT for prototype taxons Oct 15, 2015

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to cascade deletes by default at the database level? I'm not sure one way or the other but would like to know what others think.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like our DB maintaining referential integrity. It does a much better job of it than ActiveRecord. I also feel that cascade is the correct on_delete to use here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pro referential integrity. My only question is on the cascade. There might be some benefits to having the cascades be done in activerecord? I'm not sure.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only loss I can think of is possibly no after_destroy/before_destroy callback. Not sure if we get to keep that.

@BenMorganIO

Copy link
Copy Markdown
Contributor Author

@gmacdougall some of the DB changes are already in master. I agree on adding it to the Changelog. I'm thinking that I'll do one bullet point notifying the db updates for HTM instead of HABTM and then sub bullet points for all the details.

Comment thread CHANGELOG.md Outdated

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to add a note about referential integrity here too. But seeing that Prototype <=> Taxon and Product <=> PromotionRule doesn't have it, then it might not be appropriate just yet.

@jordan-brough

Copy link
Copy Markdown
Contributor

Note: Github has hidden the comment thread but I'm still interested in people's thoughts on the "cascade" question. I don't have a super strong opinion but I wonder if we shouldn't keep the responsibility for cascades in Rails-land intead of DB-land. Having foreign keys is great, it's just the cascade part that I'm wondering about.

@BenMorganIO

Copy link
Copy Markdown
Contributor Author

@jordan-brough yeah I know. The pro here is that you get to kill the 0.2ms of time spent in a callback with rails searching for the values. Instead, PG just does it for you. So, if you have ~1000 associations, deletion is going to be a lot faster.

My main worry is the AR callbacks. Those nice hookable points that tend to get abused. After some reading, I don't believe they get called when you do taxon.destroy!. Why? Well...

Lets say you have a Taxon associated to 10 prototypes. If you perform taxon.destroy! then the AR hooks will run since a ruby object is loaded. Since there's referential integrity, the 10 PrototypeTaxon rows will be deleted. Since you don't load up the 10 ProtyotypeTaxon objects, then no AR callback is going to happen.

Wait, what if a taxon has 1000 prototypes? Uh oh... That on_delete: :cascade is looking mighty nice. But, if you have an after_destroy: :some_cleanup_method then it wont' get called.

I think they're fairly safe TBH. I don't recommend using AR callbacks unless you have a good reason to. They tend to cause bugs down the road.

What I'm also concerned about is counter caches, but there's none here, so I'm not going to fret too much about it.

@gmacdougall

Copy link
Copy Markdown
Member

I'm a big fan of having your database maintain referential integrity, because (in good databases) there's no way to bypass it. You can trust the database. Rails callbacks and validations can be easily bypassed.

If there's a situation where it's really important that we don't just cascade the things away, we should use a more appropriate on delete restriction for the foreign key (like restrict). Make sure rails does the dirty work, then let it destroy the record.

@jordan-brough

Copy link
Copy Markdown
Contributor

I'm a big fan of having your database maintain referential integrity, because (in good databases) there's no way to bypass it.

@gmacdougall I think most people here agree with you on that. The question (for me at least) isn't about whether to use the database to maintain referential integrity, it's whether to automatically cascade deletes. You don't get any extra integrity by adding "cascade", you just make it potentially faster and/or easier to work with the DB.

And DB cascades would skip AR callbacks right? That might be fine, but seems like something to consider. In the chatroom we kind of came to a consensus of we're not sure so let's start with just introducing foreign keys to Solidus and then we can think about cascades later. Do you think we should revisit that?

@gmacdougall

Copy link
Copy Markdown
Member

I do feel that cascade is the appropriate thing to do in this case. I don't feel that many-to-many join tables should contain logic as heavy as ActiveRecord callbacks.

We can note in the documentation for the class that these objects are subject to deletion through cascades and as such, callbacks should be written on one of the classes on either side of the many-to-many relationship.

@jordan-brough

Copy link
Copy Markdown
Contributor

So we'd set a tentative policy for now of only adding cascades to lightweight many-to-many join tables and we'll add prominent yardoc notes to those classes when we do so, right? I think that could be a good experiment, especially on these brand new join tables where we're sure that nobody is hooking into them yet.

@jhawthorn? @BenMorganIO? Others?

@BenMorganIO

Copy link
Copy Markdown
Contributor Author

Thoughts

@jordan-brough I think that's a good compromise. I do feel AR is handling a tiny bit too much now that there's referential integrity. I'm 👍 on making sure on not just documenting the cascades, but the referential integrity, when it exists, and under what conditions.

We can probably sit back and wait for the community to give us their thoughts. Since there's referential integrity in rails now, I'm curious as to how the community is going to move towards it and respond to it. We've already hit an odd spot.

I feel like we're tip toeing around the API design of AR. Maybe we should ask an AR contributor their thoughts? Mainly because AR + RI might not be mixing so well. (Or maybe it is mixing and we're missing something or it is "mixable" and Rails is missing something.)

We're all trying really hard to find that "sweet spot". I think we're getting there, but I feel like AR still needs to evolve more to help with that.

Moving forward:

  • Document what we're doing with referential integrity so no one is caught off guard.
  • Use referential integrity in new join tables or HABTM models gone HMT.
  • Wait on community feedback?

Example App

There's two things I'm worried about:

  1. before_destroy/after_destroy callbacks; and
  2. counter_cache

I feel like some devs will want to implement one of these two in the future. Since no dev has made a PR to Solidus to move the HABTM => HMT then its safe to say ensuring callbacks is probably not that important. If it was, where are the PRs to extend the models? I added the PRs mainly for apartment support.

Now, there's two SQL calls that will happen in the traditional rails sense. Lets assume there's 10 Authors for a post. Here's the example app: post_author_app

Post.first.destroy!
  Post Load (0.2ms)  SELECT  "posts".* FROM "posts"  ORDER BY "posts"."id" ASC LIMIT 1
   (0.1ms)  begin transaction
  AuthorPost Load (0.2ms)  SELECT "author_posts".* FROM "author_posts" WHERE "author_posts"."post_id" = ?  [["post_id", 1]]
  SQL (0.9ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 1]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 2]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 3]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 4]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 5]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 6]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 7]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 8]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 9]]
  SQL (0.0ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = ?  [["id", 10]]
  SQL (0.1ms)  DELETE FROM "posts" WHERE "posts"."id" = ?  [["id", 1]]
   (0.6ms)  commit transaction

After adding referential integrity and removing the dependent destroy:

2.2.3 :007 > AuthorPost.count
   (0.4ms)  SELECT COUNT(*) FROM "author_posts"
 => 10
2.2.3 :009 > Post.first.destroy!
  Post Load (0.4ms)  SELECT  "posts".* FROM "posts"  ORDER BY "posts"."id" ASC LIMIT 1
   (0.1ms)  BEGIN
  SQL (0.4ms)  DELETE FROM "posts" WHERE "posts"."id" = $1  [["id", 2]]
   (1.2ms)  COMMIT
 => #<Post id: 2, created_at: "2015-10-19 23:28:47", updated_at: "2015-10-19 23:28:47">
2.2.3 :010 > AuthorPost.count
   (0.4ms)  SELECT COUNT(*) FROM "author_posts"
 => 0

Now, with referential integrity AND dependent destroy:

2.2.3 :003 > AuthorPost.count
   (0.4ms)  SELECT COUNT(*) FROM "author_posts"
 => 10
2.2.3 :004 > Post.first.destroy!
  Post Load (0.5ms)  SELECT  "posts".* FROM "posts"  ORDER BY "posts"."id" ASC LIMIT 1
   (0.1ms)  BEGIN
  AuthorPost Load (0.2ms)  SELECT "author_posts".* FROM "author_posts" WHERE "author_posts"."post_id" = $1  [["post_id", 4]]
  SQL (0.2ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 31]]
  SQL (0.2ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 32]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 33]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 34]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 35]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 36]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 37]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 38]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 39]]
  SQL (0.1ms)  DELETE FROM "author_posts" WHERE "author_posts"."id" = $1  [["id", 40]]
  SQL (0.2ms)  DELETE FROM "posts" WHERE "posts"."id" = $1  [["id", 4]]
   (1.1ms)  COMMIT
 => #<Post id: 4, created_at: "2015-10-19 23:31:18", updated_at: "2015-10-19 23:31:18">
2.2.3 :005 > AuthorPost.count
   (0.5ms)  SELECT COUNT(*) FROM "author_posts"
 => 0

This all looks great. All the ruby objects were loaded up :D. This is really good news. This means that AR callbacks get called and counter caches are updated :).

And what about #delete?

2.2.3 :006 > post = Post.create! authors: Author.all
 => #<Post id: 7, created_at: "2015-10-19 23:34:37", updated_at: "2015-10-19 23:34:37">
2.2.3 :014 > AuthorPost.count
   (0.4ms)  SELECT COUNT(*) FROM "author_posts"
 => 10
2.2.3 :015 > post.delete
  SQL (1.5ms)  DELETE FROM "posts" WHERE "posts"."id" = $1  [["id", 7]]
 => #<Post id: 7, created_at: "2015-10-19 23:34:37", updated_at: "2015-10-19 23:34:37">
2.2.3 :016 > AuthorPost.count
   (0.4ms)  SELECT COUNT(*) FROM "author_posts"
 => 0

It looks like the delete call also helps enforce the referential integrity and removes the data appropriately :).

However, in SQLite, foreign keys and referential integrity is a no-op (see foreigner docs which the rails feature is based on) so adding dependent: :destroy is still needed.

Conclusion

Since referential integrity doesn't work in SQLite, then we need to add dependent: :destroy. If you're using PG or MySQL, then you'll be happy to know that the callbacks will still be supported (phew) and when you do a #delete the data is still removed.

Question

How does this conclusion compare against the bullet points for Thoughts at the beginning of the post?

@jordan-brough

Copy link
Copy Markdown
Contributor

Hm, I didn't realize that Rails doesn't support SQLite foreign keys right now.

That makes me think we should hold off on cascading deletes for now. (Unless we wanted to drop support for SQLite.) Adding the foreign keys themselves seems OK because it's mostly functioning as a safety measure.

@BenMorganIO

Copy link
Copy Markdown
Contributor Author

@jordan-brough My conclusion is to add foreign keys with cascading deletes and add the dependent destroys. Everything is retained and foreign key support is added.

@jordan-brough

Copy link
Copy Markdown
Contributor

@BenMorganIO we chatted about this in the core meeting yesterday and the consensus was let's omit the cascading for now and just get the foreign keys in.

I think that means this PR is good to go. @gmacdougall looking at the MySQL and Postgres docs it seems like we don't need to specify "restrict". In MySQL "restrict" is equivalent to "no action" and in Postgres "restrict" would just means the check cannot be deferred until later in a transaction (e.g. this). Does that sound right?

@jhawthorn

Copy link
Copy Markdown
Contributor

Looks good to me 👍

@jordan-brough ?

@jordan-brough

Copy link
Copy Markdown
Contributor

👍

jordan-brough added a commit that referenced this pull request Nov 2, 2015
…otypes

HABTM to HMT for prototype taxons
@jordan-brough jordan-brough merged commit 09e1a82 into solidusio:master Nov 2, 2015
@BenMorganIO BenMorganIO deleted the remove-habtm-for-taxons-prototypes branch November 2, 2015 22:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants