Skip to content

HITS primitive based implementation#1898

Merged
rapids-bot[bot] merged 10 commits intorapidsai:branch-21.12from
kaatish:hits-prim-based
Nov 8, 2021
Merged

HITS primitive based implementation#1898
rapids-bot[bot] merged 10 commits intorapidsai:branch-21.12from
kaatish:hits-prim-based

Conversation

@kaatish
Copy link
Collaborator

@kaatish kaatish commented Oct 20, 2021

Depends on #1903. Fixes #1657

@kaatish kaatish requested review from a team as code owners October 20, 2021 18:10
@kaatish kaatish marked this pull request as draft October 20, 2021 18:11
@BradReesWork BradReesWork added DRAFT improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 20, 2021
@BradReesWork BradReesWork added this to the 21.12 milestone Oct 20, 2021
@kaatish kaatish self-assigned this Oct 24, 2021
@codecov-commenter
Copy link

codecov-commenter commented Oct 25, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.12@61950dd). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 1075226 differs from pull request most recent head ccf5b78. Consider uploading reports for the commit ccf5b78 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.12    #1898   +/-   ##
===============================================
  Coverage                ?   25.06%           
===============================================
  Files                   ?        6           
  Lines                   ?      371           
  Branches                ?        0           
===============================================
  Hits                    ?       93           
  Misses                  ?      278           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 61950dd...ccf5b78. Read the comment docs.

@BradReesWork
Copy link
Member

rerun tests

@kaatish kaatish removed the DRAFT label Oct 27, 2021
@kaatish kaatish marked this pull request as ready for review October 27, 2021 19:44
std::tuple<rmm::device_uvector<result_t>, // hubs
rmm::device_uvector<result_t>, // authorities
result_t, // error
size_t> // iteration count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChuckHastings @aschaffer
So, if HITS returns results in rmm::device_uvector, can we develop a C interface for this? (or should we better assume that we take pointers to the result arrays similar to other algorithms?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we may discuss a common format (define a struct for every algorithm?) to return auxiliary information instead of just using a tuple.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can. rmm::device_uvector has a release method which returns a device_buffer. The device_buffer is the basic construct that we can pass across the C API because it is type erased.

My thoughts had been to let the C++ API use all of the C++17 features we like (return the tuple if that makes sense), and leave it to the C API implementation to do whatever translation is required.

Perhaps a discussion on a common format would be more appropriate in #1907

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typed must be erased at the C-API level. So, the way to do it is via a visitor which wraps this call. This call can return whatever makes sense in the C++ ecosystem (device_uvector<T>). But the visitor would wrap that and translate the return into a device_buffer by doing a device_vector<T>::release().

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BradReesWork
Copy link
Member

rerun tests

std::tuple<rmm::device_uvector<result_t>, // hubs
rmm::device_uvector<result_t>, // authorities
result_t, // error
size_t> // iteration count
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typed must be erased at the C-API level. So, the way to do it is via a visitor which wraps this call. This call can return whatever makes sense in the C++ ecosystem (device_uvector<T>). But the visitor would wrap that and translate the return into a device_buffer by doing a device_vector<T>::release().

Copy link
Collaborator

@ChuckHastings ChuckHastings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks correct. But at this point the implementation isn't tied in (although the PR is still marked in-progress, so that's not a problem yet).

Be sure to instantiate hits for SG and MG in separate files - as Seunghwa mentioned in his comments.

It would be nice to have an MG test also in this PR, but if that has to be deferred until later, I'm OK with adding an issue to the backlog to add an MG test for hits and we can address that in a separate PR.

@BradReesWork
Copy link
Member

rerun tests

@BradReesWork
Copy link
Member

rerun tests

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in overall but I have some minor complaints about style & naming. I will also think about better names for few variables I complained.

@kaatish kaatish requested a review from seunghwak November 5, 2021 18:13
Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!!!

@seunghwak
Copy link
Contributor

seunghwak commented Nov 8, 2021

Depends on #1903. Fixes #1657

You need to update this comment.

@ChuckHastings
Copy link
Collaborator

@gpucibot merge

@rapids-bot rapids-bot bot merged commit b3d14d8 into rapidsai:branch-21.12 Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improvement / enhancement to an existing function non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ENH] Implement HITS using the graph primitives

6 participants