HITS primitive based implementation #1898
rapids-bot[bot] merged 10 commits into rapidsai:branch-21.12
Codecov Report
@@ Coverage Diff @@
## branch-21.12 #1898 +/- ##
===============================================
Coverage ? 25.06%
===============================================
Files ? 6
Lines ? 371
Branches ? 0
===============================================
Hits ? 93
Misses ? 278
Partials ? 0
rerun tests
cpp/src/link_analysis/hits_impl.cuh
Outdated
std::tuple<rmm::device_uvector<result_t>,  // hubs
           rmm::device_uvector<result_t>,  // authorities
           result_t,                       // error
           size_t>                         // iteration count
@ChuckHastings @aschaffer
So, if HITS returns results in rmm::device_uvector, can we develop a C interface for this? (Or would it be better to take pointers to the result arrays, as other algorithms do?)
And we may discuss a common format (define a struct for every algorithm?) to return auxiliary information instead of just using a tuple.
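To make the struct-versus-tuple question concrete, here is a minimal host-only sketch of what a named result type could look like. Everything in it is an assumption for illustration: `hits_result_t`, `hits_as_tuple` and `hits_as_struct` are hypothetical names, and `std::vector` stands in for `rmm::device_uvector` so the sketch builds without CUDA or RMM. It is not the format adopted in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical named result type; std::vector stands in for rmm::device_uvector.
template <typename result_t>
struct hits_result_t {
  std::vector<result_t> hubs;
  std::vector<result_t> authorities;
  result_t error{};
  std::size_t iteration_count{};
};

// Hypothetical function shaped like the tuple return under review.
// The body is a placeholder and does not compute HITS.
template <typename result_t>
std::tuple<std::vector<result_t>, std::vector<result_t>, result_t, std::size_t>
hits_as_tuple(std::size_t num_vertices)
{
  std::vector<result_t> hubs(num_vertices, result_t{1});
  std::vector<result_t> authorities(num_vertices, result_t{1});
  return std::make_tuple(std::move(hubs), std::move(authorities), result_t{0}, std::size_t{0});
}

// Adapter that unpacks the tuple into named fields, so callers do not
// depend on element order.
template <typename result_t>
hits_result_t<result_t> hits_as_struct(std::size_t num_vertices)
{
  auto [hubs, authorities, error, iterations] = hits_as_tuple<result_t>(num_vertices);
  assert(hubs.size() == authorities.size());
  return hits_result_t<result_t>{std::move(hubs), std::move(authorities), error, iterations};
}
```

With a struct, a caller reads `result.iteration_count` instead of `std::get<3>(result)`, and a new auxiliary field can be added later without changing the position of existing ones.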
We can. rmm::device_uvector has a release method which returns a device_buffer. The device_buffer is the basic construct that we can pass across the C API because it is type erased.
My thoughts had been to let the C++ API use all of the C++17 features we like (return the tuple if that makes sense), and leave it to the C API implementation to do whatever translation is required.
Perhaps a discussion on a common format would be more appropriate in #1907
Types must be erased at the C-API level. So, the way to do it is via a visitor which wraps this call. This call can return whatever makes sense in the C++ ecosystem (device_uvector<T>). But the visitor would wrap that and translate the return into a device_buffer by doing a device_uvector<T>::release().
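A rough host-only sketch of that wrapping idea follows. It uses a plain runtime dispatch rather than the actual visitor machinery, and every name in it is hypothetical: `erased_buffer` stands in for `rmm::device_buffer`, `typed_vector` for `rmm::device_uvector<T>`, and `typed_algorithm` for the typed C++ call. The stand-in `release()` copies bytes for simplicity, whereas the real one hands over ownership of the allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical type-erased buffer (stand-in for rmm::device_buffer): bytes only.
struct erased_buffer {
  std::vector<unsigned char> bytes;
  std::size_t size() const { return bytes.size(); }
};

// Hypothetical typed container (stand-in for rmm::device_uvector<T>).
template <typename T>
class typed_vector {
 public:
  explicit typed_vector(std::vector<T> v) : data_(std::move(v)) {}

  // Give up the contents as untyped bytes and leave this object empty.
  // This sketch copies; it only mimics the interface of a real release().
  erased_buffer release()
  {
    erased_buffer out;
    out.bytes.resize(data_.size() * sizeof(T));
    if (!data_.empty()) { std::memcpy(out.bytes.data(), data_.data(), out.bytes.size()); }
    data_.clear();
    return out;
  }

 private:
  std::vector<T> data_;
};

// Placeholder for a typed C++ algorithm that returns a typed container.
template <typename T>
typed_vector<T> typed_algorithm(std::size_t n)
{
  return typed_vector<T>(std::vector<T>(n, T{1}));
}

// Runtime tag a C caller could pass instead of a template argument.
enum class dtype { float32, float64 };

// Wrapper a C API could sit on: pick the type at run time, run the typed
// call, then erase the type before returning.
inline erased_buffer erased_algorithm(dtype t, std::size_t n)
{
  switch (t) {
    case dtype::float32: return typed_algorithm<float>(n).release();
    case dtype::float64: return typed_algorithm<double>(n).release();
  }
  assert(false && "unknown dtype");
  return {};
}
```

The C++ side keeps its typed return value, and only the wrapper knows how to translate it into something a C interface can carry.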
seunghwak left a comment
And we need explicit instantiation files as well.
See
https://github.com/rapidsai/cugraph/blob/branch-21.12/cpp/src/centrality/katz_centrality_mg.cu
and
https://github.com/rapidsai/cugraph/blob/branch-21.12/cpp/src/centrality/katz_centrality_sg.cu
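For readers unfamiliar with the pattern, here is a minimal sketch of explicit instantiation, collapsed into a single translation unit so it compiles on its own. The function `max_value` and its `multi_gpu` flag are hypothetical; the assumption is that the linked Katz files follow the same idea, with the template definition in a shared header and the SG and MG instantiations in separate source files.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Would live in a shared implementation header: the template definition.
// multi_gpu is a hypothetical compile-time flag separating SG from MG builds.
template <typename result_t, bool multi_gpu>
result_t max_value(std::vector<result_t> const& values)
{
  assert(!values.empty());
  result_t best = values[0];
  for (std::size_t i = 1; i < values.size(); ++i) {
    if (values[i] > best) { best = values[i]; }
  }
  return best;
}

// Would live in a single-GPU source file: explicit instantiations for SG.
template float max_value<float, false>(std::vector<float> const&);
template double max_value<double, false>(std::vector<double> const&);

// Would live in a multi-GPU source file: explicit instantiations for MG.
template float max_value<float, true>(std::vector<float> const&);
template double max_value<double, true>(std::vector<double> const&);
```

Splitting the instantiations across files lets the SG and MG variants compile in parallel and keeps each translation unit smaller.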
rerun tests
The approach looks correct. But at this point the implementation isn't tied in (although the PR is still marked in-progress, so that's not a problem yet).
Be sure to instantiate hits for SG and MG in separate files - as Seunghwa mentioned in his comments.
It would be nice to have an MG test also in this PR, but if that has to be deferred until later, I'm OK with adding an issue to the backlog to add an MG test for hits and we can address that in a separate PR.
rerun tests
rerun tests
seunghwak left a comment
Looks good overall, but I have some minor complaints about style and naming. I will also think about better names for the few variables I complained about.
@gpucibot merge
Depends on #1903. Fixes #1657