[FEA] Added Sorensen algorithm to Python API #1820
rapids-bot[bot] merged 15 commits into rapidsai:branch-21.10
Codecov Report
@@               Coverage Diff                @@
##           branch-21.10    #1820      +/-   ##
================================================
+ Coverage         59.85%   69.60%    +9.74%
================================================
  Files                77      143      +66
  Lines              3547     8672    +5125
================================================
+ Hits               2123     6036    +3913
- Misses             1424     2636    +1212
rlratzel left a comment:
Looks OK. The only real issue is the pandas import, which I mention below. I'm also wondering whether there's a better way to test, since there's a lot of repetition and calling Nx may not be necessary (that way it seems to mostly test Jaccard).
    G, isNx = check_nx_graph(G)

    if isNx is True and ebunch is not None:
        vertex_pair = cudf.from_pandas(pd.DataFrame(ebunch))
This conversion seems odd. Can you just pass the ebunch directly to cudf?
That conversion is indeed odd. My cuGraph Sorensen implementation mirrors the existing Jaccard one, which has a few more dubious calls that I didn't pay attention to. I tried to keep Sorensen consistent with Jaccard. I am refactoring Jaccard and overlap as well.
rerun tests
… remove unnecessary block, use RAPIDS_DATASET_ROOT_DIR_PATH instead
rerun tests
@gpucibot merge
Add a Python implementation of the Sorensen and wSorensen algorithms, adapted from the existing Jaccard implementation
Add tests for both algorithms
Since NetworkX currently has no Sorensen implementation, the tests convert the NetworkX Jaccard results to Sorensen and compare them with the cuGraph Sorensen results
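
The Jaccard-to-Sorensen conversion used by the tests can be illustrated with a minimal sketch. It assumes the standard set definitions J = |A∩B| / |A∪B| and S = 2|A∩B| / (|A| + |B|), which give S = 2J / (1 + J). The function names and the toy adjacency sets below are hypothetical and are not taken from the cuGraph test suite; the sketch uses plain Python sets rather than NetworkX or cuGraph calls.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient of two neighbor sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sorensen(a: set, b: set) -> float:
    """Sorensen coefficient of two neighbor sets: 2|A & B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def jaccard_to_sorensen(j: float) -> float:
    """Convert a Jaccard score to the equivalent Sorensen score."""
    return 2 * j / (1 + j)


# Hypothetical toy graph, stored as vertex -> set of neighbors.
neighbors = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2},
}


def max_conversion_error(adj: dict) -> float:
    """Largest gap between direct and converted Sorensen over all vertex pairs."""
    worst = 0.0
    for u, v in combinations(adj, 2):
        direct = sorensen(adj[u], adj[v])
        converted = jaccard_to_sorensen(jaccard(adj[u], adj[v]))
        worst = max(worst, abs(direct - converted))
    return worst
```

Under these assumptions, a test can compute Jaccard with a reference library, apply `jaccard_to_sorensen` to each score, and compare the result against a native Sorensen implementation within a small floating-point tolerance.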