[REVIEW] Update cugraph_dgl to use the new FeatureStore by VibhuJawa · Pull Request #3143 · rapidsai/cugraph

VibhuJawa · 2023-01-16T20:42:00Z

This PR updates cugraph_dgl to use the new FeatureStore (We see a drastic increase in fetching feature perf, 1900x even with host pytorch Tensors. I expect better perf with UVA tensors which we should be able to do with rmm's managed memory .

Speedup :

1900x faster get_feature_data
20x faster in Heterograph Sampling on obgn-mag

| Implimentation    | N_GPUS | Earlier Time Taken (s) | PR time taken (s) | SpeedUP | Batch Size |
|-------------------|--------|------------------------|-------------------|---------|------------|
| DGL Upstream CUDA | 1      | 0.26                   | NA                | NA      | 100_000    |
| Cugraph Local     | 1      | 2.175                  | 0.499             | 5x      | 100_000    |
| Cugraph Dask      | 1      | 113.511                | 5.20              | 21.73x  | 100_000    |
| Cugraph Dask      | 2      | 106.184                | 5.722             | 18x     | 100_000    |
| Cugraph Dask      | 4      | 102.991                | 6.302             | 16.19x  | 100_000    |

This PR also closes #3146 and #3039 and #3106

Passes cugraph_dgl unit tests
Add tests for dgl_uniform_sampler.py
Verify cugraph_dgl works with obgn-products dataloader
~~Verify cugraph_dgl works with ogbn-papers100M (To verify large scale scaling of the graphstore)~~
Need bigger machine will do in a followup
Add a dataloader based examples to verify E2E data loading works.

To decice:
~~Should we remove all the dead-code related to cugraph_service_store, cugraph_store etc from the cugraph repo ?~~

Benchmarks:

PR:

indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000)
%time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices)
CPU times: user 132 ms, sys: 263 ms, total: 395 ms
Wall time: 14 ms

Mainline:

indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000)
%time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices)
CPU times: user 132 ms, sys: 263 ms, total: 395 ms
Wall time: 27.8 s

codecov-commenter · 2023-01-16T22:41:49Z

Codecov Report

Base: 55.31% // Head: 55.60% // Increases project coverage by +0.29% 🎉

Coverage data is based on head (0e3cc82) compared to base (ea132d3).
Patch coverage: 93.51% of modified lines in pull request are covered.

Additional details and impacted files

@@               Coverage Diff                @@
##           branch-23.02    #3143      +/-   ##
================================================
+ Coverage         55.31%   55.60%   +0.29%     
================================================
  Files               148      143       -5     
  Lines              9423     9318     -105     
================================================
- Hits               5212     5181      -31     
+ Misses             4211     4137      -74

Impacted Files	Coverage Δ
...graph/cugraph/gnn/dgl_extensions/utils/sampling.py	`88.88% <87.50%> (+1.70%)`	⬆️
.../cugraph/gnn/dgl_extensions/dgl_uniform_sampler.py	`95.00% <95.00%> (ø)`
...ugraph/cugraph/gnn/feature_storage/feat_storage.py	`85.10% <100.00%> (ø)`
python/pylibcugraph/pylibcugraph/testing/utils.py
...thon/pylibcugraph/pylibcugraph/testing/__init__.py
...n/pylibcugraph/pylibcugraph/utilities/api_tools.py
...pylibcugraph/pylibcugraph/experimental/__init__.py
python/pylibcugraph/pylibcugraph/__init__.py
python/pylibcugraph/pylibcugraph/_version.py

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

review-notebook-app · 2023-01-17T03:06:13Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

…nto switch_to_fs

rlratzel

Thanks, this is a large PR which I may have to go over again post-merge, but I did find a few minor things now which shouldn't hold up approval.

python/cugraph-dgl/cugraph_dgl/__init__.py

python/cugraph-dgl/cugraph_dgl/convert.py

python/cugraph-dgl/cugraph_dgl/cugraph_storage.py

python/cugraph-dgl/cugraph_dgl/utils/cugraph_storage_utils.py

Co-authored-by: Rick Ratzel <3039903+rlratzel@users.noreply.github.com>

alexbarghi-nv · 2023-01-19T17:53:23Z

/merge

This PR updates `cugraph_dgl` to use the new FeatureStore (We see a drastic increase in fetching feature perf, 1900x even with host pytorch Tensors. I expect better perf with UVA tensors which we should be able to do with `rmm's managed memory` . Speedup : - [x] 1900x faster get_feature_data - [x] 20x faster in Heterograph Sampling on obgn-mag ```md | Implimentation | N_GPUS | Earlier Time Taken (s) | PR time taken (s) | SpeedUP | Batch Size | |-------------------|--------|------------------------|-------------------|---------|------------| | DGL Upstream CUDA | 1 | 0.26 | NA | NA | 100_000 | | Cugraph Local | 1 | 2.175 | 0.499 | 5x | 100_000 | | Cugraph Dask | 1 | 113.511 | 5.20 | 21.73x | 100_000 | | Cugraph Dask | 2 | 106.184 | 5.722 | 18x | 100_000 | | Cugraph Dask | 4 | 102.991 | 6.302 | 16.19x | 100_000 | ``` This PR also closes rapidsai/cugraph#3146 and rapidsai/cugraph#3039 and rapidsai/cugraph#3106 - [x] Passes cugraph_dgl unit tests - [x] Add tests for `dgl_uniform_sampler.py` - [x] Verify cugraph_dgl works with `obgn-products` dataloader <s>Verify cugraph_dgl works with `ogbn-papers100M` (To verify large scale scaling of the graphstore) </s> Need bigger machine will do in a followup - [x] Add a dataloader based examples to verify E2E data loading works. To decice: <s>Should we remove all the dead-code related to cugraph_service_store, cugraph_store etc from the cugraph repo ? </s> Benchmarks: PR: ```python3 indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000) %time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices) CPU times: user 132 ms, sys: 263 ms, total: 395 ms Wall time: 14 ms ``` Mainline: ```python3 indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000) %time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices) CPU times: user 132 ms, sys: 263 ms, total: 395 ms Wall time: 27.8 s ``` Authors: - Vibhu Jawa (https://github.com/VibhuJawa) Approvers: - Rick Ratzel (https://github.com/rlratzel) URL: rapidsai/cugraph#3143

VibhuJawa added 5 commits January 14, 2023 17:56

Adding first version

6147cab

Working sampling unit-test

28929c7

Working sampling unit-test

bdcfbba

Working sampling hetrographs

e00e40f

Working sampling unit-test

fd1f84a

VibhuJawa added breaking Breaking change improvement Improvement / enhancement to an existing function labels Jan 16, 2023

VibhuJawa added 2 commits January 16, 2023 15:32

Fix minor API access error

8a4ff56

Working dataloader

f50ee91

Add scale testing

ba02995

rlratzel mentioned this pull request Jan 17, 2023

Explore cluster configuration for host spilling to allow larger scale benchmarks #3106

Closed

VibhuJawa added 2 commits January 17, 2023 13:31

Added DLGSampler Tests

c4ecb11

switch_to_fs

1a2b870

VibhuJawa changed the title ~~[WIP] Update cugraph_dgl to use the new FeatureStore~~ [REVIEW] Update cugraph_dgl to use the new FeatureStore Jan 17, 2023

Merge branch 'branch-23.02' into switch_to_fs

8bb6ede

VibhuJawa marked this pull request as ready for review January 17, 2023 22:55

VibhuJawa requested review from a team as code owners January 17, 2023 22:55

VibhuJawa added 2 commits January 17, 2023 17:31

Fix pytest

095ab72

Merge branch 'switch_to_fs' of https://github.com/VibhuJawa/cugraph i…

54d2117

…nto switch_to_fs

rlratzel approved these changes Jan 18, 2023

View reviewed changes

VibhuJawa and others added 4 commits January 18, 2023 14:14

Update python/cugraph-dgl/cugraph_dgl/convert.py

7612640

Co-authored-by: Rick Ratzel <3039903+rlratzel@users.noreply.github.com>

Update convert.py

f43e8f5

Fix Style problems

dcccc79

Update convert.py

0e3cc82

rapids-bot bot merged commit a529d76 into rapidsai:branch-23.02 Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[REVIEW] Update cugraph_dgl to use the new FeatureStore#3143

[REVIEW] Update cugraph_dgl to use the new FeatureStore#3143
rapids-bot[bot] merged 17 commits intorapidsai:branch-23.02from
VibhuJawa:switch_to_fs

VibhuJawa commented Jan 16, 2023 •

edited

Loading

Uh oh!

codecov-commenter commented Jan 16, 2023 •

edited

Loading

Uh oh!

review-notebook-app bot commented Jan 17, 2023

Uh oh!

rlratzel left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alexbarghi-nv commented Jan 19, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

VibhuJawa commented Jan 16, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented Jan 16, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

review-notebook-app bot commented Jan 17, 2023

Uh oh!

rlratzel left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alexbarghi-nv commented Jan 19, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

VibhuJawa commented Jan 16, 2023 •

edited

Loading

codecov-commenter commented Jan 16, 2023 •

edited

Loading