Skip to content

[REVIEW] Update cugraph_dgl to use the new FeatureStore#3143

Merged
rapids-bot[bot] merged 17 commits intorapidsai:branch-23.02from
VibhuJawa:switch_to_fs
Jan 19, 2023
Merged

[REVIEW] Update cugraph_dgl to use the new FeatureStore#3143
rapids-bot[bot] merged 17 commits intorapidsai:branch-23.02from
VibhuJawa:switch_to_fs

Conversation

@VibhuJawa
Copy link
Member

@VibhuJawa VibhuJawa commented Jan 16, 2023

This PR updates cugraph_dgl to use the new FeatureStore (We see a drastic increase in fetching feature perf, 1900x even with host pytorch Tensors. I expect better perf with UVA tensors which we should be able to do with rmm's managed memory .

Speedup :

  • 1900x faster get_feature_data
  • 20x faster in Heterograph Sampling on obgn-mag
| Implimentation    | N_GPUS | Earlier Time Taken (s) | PR time taken (s) | SpeedUP | Batch Size |
|-------------------|--------|------------------------|-------------------|---------|------------|
| DGL Upstream CUDA | 1      | 0.26                   | NA                | NA      | 100_000    |
| Cugraph Local     | 1      | 2.175                  | 0.499             | 5x      | 100_000    |
| Cugraph Dask      | 1      | 113.511                | 5.20              | 21.73x  | 100_000    |
| Cugraph Dask      | 2      | 106.184                | 5.722             | 18x     | 100_000    |
| Cugraph Dask      | 4      | 102.991                | 6.302             | 16.19x  | 100_000    |

This PR also closes #3146 and #3039 and #3106

  • Passes cugraph_dgl unit tests
  • Add tests for dgl_uniform_sampler.py
  • Verify cugraph_dgl works with obgn-products dataloader
    Verify cugraph_dgl works with ogbn-papers100M (To verify large scale scaling of the graphstore)
    Need bigger machine will do in a followup
  • Add a dataloader based examples to verify E2E data loading works.

To decice:
Should we remove all the dead-code related to cugraph_service_store, cugraph_store etc from the cugraph repo ?

Benchmarks:

PR:

indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000)
%time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices)
CPU times: user 132 ms, sys: 263 ms, total: 395 ms
Wall time: 14 ms

Mainline:

indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000)
%time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices)
CPU times: user 132 ms, sys: 263 ms, total: 395 ms
Wall time: 27.8 s

@VibhuJawa VibhuJawa added breaking Breaking change improvement Improvement / enhancement to an existing function labels Jan 16, 2023
@codecov-commenter
Copy link

codecov-commenter commented Jan 16, 2023

Codecov Report

Base: 55.31% // Head: 55.60% // Increases project coverage by +0.29% 🎉

Coverage data is based on head (0e3cc82) compared to base (ea132d3).
Patch coverage: 93.51% of modified lines in pull request are covered.

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-23.02    #3143      +/-   ##
================================================
+ Coverage         55.31%   55.60%   +0.29%     
================================================
  Files               148      143       -5     
  Lines              9423     9318     -105     
================================================
- Hits               5212     5181      -31     
+ Misses             4211     4137      -74     
Impacted Files Coverage Δ
...graph/cugraph/gnn/dgl_extensions/utils/sampling.py 88.88% <87.50%> (+1.70%) ⬆️
.../cugraph/gnn/dgl_extensions/dgl_uniform_sampler.py 95.00% <95.00%> (ø)
...ugraph/cugraph/gnn/feature_storage/feat_storage.py 85.10% <100.00%> (ø)
python/pylibcugraph/pylibcugraph/testing/utils.py
...thon/pylibcugraph/pylibcugraph/testing/__init__.py
...n/pylibcugraph/pylibcugraph/utilities/api_tools.py
...pylibcugraph/pylibcugraph/experimental/__init__.py
python/pylibcugraph/pylibcugraph/__init__.py
python/pylibcugraph/pylibcugraph/_version.py

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@VibhuJawa VibhuJawa changed the title [WIP] Update cugraph_dgl to use the new FeatureStore [REVIEW] Update cugraph_dgl to use the new FeatureStore Jan 17, 2023
@VibhuJawa VibhuJawa marked this pull request as ready for review January 17, 2023 22:55
@VibhuJawa VibhuJawa requested review from a team as code owners January 17, 2023 22:55
Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a large PR which I may have to go over again post-merge, but I did find a few minor things now which shouldn't hold up approval.

@alexbarghi-nv
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit a529d76 into rapidsai:branch-23.02 Jan 19, 2023
nv-rliu pushed a commit to nv-rliu/cugraph-gnn that referenced this pull request Jul 15, 2025
This PR updates `cugraph_dgl` to use the new FeatureStore (We see a drastic increase in fetching feature perf, 1900x even with host pytorch Tensors. I expect better perf with UVA tensors which we should be able to do with `rmm's managed memory` .    

Speedup :
- [x] 1900x faster get_feature_data
- [x]  20x faster in Heterograph Sampling on obgn-mag
```md
| Implimentation    | N_GPUS | Earlier Time Taken (s) | PR time taken (s) | SpeedUP | Batch Size |
|-------------------|--------|------------------------|-------------------|---------|------------|
| DGL Upstream CUDA | 1      | 0.26                   | NA                | NA      | 100_000    |
| Cugraph Local     | 1      | 2.175                  | 0.499             | 5x      | 100_000    |
| Cugraph Dask      | 1      | 113.511                | 5.20              | 21.73x  | 100_000    |
| Cugraph Dask      | 2      | 106.184                | 5.722             | 18x     | 100_000    |
| Cugraph Dask      | 4      | 102.991                | 6.302             | 16.19x  | 100_000    |
```

This PR also closes rapidsai/cugraph#3146 and  rapidsai/cugraph#3039  and rapidsai/cugraph#3106 
- [x] Passes cugraph_dgl unit tests
- [x] Add tests for `dgl_uniform_sampler.py`
- [x] Verify cugraph_dgl works with  `obgn-products` dataloader
<s>Verify cugraph_dgl works with  `ogbn-papers100M` (To verify large scale scaling of the graphstore) </s>
Need bigger machine will do in a followup
- [x] Add a dataloader based examples to verify E2E data loading works.

To decice:
<s>Should we remove all the dead-code related to cugraph_service_store, cugraph_store etc from the cugraph repo ? </s> 


Benchmarks:

PR:
```python3
indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000)
%time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices)
CPU times: user 132 ms, sys: 263 ms, total: 395 ms
Wall time: 14 ms
```

Mainline:
```python3
indices=np.random.randint(0,cugraph_g.num_nodes(),size=20_000)
%time output = cugraph_g.get_node_storage(key='feat',ntype='_N').fetch(indices)
CPU times: user 132 ms, sys: 263 ms, total: 395 ms
Wall time: 27.8 s
```

Authors:
  - Vibhu Jawa (https://github.com/VibhuJawa)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai/cugraph#3143
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Breaking change improvement Improvement / enhancement to an existing function

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[IMP] Remove PG from cugraph-dgl

4 participants