Branch 23.06 resolve merge conflict for forward merge #3409

Merged
rapids-bot[bot] merged 28 commits into rapidsai:branch-23.06 from alexbarghi-nv:branch-23.06
Apr 10, 2023

Conversation

@alexbarghi-nv
Member

Resolves merge conflict

jnke2016 and others added 2 commits March 31, 2023 19:19
A CAPI implementation of betweenness centrality is available, and this PR:
1. Implements PLC betweenness centrality by leveraging the CAPI
2. Refactors the Python SG implementation of betweenness centrality to use the PLC implementation
3. Adds a Python MG implementation of betweenness centrality, with tests
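
For readers unfamiliar with the metric, here is a minimal pure-Python sketch of Brandes' algorithm for unweighted graphs. It is not the CAPI, PLC, or cugraph code; the function name and the adjacency-dict input format are assumptions made for illustration only.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an unweighted graph given as {vertex: [neighbors]}.

    Every ordered (source, target) pair is counted, so on an undirected graph
    the values are twice those of the "divide by two" convention.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Forward pass: BFS from s, counting shortest paths and predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Path graph 0 - 1 - 2: only the middle vertex lies between other vertices.
path_scores = betweenness_centrality({0: [1], 1: [0, 2], 2: [1]})
```

On the three-vertex path, only vertex 1 receives a non-zero score, since it sits on the single shortest path between 0 and 2 in both directions.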

closes rapidsai#3145 
closes rapidsai#2605
closes rapidsai#2648 
closes rapidsai#2649
closes rapidsai#2650

Authors:
  - Joseph Nke (https://github.com/jnke2016)
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Rick Ratzel (https://github.com/rlratzel)
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Vibhu Jawa (https://github.com/VibhuJawa)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#2971
@alexbarghi-nv alexbarghi-nv added the feature request (New feature or request) and non-breaking (Non-breaking change) labels Mar 31, 2023
@alexbarghi-nv alexbarghi-nv self-assigned this Mar 31, 2023
VibhuJawa and others added 20 commits April 2, 2023 20:49
This PR adds a working multi-GPU graph (on 2 Dask workers) that is loaded and trained on by multiple PyTorch trainers.  (3)

Todo: 
- [x] Verify works on multiple trainers and multiple dask workers
- [x] Show scaling as you increase training GPUs 

At 1 second we become bottlenecked by the sampling Dask cluster, but we see a perf improvement going from `1 GPU` -> `2 GPUs`.
**On OGBN-Products**
```md
| Number of Training GPUs | Time per epoch |
|-------------------------|----------------|
| 1                       | 2.3 s          |
| 2                       | 0.582 s        |
| 4                       | 0.792 s        |
```
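
The scaling implied by the table can be worked out directly. The snippet below is only arithmetic on the three reported epoch times; the variable and function names are made up for this sketch.

```python
# Epoch times from the table above, keyed by number of training GPUs.
epoch_seconds = {1: 2.3, 2: 0.582, 4: 0.792}


def speedups(times, baseline=1):
    """Return the speedup of each configuration relative to the baseline GPU count."""
    base = times[baseline]
    return {gpus: base / seconds for gpus, seconds in times.items()}


result = speedups(epoch_seconds)
for gpus, factor in result.items():
    print(f"{gpus} GPU(s): {factor:.2f}x")
```

This gives roughly 3.95x at 2 GPUs and 2.90x at 4 GPUs relative to a single GPU, so the reported 4-GPU run is slower than the 2-GPU run, which would be consistent with the sampling bottleneck described above.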

This PR depends upon:  rapidsai#3393
CC: @rlratzel , @alexbarghi-nv , @BradReesWork

Authors:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)

URL: rapidsai#3212
resolves rapidsai#3297 

New website structure with some migrations.

Authors:
  - Don Acosta (https://github.com/acostadon)
  - Brad Rees (https://github.com/BradReesWork)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Alex Barghi (https://github.com/alexbarghi-nv)

URL: rapidsai#3343
…dsai#3289)

Updates tests to support presence of PyTorch and PyG in the environment.

Updates cugraph-pyg to include support for loading from samples on disk. Makes CuGraphStore serializable and allows creating CuGraphStore and BulkSampleLoader instances without the graph structure.

Merge after rapidsai#3288 

Closes rapidsai#3287 
Closes rapidsai#3176

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3289
This PR uses dependencies.yaml to generate the dependency lists in pyproject.toml

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#3355
The `__version__` attribute appears to have been accidentally deleted in rapidsai#2971.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)

URL: rapidsai#3411
…cugraph-pyg (rapidsai#3382)

Resolves an issue where the wrong version of `searchsorted` caused a device to host copy.  Also removes the backend option from `CuGraphStore` entirely to prevent similar bugs from happening in the future and better align cugraph-pyg with the pyg/pytorch ecosystem.

Merge after rapidsai#3289 
Closes rapidsai#2995

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Vibhu Jawa (https://github.com/VibhuJawa)

URL: rapidsai#3382
…#3393)

This PR fixes a bug where output sample batch ids do not match those expected when using the bulk sampler, causing subgraphs that are larger than expected and incorrect.  Without reindexing, the wrong batch ids are assigned to the start vertices.  Reindexing ensures that the same order is preserved for batch ids and start vertices.

This PR also changes the empty dataframe passed to dask in `uniform_neighbor_sample` to match the correct ordering of batch_id and hop_id.  This ensures that the columns are named correctly and are not inadvertently renamed due to them being created in a different order.

This PR is non-breaking because it restores the original behavior of bulk sampling and reverses a bug that was inadvertently introduced with the dask updates.

Resolves rapidsai#3390

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Joseph Nke (https://github.com/jnke2016)

URL: rapidsai#3393
This PR adds an MG implementation of induced subgraph by leveraging the CAPI
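
As a reminder of what the operation computes, here is a tiny single-process sketch over a plain edge list. It is not the MG or CAPI implementation, and the function name and input format are assumptions for illustration.

```python
def induced_subgraph(edges, vertices):
    """Keep only the edges whose source and destination are both in `vertices`."""
    keep = set(vertices)
    return [(src, dst) for src, dst in edges if src in keep and dst in keep]


edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
sub = induced_subgraph(edges, [0, 1, 3])
```

Every edge touching vertex 2 is dropped, because 2 is not in the requested vertex set.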

closes rapidsai#2535 
closes rapidsai#2536

Authors:
  - Joseph Nke (https://github.com/jnke2016)
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3391
### Leiden algorithm 
This PR adds a Leiden implementation using cugraph primitives. It also adds an implementation of `maximal independent set` using cugraph primitives, which is used by the Leiden implementation.
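
For context, a maximal independent set is a set of vertices with no edges between them that cannot be extended by any further vertex. The sequential greedy sketch below only illustrates that definition; it is not the primitive-based algorithm added in this PR, and the names are hypothetical.

```python
def maximal_independent_set(adj):
    """Greedy maximal independent set for a graph given as {vertex: [neighbors]}."""
    selected = set()
    blocked = set()  # vertices already selected or adjacent to a selected vertex
    for v in sorted(adj):
        if v in blocked:
            continue
        selected.add(v)
        blocked.add(v)
        blocked.update(adj[v])
    return selected


# Path graph 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
mis = maximal_independent_set(path)
```

The result depends on the visiting order; a different order can yield a different, equally valid maximal set.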

- It reuses code from the Louvain implementation to find locally optimal moves for vertices.
- Locally optimal movement creates clusters of `Louvain communities`.
- In the refinement phase, which is the main feature of the Leiden algorithm, vertices are moved only within a `Louvain community`, subdividing it into multiple `sub communities` or `Leiden communities`.
- In the graph contraction phase, vertices belonging to a `Leiden community` are merged to become a single node in the aggregated graph.
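
The contraction step in the last bullet can be sketched in a few lines. This is a simplified single-process illustration under the assumption of a weighted edge list and a vertex-to-community mapping; it is not the code in this PR, and `contract` is a hypothetical name.

```python
from collections import defaultdict


def contract(edges, community):
    """Merge vertices by community and sum edge weights between communities.

    `edges` is a list of (src, dst, weight) and `community` maps each vertex
    to its community label. Edges inside a community become self-loops on
    the aggregated node.
    """
    weights = defaultdict(float)
    for src, dst, w in edges:
        weights[(community[src], community[dst])] += w
    return dict(weights)


edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
community = {0: "a", 1: "a", 2: "b", 3: "b"}
aggregated = contract(edges, community)
```

Each community collapses to one node, and the single edge crossing between `a` and `b` survives as the only inter-node edge.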

This code has been tested to work on SG. We aim to test the MG version in the next release.
Future improvements:

- Move vertices randomly instead of greedily in the refinement phase
- Test for MNMG

Authors:
  - Naim (https://github.com/naimnv)
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Seunghwa Kang (https://github.com/seunghwak)

URL: rapidsai#2980
- Adds new examples for cugraph-pyg.
- Removes outdated examples.
- Moves MG scripts to top-level directory.
- Makes the input to `_get_vertex_groups_from_sample` a tensor instead of Series
- Adds `is_sorted` arg to `_get_vertex_groups_from_sample` to skip sorting if tensor already sorted
- Some fixes to `CuGraphStore` for running multi-GPU workflows
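
The `is_sorted` shortcut can be illustrated with a hypothetical stand-in. The sketch below is not `_get_vertex_groups_from_sample`; it assumes that each vertex type owns a contiguous range of IDs starting at a known offset, and it uses plain lists instead of tensors.

```python
from bisect import bisect_left


def group_vertices_by_type(vertex_ids, type_offsets, is_sorted=False):
    """Split vertex IDs into per-type groups.

    `type_offsets` maps a type name to the first ID of its contiguous range
    (an assumption of this sketch). Sorting is skipped when the caller
    promises that the input is already sorted.
    """
    ids = list(vertex_ids) if is_sorted else sorted(vertex_ids)
    types = sorted(type_offsets.items(), key=lambda item: item[1])
    groups = {}
    for i, (name, start) in enumerate(types):
        lo = bisect_left(ids, start)
        # The range of the last type extends to the end of the ID list.
        hi = bisect_left(ids, types[i + 1][1]) if i + 1 < len(types) else len(ids)
        if hi > lo:
            groups[name] = ids[lo:hi]
    return groups


offsets = {"paper": 0, "author": 4}
groups = group_vertices_by_type([7, 1, 4, 0], offsets)
presorted = group_vertices_by_type([0, 1, 4, 7], offsets, is_sorted=True)
```

Both calls produce the same groups; the second one simply avoids the sort, which is the saving the new argument is meant to provide.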

Merge after rapidsai#3288  - merged
Merge after rapidsai#3289  - merged
Merge after rapidsai#3382  - merged

Closes rapidsai#3316 
Closes rapidsai#3226 
Closes rapidsai#3072

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3326
…3300)

The allocator callbacks now live in their own submodules (so that RMM does not, for example, import pytorch unless required) and so must be explicitly imported.
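
The general pattern behind this change, importing an optional integration only when it is asked for, can be sketched with the standard library. `load_optional` is a hypothetical helper and not part of RMM; for RMM the call would presumably target something like `rmm.allocators.torch`, but that module path is an assumption here and is not stated in this PR.

```python
import importlib


def load_optional(module_name, attr):
    """Import `module_name` on demand and return `attr`, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


# Demonstrated with stdlib names so the sketch runs anywhere.
sqrt = load_optional("math", "sqrt")
missing = load_optional("no_such_package_for_demo", "anything")
```

Because the import happens inside the function, nothing heavy is loaded unless the caller explicitly requests it.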

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3300
…rm Neighbor Sample (rapidsai#3416)

Currently, cudf does not merge series properly when they already share an index.  I'm not sure if this is a bug in cudf, or intentional behavior.  This issue does not occur with dask_cudf.  The resolution is to use `cudf.concat` when passing a `cudf.Series` for start vertices and batch ids, and `df.to_frame().merge` when passing in a `dask_cudf.Series` for start vertices and batch ids.

This PR also adds a test that covers both cudf and dask_cudf inputs to catch this sort of problem in the future.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Joseph Nke (https://github.com/jnke2016)

URL: rapidsai#3416
Required to support the upcoming minor release of PyG, which will allow full compatibility with pylibcugraphops.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#3422
Trying to fix doc build issues

Authors:
  - Brad Rees (https://github.com/BradReesWork)

Approvers:
  - Don Acosta (https://github.com/acostadon)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3418
…#3360)

Update cugraph-ops models to use pylibcugraphops 23.04. This PR also supersedes rapidsai#3264 

CC: @MatthiasKohl

Authors:
  - Tingyu Wang (https://github.com/tingyu66)
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3360
This PR pins `dask` and `distributed` to `2023.3.2` and `2023.3.2.1` respectively for the `23.04` release.
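
One way to confirm that an environment matches these pins is a small check with `importlib.metadata`. This is a sketch, not something this PR adds; `check_pins` and `PINS` are hypothetical names, and the lookup function is injectable so the logic can be exercised without either package installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the PR description above.
PINS = {"dask": "2023.3.2", "distributed": "2023.3.2.1"}


def check_pins(pins, get_version=version):
    """Return {package: (expected, found)} for each missing or off-pin package."""
    mismatches = {}
    for package, expected in pins.items():
        try:
            found = get_version(package)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


# Simulated environment where `distributed` has drifted off its pin.
fake_env = {"dask": "2023.3.2", "distributed": "2023.4.0"}
drift = check_pins(PINS, get_version=fake_env.__getitem__)
```

Calling `check_pins(PINS)` with the default lookup inspects the real environment and returns an empty dict when both packages are on their pinned versions.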

xref: rapidsai/cudf#13070

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Rick Ratzel (https://github.com/rlratzel)
  - Joseph (https://github.com/jolorunyomi)

URL: rapidsai#3427
@alexbarghi-nv alexbarghi-nv marked this pull request as ready for review April 6, 2023 15:36
@alexbarghi-nv alexbarghi-nv requested review from a team as code owners April 6, 2023 15:36
@galipremsagar
Contributor

Not sure why, but it seems like there are additional commits in #3408 that aren't showing up here.

@galipremsagar
Contributor

galipremsagar commented Apr 6, 2023

@alexbarghi-nv I think we will need to perform a no-squash merge from branch-23.04 to branch-23.06: https://docs.rapids.ai/maintainers/forward-merger/

I tried doing it, but there are quite a few conflicts that I'm not familiar with. I'm happy to hop on a call with you if you need help.

@alexbarghi-nv
Member Author

I know how to do the merge, just got out of a meeting now. I'll take care of it.

@galipremsagar
Contributor

> I know how to do the merge, just got out of a meeting now. I'll take care of it.

Thanks @alexbarghi-nv !

@alexbarghi-nv alexbarghi-nv requested review from a team as code owners April 6, 2023 18:44
@alexbarghi-nv
Copy link
Member Author

alexbarghi-nv commented Apr 6, 2023

@galipremsagar all the changes should be pulled in now and conflicts resolved.

package-name: cugraph
# Always want to test against latest dask/distributed.
test-before-amd64: "cd ./datasets && bash ./get_test_data.sh && cd - && RAPIDS_PY_WHEEL_NAME=pylibcugraph_cu11 rapids-download-wheels-from-s3 ./local-pylibcugraph-dep && pip install --no-deps ./local-pylibcugraph-dep/*.whl && pip install git+https://github.com/dask/dask.git@2023.3.2 git+https://github.com/dask/distributed.git@2023.3.2.1 git+https://github.com/rapidsai/dask-cuda.git@branch-23.04"
test-before-amd64: "cd ./datasets && bash ./get_test_data.sh && cd - && RAPIDS_PY_WHEEL_NAME=pylibcugraph_cu11 rapids-download-wheels-from-s3 ./local-pylibcugraph-dep && pip install --no-deps ./local-pylibcugraph-dep/*.whl && pip install git+https://github.com/dask/dask.git@main git+https://github.com/dask/distributed.git@main git+https://github.com/rapidsai/dask-cuda.git@branch-23.06"
Contributor

@galipremsagar galipremsagar Apr 6, 2023

Let's unpin dask & distributed in all repos at the same time. Unpinning in just cugraph might have unintended consequences.

Member Author

@galipremsagar should be fixed now

Contributor

@galipremsagar galipremsagar left a comment

Thanks @alexbarghi-nv !
