
Adds fail_on_nonconvergence option to pagerank to provide pagerank results even on non-convergence (#3639)

Merged
rapids-bot[bot] merged 22 commits into rapidsai:branch-23.08 from rlratzel:branch-23.08-python_pagerank_convergence_option
Jun 14, 2023

Conversation

@rlratzel
Contributor

@rlratzel rlratzel commented Jun 6, 2023

closes #3613

Prior to this PR, pagerank raises a RuntimeError if it fails to converge, often because the max_iter param is set too small (intentionally or otherwise). This PR adds the optional parameter fail_on_nonconvergence, which defaults to True (i.e. the current behavior, to ensure backwards compatibility) and allows a caller to run pagerank and get results even if it did not converge. When fail_on_nonconvergence is False, pagerank returns a tuple containing the pagerank results and a bool indicating whether the results converged.
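The return convention described above can be modeled in plain Python. This is an illustrative sketch only, not the cugraph implementation (which runs on the GPU and returns cudf DataFrames); the function name pagerank_demo and the dict-of-neighbors graph representation are made up for the example, while fail_on_nonconvergence, max_iter, and the (result, converged) tuple mirror the PR's API:

```python
# Pure-Python power-iteration model of the new calling convention.
# Hypothetical demo code; only fail_on_nonconvergence / max_iter and the
# (result, converged) return shape are taken from the PR description.
def pagerank_demo(adj, alpha=0.85, max_iter=100, tol=1e-6,
                  fail_on_nonconvergence=True):
    """adj: dict mapping each vertex to a list of out-neighbors."""
    verts = list(adj)
    n = len(verts)
    ranks = {v: 1.0 / n for v in verts}
    converged = False
    for _ in range(max_iter):
        new = {v: (1.0 - alpha) / n for v in verts}
        for v in verts:
            out = adj[v]
            if out:
                share = alpha * ranks[v] / len(out)
                for w in out:
                    new[w] += share
            else:  # dangling vertex: spread its rank uniformly
                for w in verts:
                    new[w] += alpha * ranks[v] / n
        delta = sum(abs(new[v] - ranks[v]) for v in verts)
        ranks = new
        if delta < n * tol:
            converged = True
            break
    if fail_on_nonconvergence:
        if not converged:
            raise RuntimeError("PageRank failed to converge")
        return ranks                 # old-style single result
    return ranks, converged          # new-style (result, converged) tuple
```

With fail_on_nonconvergence=False and a deliberately tiny max_iter (like the max_iter=2 scenario from the linked issue), the caller still gets usable scores along with converged=False instead of an exception.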

…agerank call to not converge yet still return a result with an additional flag indicating if the results converged or not.
@rlratzel rlratzel added the feature request (New feature or request) and non-breaking (Non-breaking change) labels Jun 6, 2023
@rlratzel rlratzel added this to the 23.08 milestone Jun 6, 2023
@rlratzel rlratzel self-assigned this Jun 6, 2023
@rlratzel rlratzel changed the title Adds error_on_nonconvergence option to pagerank to provide pagerank results even on non-convergence Adds fail_on_nonconvergence option to pagerank to provide pagerank results even on non-convergence Jun 6, 2023
ChuckHastings and others added 8 commits June 6, 2023 23:20
…edToConvergeError exception type, adds tests for MG pagerank and personalization options.
…hub.com:rlratzel/cugraph into branch-23.08-python_pagerank_convergence_option
…_algorithms.pxd, adds exceptions module to PLC, remaining updates to PLC and cugraph code for initial passing tests.
@rlratzel rlratzel marked this pull request as ready for review June 12, 2023 21:23
@rlratzel rlratzel requested review from a team as code owners June 12, 2023 21:23
@BradReesWork BradReesWork requested a review from eriknw June 13, 2023 14:40
Member

@VibhuJawa VibhuJawa left a comment


Approving from the Python/Dask cugraph layer.

Contributor

@seunghwak seunghwak left a comment


LGTM except for one additional complaint.

raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, true, multi_gpu> const& graph_view,
std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view,
std::optional<weight_t const*> precomputed_vertex_out_weight_sums,
Contributor


Wouldn't this be better as std::optional<device_span<>>?

Collaborator


Added #3659 to address this in a separate PR.

Collaborator


Nevermind. Addressed in this PR since we had to make other changes.

Contributor

@eriknw eriknw left a comment


For the user-facing API, I wonder whether fail_on_nonconvergence is the clearest and most convenient option:

pagerank(..., max_iter=3, fail_on_nonconvergence=False)

I think I would prefer a more direct, affirmative argument, such as:

pagerank(..., num_iter=3)

Comment on lines +404 to +418
result_tuples = [
    client.submit(convert_to_return_tuple, cp_arrays) for cp_arrays in result
]

wait(cudf_result)
# Convert the futures to dask delayed objects so the tuples can be
# split. nout=2 is passed since each tuple/iterable is a fixed length of 2.
result_tuples = [dask.delayed(r, nout=2) for r in result_tuples]

# Create the ddf and get the converged bool from the delayed objs. Use a
# meta DataFrame to pass the expected dtypes for the DataFrame to prevent
# another compute from determining them automatically.
meta = cudf.DataFrame(columns=["vertex", "pagerank"])
meta = meta.astype({"pagerank": "float64", "vertex": vertex_dtype})
ddf = dask_cudf.from_delayed([t[0] for t in result_tuples], meta=meta).persist()
converged = all(dask.compute(*[t[1] for t in result_tuples]))
Contributor


An alternative implementation to this could be something like:

import operator as op

...

result_tuples = client.map(convert_to_return_tuple, cp_arrays)
meta = cudf.DataFrame(columns=["vertex", "pagerank"])
meta = meta.astype({"pagerank": "float64", "vertex": vertex_dtype})
ddf = dask_cudf.from_delayed(client.map(op.itemgetter(0), result_tuples), meta=meta).persist()
converged = client.submit(all, client.map(op.itemgetter(1), result_tuples)).result()

Member


Oh nice, I did not know we could use op.itemgetter like this. Very cool to learn. Thanks!
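The op.itemgetter pattern from the suggested alternative can be tried without a Dask cluster: itemgetter(0) is just a callable that extracts index 0 from each (result, converged) pair, so mapping it over the collection splits the tuples. A minimal local sketch with plain tuples standing in for the worker futures (the payload values here are made up for illustration):

```python
import operator as op

# Stand-ins for the (partition_result, converged_flag) tuples each Dask
# worker would return; the string payloads are illustrative only.
result_tuples = [(["p0-rows"], True), (["p1-rows"], True), (["p2-rows"], False)]

# op.itemgetter(0) pulls out the result payloads; op.itemgetter(1) the flags.
partitions = list(map(op.itemgetter(0), result_tuples))
converged = all(map(op.itemgetter(1), result_tuples))
```

In the PR's distributed version, the same two itemgetter callables are passed to client.map so the splitting happens on the workers rather than locally.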

@ChuckHastings
Collaborator

/merge

@rapids-bot rapids-bot bot merged commit 5a18cde into rapidsai:branch-23.08 Jun 14, 2023
@rlratzel rlratzel deleted the branch-23.08-python_pagerank_convergence_option branch September 28, 2023 20:42

Labels

feature request (New feature or request), non-breaking (Non-breaking change)


Development

Successfully merging this pull request may close these issues.

Choose small max_iter parameter like 2

7 participants