Random Walks - Python Bindings by jnke2016 · Pull Request #1516 · rapidsai/cugraph

jnke2016 · 2021-04-07T01:40:09Z

Python bindings for random walks
closes #1488
Check the rendering after the PR is merged to make sure everything renders as expected.

codecov-io · 2021-04-07T03:30:43Z

Codecov Report

Merging #1516 (59fca7d) into branch-0.19 (b442f3b) will decrease coverage by 1.00%.
The diff coverage is 0.00%.

@@               Coverage Diff               @@
##           branch-0.19    #1516      +/-   ##
===============================================
- Coverage        60.54%   59.53%   -1.01%     
===============================================
  Files               70       72       +2     
  Lines             3153     3188      +35     
===============================================
- Hits              1909     1898      -11     
- Misses            1244     1290      +46

Impacted Files	Coverage Δ
python/cugraph/sampling/__init__.py	`0.00% <0.00%> (ø)`
python/cugraph/sampling/random_walks.py	`0.00% <0.00%> (ø)`
python/cugraph/utilities/utils.py	`69.23% <0.00%> (-3.08%)`	⬇️
python/cugraph/structure/number_map.py	`63.82% <0.00%> (-2.04%)`	⬇️
python/cugraph/_version.py	`44.40% <0.00%> (-0.40%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b442f3b...59fca7d. Read the comment docs.

BradReesWork

Please add tests

afender

Please add Random Walk to:

the main __init__.py
the .rst file
the list of algorithms in the readme

python/cugraph/sampling/random_walks.py

afender

Thanks, @jnke2016 !

BradReesWork · 2021-04-07T18:34:27Z

python/cugraph/tests/test_random_walks.py

+    directed=False,
+    max_depth=None
+):
+    """


your description text does not match the function API

Yes, I can either make changes to match this API's description or modify the description

rlratzel

The cython code looks fine, but I just had some comments/suggestions about the API and py tests.

rlratzel · 2021-04-07T18:47:03Z

python/cugraph/sampling/random_walks.py

+        Use weight parameter if weights need to be considered
+        (currently not supported)


I wonder if this last sentence should just be removed until we support weights, since it's a little confusing now (ie. is that referring to a weight parameter to this function, or is that referring to just a weighted graph, etc.)

This should be removed. There is no weight parameter

rlratzel · 2021-04-07T18:48:26Z

python/cugraph/sampling/random_walks.py

+    seeds_offsets: cudf.Series
+        Series containing the starting offset in the returned edge list
+        for each vertex in start_vertices.
+    """


Only if there's time for this: a lot of our other docstrings also include examples, and it might be nice to have an example for this too. It might be especially useful since the output type is somewhat unique.
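To illustrate why the output type benefits from an example, here is a minimal sketch of how a flat edge list plus per-seed offsets can be split back into individual walks. It uses plain Python lists instead of cudf, and the data, the `split_paths` helper, and the column names are hypothetical; it is not the example that ended up in the docstring.

```python
# Hypothetical flat result: two walks stored back to back as (src, dst) edges.
src = [0, 1, 2, 5, 6]
dst = [1, 2, 3, 6, 7]
start_vertices = [0, 5]
# offsets[i] is where the walk for start_vertices[i] begins in the edge list.
offsets = [0, 3]


def split_paths(src, dst, offsets):
    """Return one list of (src, dst) edges per start vertex."""
    # Append the total length so every walk has an end bound.
    bounds = list(offsets) + [len(src)]
    return [
        list(zip(src[lo:hi], dst[lo:hi]))
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]


paths = split_paths(src, dst, offsets)
for seed, path in zip(start_vertices, paths):
    print(seed, path)
```

Each walk is the slice between its own offset and the next one, so the first edge of every slice should start at the corresponding seed.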

rlratzel · 2021-04-07T18:51:22Z

python/cugraph/sampling/random_walks.py

+        Series containing the starting offset in the returned edge list
+        for each vertex in start_vertices.
+    """
+    if max_depth is None:


If you just remove the default value of =None in the function def, Python will do this check for you:

def random_walks( G, start_vertices, max_depth ):
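A small sketch of the behavior described above: when `max_depth` has no default, Python itself raises a `TypeError` if the caller omits it, so no manual `None` check is needed. The `random_walks_sketch` function is a stand-in with the suggested signature, not the cugraph implementation.

```python
def random_walks_sketch(G, start_vertices, max_depth):
    """Stand-in with the suggested signature; the body is a placeholder."""
    return (G, start_vertices, max_depth)


message = None
try:
    # max_depth is omitted on purpose to trigger Python's own check.
    random_walks_sketch("graph", [0, 1])
except TypeError as err:
    message = str(err)
    print(message)
```

The error message names the missing argument, which is usually clearer than a hand-written check.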

rlratzel · 2021-04-07T18:56:29Z

python/cugraph/sampling/random_walks.py

+    next_path_idx = 0
+    offsets = [0]
+
+    df = cudf.DataFrame()


Our convention is that if a NetworkX graph is passed in, we return pandas dataframes (ie. we return types that are "native" to the input type). If there's no time for this, it might have to be a FIXME.

You may be able to just call this utility as is done here (you would have to test with a Nx input to be sure though).

Ok. And Brad asked to remove networkX

rlratzel · 2021-04-07T18:58:45Z

python/cugraph/sampling/random_walks_wrapper.pyx

+from rmm._lib.device_buffer cimport DeviceBuffer
+from cudf.core.buffer import Buffer
+from cython.operator cimport dereference as deref
+def random_walks(input_graph, start_vertices, max_depth):


minor: since these don't get style checks, we try to manually conform to a Python style (I think?), meaning there would be 2 blank lines between the imports and the def.

rlratzel · 2021-04-07T19:03:35Z

python/cugraph/tests/test_random_walks.py

+# =============================================================================
+DIRECTED_GRAPH_OPTIONS = [False, True]
+WEIGHTED_GRAPH_OPTIONS = [False, True]
+DATASETS = [pytest.param(d) for d in utils.DATASETS]


I've been adding ids to make it easier to see what dataset is being run in the event of a failure:

Suggested change

-DATASETS = [pytest.param(d) for d in utils.DATASETS]
+DATASETS = [pytest.param(d, id=f"dataset={d.as_posix()}")
+            for d in utils.DATASETS]

rlratzel · 2021-04-07T19:14:08Z

python/cugraph/tests/test_random_walks.py

+# =============================================================================
+
+
+def prepare_test():


If you make this a setup function, pytest will automatically call it for you before each test, as done here.

rlratzel · 2021-04-07T19:14:29Z

python/cugraph/tests/test_random_walks.py

+    max_depth
+):
+    """Test calls random_walks an invalid type"""
+    prepare_test()


You can remove this line if you change the above function to a setup function.

rlratzel · 2021-04-07T19:15:31Z

python/cugraph/tests/test_random_walks.py

+    graph_file,
+    directed
+):
+    max_depth = random.randint(2, 10)


If the test fails, will the user know what randomly chosen max_depth was used to get the results? This may need to be printed somewhere too so devs can reproduce the error if necessary.
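One possible way to address this, sketched with the standard library only: draw `max_depth` from a seeded generator and put both the seed and the value into the assertion message, so a failure report carries what is needed to reproduce it. The `pick_max_depth` helper and the `SEED` constant are hypothetical names, not part of the PR.

```python
import random


def pick_max_depth(seed):
    """Return a reproducible depth in [2, 10] for the given seed."""
    rng = random.Random(seed)
    return rng.randint(2, 10)


SEED = 42  # hypothetical fixed seed; could also be logged per run
max_depth = pick_max_depth(SEED)

# The message is shown only on failure and tells the developer how to rerun.
assert 2 <= max_depth <= 10, f"seed={SEED}, max_depth={max_depth}"
```

An alternative with the same effect is to parametrize the test over a few fixed `max_depth` values, which makes the value part of the test id.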

rlratzel · 2021-04-07T19:18:38Z

python/cugraph/tests/test_random_walks.py

+        if i == offsets[offsets_idx]:
+            if df['src'].iloc[i] != seeds[offsets_idx]:
+                invalid_seeds += 1
+                print(


This can be a FIXME, but tests probably shouldn't rely on print statements to show failures (ie. they should use specific assertions). I see that this allows you to check every path instead of stopping on the first failure, so we may need to rethink how the test works if we want both assertions instead of prints, and having it not stop on the first failure.

I borrowed the idea from test_BFS. I can fix this and fail the test when the first assertion is not met. I liked this approach because the test does not crash at the first failure and I get to see the other mismatches to find a pattern when debugging.

I agree that can be nice when you want to see everything. A FIXME will let us revisit this later in a way that can give us both assertions and the ability to see other mismatches if you'd rather do that (which means they'd be individual tests that inspect individual paths, so it would require some thought...).
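One way to get both properties discussed here, as a sketch under assumptions: collect every mismatch first, then make a single assertion whose message lists them all. The data and the `find_invalid_seeds` helper are hypothetical and use plain lists rather than the cudf DataFrame from the test.

```python
def find_invalid_seeds(src, offsets, seeds):
    """Return (path index, expected seed, actual first src) per mismatch."""
    mismatches = []
    for idx, (offset, seed) in enumerate(zip(offsets, seeds)):
        # The first edge of each path must start at its seed vertex.
        if src[offset] != seed:
            mismatches.append((idx, seed, src[offset]))
    return mismatches


# Hypothetical data: the second path starts at 9 instead of its seed 5.
src = [0, 1, 2, 9, 6]
offsets = [0, 3]
seeds = [0, 5]

mismatches = find_invalid_seeds(src, offsets, seeds)
print(mismatches)

# In a real test the check would be a single assertion such as:
#     assert not mismatches, f"{len(mismatches)} invalid seeds: {mismatches}"
```

The test still fails through an assertion, but the failure message reports every invalid seed rather than only the first one.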

BradReesWork · 2021-04-07T20:56:54Z

rerun tests

BradReesWork · 2021-04-07T23:51:30Z

@gpucibot merge

jnke2016 added 2 commits April 6, 2021 20:21

all

c853df6

all

59fca7d

jnke2016 requested a review from a team as a code owner April 7, 2021 01:40

BradReesWork requested review from BradReesWork, afender, aschaffer and rlratzel April 7, 2021 13:06

BradReesWork added this to the 0.19 milestone Apr 7, 2021

BradReesWork added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 7, 2021

BradReesWork assigned jnke2016 Apr 7, 2021

BradReesWork approved these changes Apr 7, 2021

BradReesWork requested changes Apr 7, 2021

afender requested changes Apr 7, 2021

aschaffer approved these changes Apr 7, 2021

include tests and updated README, etc

18db2dc

jnke2016 requested a review from a team as a code owner April 7, 2021 18:15

removed a print statement

bd4516d

afender approved these changes Apr 7, 2021

BradReesWork approved these changes Apr 7, 2021

rlratzel reviewed Apr 7, 2021

rapids-bot bot merged commit 63e69fc into rapidsai:branch-0.19 Apr 7, 2021

jnke2016 deleted the fea-Python-RW branch September 24, 2022 23:07
