Skip to content

Update Uniform Neighborhood Sampling API#2997

Merged
rapids-bot[bot] merged 23 commits intorapidsai:branch-23.02from
ChuckHastings:update_uniform_sampling_api
Jan 17, 2023
Merged

Update Uniform Neighborhood Sampling API#2997
rapids-bot[bot] merged 23 commits intorapidsai:branch-23.02from
ChuckHastings:update_uniform_sampling_api

Conversation

@ChuckHastings
Copy link
Collaborator

@ChuckHastings ChuckHastings commented Nov 29, 2022

Update the Uniform Neighborhood Sampling C API to address some shortcomings:

  • Return edge id, edge type, edge weight directly from graph
  • Get rid of the remove duplicates functionality
  • Improve C API testing to cover a wider range of conditions (multigraph, batch capability, error conditions)

closes #2994
closes #2597
closes #2774

@ChuckHastings ChuckHastings self-assigned this Nov 29, 2022
@ChuckHastings ChuckHastings added this to the 23.02 milestone Nov 29, 2022
@ChuckHastings ChuckHastings marked this pull request as ready for review December 9, 2022 20:09
@ChuckHastings ChuckHastings requested a review from a team as a code owner December 9, 2022 20:09
@ChuckHastings ChuckHastings added 3 - Ready for Review improvement Improvement / enhancement to an existing function non-breaking Non-breaking change and removed 2 - In Progress labels Dec 9, 2022
@codecov-commenter
Copy link

codecov-commenter commented Dec 10, 2022

Codecov Report

Base: 55.31% // Head: 55.20% // Decreases project coverage by -0.10% ⚠️

Coverage data is based on head (bcce588) compared to base (ea132d3).
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-23.02    #2997      +/-   ##
================================================
- Coverage         55.31%   55.20%   -0.11%     
================================================
  Files               148      142       -6     
  Lines              9423     9229     -194     
================================================
- Hits               5212     5095     -117     
+ Misses             4211     4134      -77     
Impacted Files Coverage Δ
...pylibcugraph/pylibcugraph/experimental/__init__.py
python/pylibcugraph/pylibcugraph/_version.py
python/pylibcugraph/pylibcugraph/__init__.py
...n/pylibcugraph/pylibcugraph/utilities/api_tools.py
...thon/pylibcugraph/pylibcugraph/testing/__init__.py
python/pylibcugraph/pylibcugraph/testing/utils.py

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@alexbarghi-nv
Copy link
Member

rerun tests

@BradReesWork
Copy link
Member

rerun tests

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add more reviews later.

std::optional<
edge_property_view_t<edge_t,
thrust::zip_iterator<thrust::tuple<edge_t const*, edge_type_t const*>>>>
edge_type_view,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be edge_id_type_view?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in next push

* This function traverses from a set of starting vertices, traversing outgoing edges and
* randomly selects from these outgoing neighbors to extract a subgraph.
*
* Output from this function a set of tuples (src, dst, edge_id, edge_type, weight, hop, label),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Output from this function a set of tuples=>Output from this function is a tuple of vectors?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in next push

d_result_edge_id.resize(new_sz, handle.get_stream());
d_result_hop.resize(new_sz, handle.get_stream());
if (d_result_weight) d_result_weight->resize(new_sz, handle.get_stream());
if (d_result_edge_type) d_result_edge_type->resize(new_sz, handle.get_stream());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we resize d_result_label as well?

i.e.
if (d_result_label) d_result_label->resize(new_sz, handle.get_stream());

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And note that up-sizing involves 1) reallocating a new array and 2) copying the old contents. Copying can add significant overhead if there are many levels.

We may better store the data for each level in a separate array and concatenate once at the end to avoid frequent resizing (and copying the same data multiple times).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a FIXME to mention this performance issue.

* @param graph_view Non-owning graph object.
* @param active_majors Device vector containing all the vertex id that are processed by
* gpus in the column communicator
* @return A tuple of device vector containing the majors, minors and weights gathered locally
Copy link
Contributor

@naimnv naimnv Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@return A tuple of device vectors containing the majors, minors, ids and optional weights, types and labels

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in next push.

Comment on lines +98 to +100
} else if constexpr (cugraph::is_thrust_tuple_of_arithmetic<EdgeProperties>::value &&
(thrust::tuple_size<EdgeProperties>::value == 3)) {
return thrust::make_optional(thrust::make_tuple(src,
Copy link
Contributor

@naimnv naimnv Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, what if we have EdgeProperties are tuple of size 4 or more?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use thrust_tuple_cat & to_thrust_tuple.

https://github.com/rapidsai/cugraph/blob/branch-23.02/cpp/include/cugraph/utilities/thrust_tuple_utils.hpp#L215
https://github.com/rapidsai/cugraph/blob/branch-23.02/cpp/include/cugraph/utilities/thrust_tuple_utils.hpp#L201

Something like

return thrust::make_optional(thrust_tuple_cat(thrust::make_tuple(src), thrust::make_tuple(dst), to_thrust_tuple(edge_properties)));

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think that thrust_tuple_cat worked on device. At least I couldn't get it to work on device, and the examples I saw were all in host code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly a working thrust_tuple_cat would be a better long-term solution.

In our current code base, this set of if statements cover all of the possible cases. Since this is resolved at compile time, I'm comfortable leaving this as tech debt, unless you think there's a way to make thrust_tuple_cat do what we want.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a FIXME about using thrust_tuple_cat

@alexbarghi-nv
Copy link
Member

I think there may be a logic error somewhere; When testing on MG, I'm getting intermittent frontier out of range errors for valid vertex ids, and sometimes arrays of incorrect size back from sample_edges. Also, I'm getting some garbage values for edge types. SG appears to be ok after my fix to copy only add_sz elements in uniform_neighbor_sample_impl.

Copy link
Member

@alexbarghi-nv alexbarghi-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few things, might have more comments later

@alexbarghi-nv
Copy link
Member

There are also a couple C++ bugs related to edge properties I've fixed and working on a PR for that will stop this from working correctly.

@alexbarghi-nv
Copy link
Member

Finally, I'm still seeing the frontier out of range errors which are blocking further testing.

* This function traverses from a set of starting vertices, traversing outgoing edges and
* randomly selects from these outgoing neighbors to extract a subgraph.
*
* Output from this function is a tuple of vectors (src, dst, edge_id, edge_type, weight, hop, label),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(src, dst, edge_id, edge_type, wieght, ...)=>(src, dst, weight, edge_id, edge_type, ...) to match the actual return order.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated documentation

*
* Output from this function is a tuple of vectors (src, dst, edge_id, edge_type, weight, hop, label),
* identifying the randomly selected edges. src is the source vertex, dst is the destination
* vertex, edge_id identifies the edge id, edge_type identifies the edge type, weight is the edge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edge_id identifies the edge id, edge_type identifies the edge type, weight is the edge weight,
=>
weight is the edge weight, edge_id identifies the edge ID, edge_type identifies the edge type,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated documentation

* Output from this function is a tuple of vectors (src, dst, edge_id, edge_type, weight, hop, label),
* identifying the randomly selected edges. src is the source vertex, dst is the destination
* vertex, edge_id identifies the edge id, edge_type identifies the edge type, weight is the edge
* weight, hop identifies which hop the edge was encountered in. Label is optional, if input labels
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edge weights, edge IDs, and edge types are optional, too, right? Shouldn't we document this as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may mislead users that only label is optional.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated documentation

*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weights. Needs to be a floating point type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation for edge_type_t & store_transposed are missing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated documentation

* handles to various CUDA libraries) to run graph algorithms.
* @param graph_view Graph View object to generate NBR Sampling on.
* @param edge_weight_view Optional view object holding edge weights for @p graph_view.
* @param edge_type_view Optional view object holding edge types for @p graph_view.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edge_type_view=>edge_id_type_view
holding edge types=>holding edge IDs and types

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated documentation

* (true); or, without replacement (false); default = true;
* @param seed A seed to initialize the random number generator
* @return tuple device vectors (vertex_t source_vertex, vertex_t destination_vertex,
* optional weight_t weight, optional edge_t edge id, optional edge_type_t edge type, int32 hop,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int32 hop=>int32_t hop

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated documentation

@@ -78,26 +78,63 @@ count_and_remove_duplicates(raft::handle_t const& handle,
std::move(result_src), std::move(result_dst), std::move(result_wgt), std::move(result_count));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... I guess better be deleted.

thrust::optional<thrust::tuple<vertex_t, vertex_t, W>>>
__device__ operator()(vertex_t src, vertex_t dst, thrust::nullopt_t, thrust::nullopt_t, W wgt)
template <typename vertex_t>
struct sample_edges_op_t {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha.

Comment on lines +331 to +343
rmm::device_uvector<vertex_t> d_start_vs(starting_vertices.size(), handle.get_stream());
raft::copy(
d_start_vs.data(), starting_vertices.data(), starting_vertices.size(), handle.get_stream());

std::optional<rmm::device_uvector<int32_t>> d_start_labels{std::nullopt};
if (starting_labels) {
d_start_labels = std::make_optional(
rmm::device_uvector<int32_t>(starting_labels->size(), handle.get_stream()));
raft::copy(d_start_labels->data(),
starting_labels->data(),
starting_labels->size(),
handle.get_stream());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if I am not mistaken,

If the intention is to move them and overwrite, shouldn't we better pass d_start_labels as an R-value? If we're passing d_start_labels as an L-value reference, that implies that we're willing to use d_start_labels after the function call and d_start_labels will be in a certain well-defined state after the function call.

And I guess we don't call the shuffle function in single-GPU, so this overhead can be avoided in signle-GPU, right? Then, shouldn't we better create a copy before calling a shuffle function rather than creating a copy here? And I guess it will make the intention clearer. To groupby elements based on destination GPUs, we need to modify the input, but 'starting_labels is const, so we need to make a copy. This is clear. But the intention of creating a copy to call a detail space implementation function here is not very clear I think.

@ChuckHastings ChuckHastings requested a review from a team as a code owner January 14, 2023 01:59
Copy link
Member

@alexbarghi-nv alexbarghi-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@VibhuJawa VibhuJawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question which i think is unrelated to this PR but i still think is worth asking

We currently fail for isolated vertices in the main-line version, like check out below example. Would this PR or a followup be able to handle sampling on such a vertex. I just except us to not return any samples rather than erroring like above .

CC: @ChuckHastings , @seunghwak , @alexbarghi-nv

edges_df = dask_cudf.from_cudf(cudf.DataFrame({'src':[0,1,3],'dst':[0,1,3]}), npartitions=2)
g = cugraph.MultiGraph(directed=True)
g.from_dask_cudf_edgelist(edges_df,source='src',destination='dst',legacy_renum_only=True, renumber=True)
sample_d = cugraph.dask.uniform_neighbor_sample(g, cudf.Series([2]), with_replacement=False, fanout_vals=[10])
sample_d.compute()
RuntimeError: non-success value returned from cugraph_uniform_neighbor_sample: CUGRAPH_UNKNOWN_ERROR cuGraph failure at file=/home/nfs/vjawa/dgl/cugraph/cpp/src/prims/per_v_random_select_transform_outgoing_e.cuh line=328: Invalid input argument: frontier includes out-of-range keys.
Obtained 32 stack frames
#0 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(+0x85f434) [0x7f038c094434]
#1 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(+0xb8579d) [0x7f038c3ba79d]
#2 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph6detail12sample_edgesIllfLb1EEESt5tupleIJN3rmm14device_uvectorIT_EES6_St8optionalINS4_IT1_EEEEERKN4raft8handle_tERKNS_12graph_view_tIS5_T0_Lb0EXT2_EvEES7_INS_20edge_property_view_tISH_PKS8_EEERNSC_6random8RngStateERKS6_mb+0x3f41) [0x7f038ce566f1]
#3 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph6detail23uniform_nbr_sample_implIllfLb0ELb1EEESt5tupleIJN3rmm14device_uvectorIT_EES6_NS4_IT1_EENS4_IT0_EEEERKN4raft8handle_tERKNS_12graph_view_tIS5_S9_XT2_EXT3_EvEESt8optionalINS_20edge_property_view_tIS9_PKS7_EEERS6_NSC_4spanIKiLb0ELm18446744073709551615EEEbm+0x4bf) [0x7f038ceeab6f]
#4 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph18uniform_nbr_sampleIllfLb0ELb1EEESt5tupleIJN3rmm14device_uvectorIT_EES5_NS3_IT1_EENS3_IT0_EEEERKN4raft8handle_tERKNS_12graph_view_tIS4_S8_XT2_EXT3_EvEESt8optionalINS_20edge_property_view_tIS8_PKS6_EEENSB_4spanIS4_Lb1ELm18446744073709551615EEENSP_IKiLb0ELm18446744073709551615EEEbm+0x135) [0x7f038ceebc45]
#5 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph_c.so(+0x1f5b37) [0x7f039a5a0b37]
#6 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph_c.so(cugraph_uniform_neighbor_sample+0x114) [0x7f039a5aa734]
#7 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/python3.9/site-packages/pylibcugraph/uniform_neighbor_sample.cpython-39-x86_64-linux-gnu.so(+0x7679) [0x7f04993a6679]
#8 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyObject_MakeTpCall+0x347) [0x5576d07daa57]
#9 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x55fc) [0x5576d07d6dac]
#10 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x12a7d7) [0x5576d07d07d7]
#11 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyFunction_Vectorcall+0xb9) [0x5576d07e2bb9]
#12 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(PyObject_Call+0xb4) [0x5576d07f2254]
#13 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#14 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#15 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3c3) [0x5576d07d1b73]
#16 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#17 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#18 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#19 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#20 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#21 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#22 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#23 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#24 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#25 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#26 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#27 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x14bc75) [0x5576d07f1c75]
#28 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x22f7f5) [0x5576d08d57f5]
#29 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x22f7a4) [0x5576d08d57a4]
#30 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f05caef86db]
#31 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f05ca27461f]

@alexbarghi-nv
Copy link
Member

alexbarghi-nv commented Jan 17, 2023

I have a question which i think is unrelated to this PR but i still think is worth asking

We currently fail for isolated vertices in the main-line version, like check out below example. Would this PR or a followup be able to handle sampling on such a vertex. I just except us to not return any samples rather than erroring like above .

CC: @ChuckHastings , @seunghwak , @alexbarghi-nv

edges_df = dask_cudf.from_cudf(cudf.DataFrame({'src':[0,1,3],'dst':[0,1,3]}), npartitions=2)
g = cugraph.MultiGraph(directed=True)
g.from_dask_cudf_edgelist(edges_df,source='src',destination='dst',renumber=False)
sample_d = cugraph.dask.uniform_neighbor_sample(g, start_list=cudf.Series([4677]), with_replacement=False, fanout_vals=[10])
sample_d.compute()
RuntimeError: non-success value returned from cugraph_uniform_neighbor_sample: CUGRAPH_UNKNOWN_ERROR cuGraph failure at file=/home/nfs/vjawa/dgl/cugraph/cpp/src/prims/per_v_random_select_transform_outgoing_e.cuh line=328: Invalid input argument: frontier includes out-of-range keys.
Obtained 32 stack frames
#0 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(+0x85f434) [0x7f038c094434]
#1 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(+0xb8579d) [0x7f038c3ba79d]
#2 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph6detail12sample_edgesIllfLb1EEESt5tupleIJN3rmm14device_uvectorIT_EES6_St8optionalINS4_IT1_EEEEERKN4raft8handle_tERKNS_12graph_view_tIS5_T0_Lb0EXT2_EvEES7_INS_20edge_property_view_tISH_PKS8_EEERNSC_6random8RngStateERKS6_mb+0x3f41) [0x7f038ce566f1]
#3 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph6detail23uniform_nbr_sample_implIllfLb0ELb1EEESt5tupleIJN3rmm14device_uvectorIT_EES6_NS4_IT1_EENS4_IT0_EEEERKN4raft8handle_tERKNS_12graph_view_tIS5_S9_XT2_EXT3_EvEESt8optionalINS_20edge_property_view_tIS9_PKS7_EEERS6_NSC_4spanIKiLb0ELm18446744073709551615EEEbm+0x4bf) [0x7f038ceeab6f]
#4 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph18uniform_nbr_sampleIllfLb0ELb1EEESt5tupleIJN3rmm14device_uvectorIT_EES5_NS3_IT1_EENS3_IT0_EEEERKN4raft8handle_tERKNS_12graph_view_tIS4_S8_XT2_EXT3_EvEESt8optionalINS_20edge_property_view_tIS8_PKS6_EEENSB_4spanIS4_Lb1ELm18446744073709551615EEENSP_IKiLb0ELm18446744073709551615EEEbm+0x135) [0x7f038ceebc45]
#5 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph_c.so(+0x1f5b37) [0x7f039a5a0b37]
#6 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph_c.so(cugraph_uniform_neighbor_sample+0x114) [0x7f039a5aa734]
#7 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/python3.9/site-packages/pylibcugraph/uniform_neighbor_sample.cpython-39-x86_64-linux-gnu.so(+0x7679) [0x7f04993a6679]
#8 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyObject_MakeTpCall+0x347) [0x5576d07daa57]
#9 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x55fc) [0x5576d07d6dac]
#10 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x12a7d7) [0x5576d07d07d7]
#11 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyFunction_Vectorcall+0xb9) [0x5576d07e2bb9]
#12 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(PyObject_Call+0xb4) [0x5576d07f2254]
#13 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#14 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#15 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3c3) [0x5576d07d1b73]
#16 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#17 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#18 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#19 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#20 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#21 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#22 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#23 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#24 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#25 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#26 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#27 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x14bc75) [0x5576d07f1c75]
#28 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x22f7f5) [0x5576d08d57f5]
#29 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x22f7a4) [0x5576d08d57a4]
#30 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f05caef86db]
#31 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f05ca27461f]

Edit: never mind, this is an isolated vertex case. Isolated vertices aren't supported because they are not in the edgelist. It is up to the caller to make sure the vertex ids they input as start vertices are valid. This is correct behavior.

Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only looked at the CMakeLists.txt change on behalf of cmake-codeowners

@VibhuJawa
Copy link
Member

I have a question which i think is unrelated to this PR but i still think is worth asking
We currently fail for isolated vertices in the main-line version, like check out below example. Would this PR or a followup be able to handle sampling on such a vertex. I just except us to not return any samples rather than erroring like above .
CC: @ChuckHastings , @seunghwak , @alexbarghi-nv

edges_df = dask_cudf.from_cudf(cudf.DataFrame({'src':[0,1,3],'dst':[0,1,3]}), npartitions=2)
g = cugraph.MultiGraph(directed=True)
g.from_dask_cudf_edgelist(edges_df,source='src',destination='dst',renumber=False)
sample_d = cugraph.dask.uniform_neighbor_sample(g, start_list=cudf.Series([4677]), with_replacement=False, fanout_vals=[10])
sample_d.compute()
RuntimeError: non-success value returned from cugraph_uniform_neighbor_sample: CUGRAPH_UNKNOWN_ERROR cuGraph failure at file=/home/nfs/vjawa/dgl/cugraph/cpp/src/prims/per_v_random_select_transform_outgoing_e.cuh line=328: Invalid input argument: frontier includes out-of-range keys.
Obtained 32 stack frames
#0 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(+0x85f434) [0x7f038c094434]
#1 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(+0xb8579d) [0x7f038c3ba79d]
#2 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph6detail12sample_edgesIllfLb1EEESt5tupleIJN3rmm14device_uvectorIT_EES6_St8optionalINS4_IT1_EEEEERKN4raft8handle_tERKNS_12graph_view_tIS5_T0_Lb0EXT2_EvEES7_INS_20edge_property_view_tISH_PKS8_EEERNSC_6random8RngStateERKS6_mb+0x3f41) [0x7f038ce566f1]
#3 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph6detail23uniform_nbr_sample_implIllfLb0ELb1EEESt5tupleIJN3rmm14device_uvectorIT_EES6_NS4_IT1_EENS4_IT0_EEEERKN4raft8handle_tERKNS_12graph_view_tIS5_S9_XT2_EXT3_EvEESt8optionalINS_20edge_property_view_tIS9_PKS7_EEERS6_NSC_4spanIKiLb0ELm18446744073709551615EEEbm+0x4bf) [0x7f038ceeab6f]
#4 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph.so(_ZN7cugraph18uniform_nbr_sampleIllfLb0ELb1EEESt5tupleIJN3rmm14device_uvectorIT_EES5_NS3_IT1_EENS3_IT0_EEEERKN4raft8handle_tERKNS_12graph_view_tIS4_S8_XT2_EXT3_EvEESt8optionalINS_20edge_property_view_tIS8_PKS6_EEENSB_4spanIS4_Lb1ELm18446744073709551615EEENSP_IKiLb0ELm18446744073709551615EEEbm+0x135) [0x7f038ceebc45]
#5 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph_c.so(+0x1f5b37) [0x7f039a5a0b37]
#6 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/libcugraph_c.so(cugraph_uniform_neighbor_sample+0x114) [0x7f039a5aa734]
#7 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/lib/python3.9/site-packages/pylibcugraph/uniform_neighbor_sample.cpython-39-x86_64-linux-gnu.so(+0x7679) [0x7f04993a6679]
#8 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyObject_MakeTpCall+0x347) [0x5576d07daa57]
#9 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x55fc) [0x5576d07d6dac]
#10 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x12a7d7) [0x5576d07d07d7]
#11 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyFunction_Vectorcall+0xb9) [0x5576d07e2bb9]
#12 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(PyObject_Call+0xb4) [0x5576d07f2254]
#13 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#14 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#15 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3c3) [0x5576d07d1b73]
#16 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#17 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#18 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#19 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#20 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#21 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x3d99) [0x5576d07d5549]
#22 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#23 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#24 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#25 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(_PyEval_EvalFrameDefault+0x672) [0x5576d07d1e22]
#26 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x13cec3) [0x5576d07e2ec3]
#27 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x14bc75) [0x5576d07f1c75]
#28 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x22f7f5) [0x5576d08d57f5]
#29 in /datasets/vjawa/miniconda3/envs/all_cuda-115_arch-x86_64/bin/python(+0x22f7a4) [0x5576d08d57a4]
#30 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f05caef86db]
#31 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f05ca27461f]

Edit: never mind, this is an isolated vertex case. Isolated vertices aren't supported because they are not in the edgelist. It is up to the caller to make sure the vertex ids they input as start vertices are valid. This is correct behavior.

Can we support skipping isolated vertices, from a GNN standpoint the train/test node ids of common datasets have node_ids which are isolated vertices and i dont have a straight forward efficient way of filtering it out.

@alexbarghi-nv
Copy link
Member

@VibhuJawa removing isolated vertices from a list is something we would have to take up with the C++ team as a separate issue. That would have to be its own algorithm.

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

* This function traverses from a set of starting vertices, traversing outgoing edges and
* randomly selects from these outgoing neighbors to extract a subgraph.
*
* Output from this function is a tuple of vectors (src, dst, weight_t, edge_id, edge_type, hop,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weight_t=>weight

@ChuckHastings
Copy link
Collaborator Author

@VibhuJawa removing isolated vertices from a list is something we would have to take up with the C++ team as a separate issue. That would have to be its own algorithm.

If you want to create an issue for this, please do.

The C API could be modified to filter out the vertices that are not present in the graph. But I would definitely rather do that as a separate activity at this point so we can merge this PR and unblock Alex.

@ChuckHastings
Copy link
Collaborator Author

/merge

@rapids-bot rapids-bot bot merged commit f7d99ff into rapidsai:branch-23.02 Jan 17, 2023
rapids-bot bot pushed a commit that referenced this pull request Jan 23, 2023
Resolves #3073 
Resolves rapidsai/graph_dl#72
Resolves #2562 
Resolves rapidsai/graph_dl#49
Resolves #2871 

Takes Chuck's changes from #2997 and implements the necessary changes in Python.  Adds tests in Python for the handling of edge id, edge type, etc.  Also updates Python tests to reflect the removal of `count_and_remove_duplicates`

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Rick Ratzel (https://github.com/rlratzel)
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Vibhu Jawa (https://github.com/VibhuJawa)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Rick Ratzel (https://github.com/rlratzel)

URL: #3082
@ChuckHastings ChuckHastings deleted the update_uniform_sampling_api branch September 27, 2023 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improvement / enhancement to an existing function non-breaking Non-breaking change

Projects

None yet

8 participants