Define C API and implement induced subgraph #2854
rapids-bot[bot] merged 6 commits into rapidsai:branch-22.12
Conversation
```cpp
d_vertices.begin(),
[local_vertex_first] __device__(vertex_t v) { return v - local_vertex_first; });

d_local_values.resize(local_vertex_last - local_vertex_first, handle.get_stream());
```
Not a major issue, but could `static_cast<size_t>(thrust::distance(local_vertex_first, local_vertex_last))` be used here?
I could, but these are integer types (vertex_t) so subtraction feels more natural to me in this case.
Yeah... thrust::distance is for iterators not for scalars.
My bad, they are not pointers/iterators here.
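The distinction discussed above can be sketched on the host with `std::distance` (the standard-library counterpart of `thrust::distance`); the function names here are illustrative only, not cuGraph code:

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// std::distance (like thrust::distance) is defined for iterators...
std::size_t iterator_count(std::vector<int> const& v) {
  return static_cast<std::size_t>(std::distance(v.begin(), v.end()));
}

// ...while plain subtraction is the natural choice for integral vertex ids
// such as vertex_t.
std::size_t scalar_count(int local_vertex_first, int local_vertex_last) {
  return static_cast<std::size_t>(local_vertex_last - local_vertex_first);
}
```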
seunghwak left a comment
Looks good overall, but I have a few suggestions.
And I am still thinking about improving expand_sparse_offsets, so expect a few more comments.
```cpp
namespace cugraph {
namespace detail {
```
A short description (documentation) of this function would be helpful.
Here, d_vertices should cover the entire set of vertices (assigned to this GPU in multi-GPU), right?
This is the implementation file. Documentation for this function is included in the header cugraph/detail/shuffle_wrappers.hpp.
```cpp
// returns thrust::nullopt if the destination vertex has
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
```
This comment looks outdated.
Copy/paste error, corrected in next push.
```cpp
// vertex has a property of 1
//
return_type __device__ operator()(
  thrust::tuple<vertex_t, size_t> src, vertex_t dst, weight_t wgt, property_t sv, property_t dv)
```
src => tagged_src might be clearer.
Fixed in next push.
```cpp
// returns thrust::nullopt if the destination vertex has
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
```
This comment looks outdated.
Fixed in next push.
```cpp
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
return_type __device__ operator()(thrust::tuple<vertex_t, size_t> src,
```
src => tagged_src might be clearer.
Fixed in next push.
```cpp
handle.get_stream());
thrust::sort(handle.get_thrust_policy(),
             thrust::make_zip_iterator(graph_ids_v.begin(), dst_subgraph_vertices_v.begin()),
             thrust::make_zip_iterator(graph_ids_v.end(), dst_subgraph_vertices_v.end()));
```
This copy might be unnecessary.
We may do something like the following:

```cpp
raft::device_span<size_t const> dst_subgraph_offsets{};
raft::device_span<vertex_t const> dst_subgraph_vertices{};
std::conditional_t<multi_gpu, rmm::device_uvector<size_t>, byte_t /* dummy */> tmp_dst_subgraph_offsets{};
std::conditional_t<multi_gpu, rmm::device_uvector<vertex_t>, byte_t /* dummy */> tmp_dst_subgraph_vertices{};
if constexpr (multi_gpu) {
  // update tmp_dst_subgraph_offsets|vertices as before;
  dst_subgraph_offsets  = raft::device_span<size_t const>(tmp_dst_subgraph_offsets.data(), tmp_dst_subgraph_offsets.size());
  dst_subgraph_vertices = raft::device_span<vertex_t const>(tmp_dst_subgraph_vertices.data(), tmp_dst_subgraph_vertices.size());
} else {
  dst_subgraph_offsets  = raft::device_span<size_t const>(subgraph_offsets.data(), subgraph_offsets.size());
  dst_subgraph_vertices = raft::device_span<vertex_t const>(subgraph_vertices.data(), subgraph_vertices.size());
}
```
If we do this, the input parameter check deleted above is still relevant.
And graph_ids_v is relevant only in the multi-GPU pass, so its declaration may better be moved inside the if constexpr (multi_gpu) {} block.
I started with something like your suggestion. I could go back to that, if we prefer that approach.
During testing, I discovered that the MG path does not require the input subgraph specifications to be sorted, because we have to sort the data after we shuffle. By adding this copy/sort we can actually drop the requirement that the input be pre-sorted. I dropped the check from the expensive checks - although I just observed that I did not update the documentation in graph_functions.hpp.
If we think that this extra memory/sorting cost isn't worth it in SG, I can go back to requiring the input to be sorted and just use the provided input in SG.
Based on the work below, I think the graph_ids_v will still be required for populating the frontier.
I see, I will let you make the call. This might be more of a strategic decision than a purely technical one. It really depends on whether we want to sacrifice a little performance/memory footprint in the SG path for better usability in both SG & MG paths.
I took out the requirement (from the documentation) that the data be sorted and made both paths sort. We can always revisit this later if we decide the sorting overhead needs to be removed in the SG case.
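The sort-both-paths decision above can be sketched on the host with `std::sort` over (graph id, vertex) pairs, a stand-in for the device-side `thrust::sort` over a zip iterator; the function name is illustrative, not cuGraph code:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Host analogue of the thrust::sort over a zip of (graph id, subgraph vertex):
// lexicographic sort groups each subgraph's vertices together, so callers no
// longer need to pass pre-sorted input in either the SG or MG path.
std::vector<std::pair<std::size_t, int>>
sort_subgraph_vertices(std::vector<std::pair<std::size_t, int>> tagged) {
  std::sort(tagged.begin(), tagged.end());  // by graph id first, then vertex
  return tagged;
}
```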
```cpp
// fill the edge list buffer (to be returned) for each vertex in the aggregate subgraph vertex
// list (use the offsets computed in Phase 1)
thrust::for_each(
  // FIXME: Shouldn't there be a dummy property equivalent here?
```
Need to do something similar.
Thanks for the reference. I guess I was just missing the .view(). Fixed in next push.
```cpp
graph_view,
vertex_frontier.bucket(0),
// edge_src_dummy_property_t{},
// edge_dst_dummy_property_t{},
```
edge_src_dummy_property_t{}.view()
```cpp
: std::nullopt;
vertex_frontier.bucket(0).insert(pair_begin + h_subgraph_offsets[bucket_idx],
                                 pair_begin + h_subgraph_offsets[bucket_idx + 1]);
});
```
A concern with this code is that, if we are extracting thousands of subgraphs, this will create a long chain of kernel launches with little work in any single kernel.
Actually, we can use the same mechanism used to create graph_ids_v above, create a zip iterator of subgraph vertices and graph indices, and insert everything in a single kernel call.
I can do that and defer releasing the graph_ids_v memory until after we've populated the frontier. Will look at that for the next push.
```cpp
template <typename vertex_t, typename edge_t>
rmm::device_uvector<vertex_t> expand_sparse_offsets(raft::device_span<edge_t const> offsets,
                                                    vertex_t base_vertex_id,
                                                    rmm::cuda_stream_view stream_view)
```
Can we use typename vertex_t here?

```cpp
auto graph_ids_v = cugraph::detail::expand_sparse_offsets(
  raft::device_span<size_t const>(d_subgraph_edge_offsets.data(),
                                  d_subgraph_edge_offsets.size()),
  vertex_t{0},
  handle_->get_stream());
```

Here, d_subgraph_edge_offsets.size() can theoretically overflow vertex_t, right? The number of subgraphs to extract can be larger than the number of vertices in the graph.
And offsets.back() isn't really the number of edges in this context, but the total number of vertices in all the subgraphs, and this can overflow edge_t as well (it may not happen in most cases, but we really can't say it will never happen).
Should we instead use idx_t or size_t for vertex_t and offset_t for edge_t?
This is actually just a label in the templated function... I used vertex_t and edge_t out of habit more than anything else. idx_t and offset_t would definitely be better labels for this function.
Changed in the next push. Also updated some variable names.
```cpp
{
  edge_t num_edges{0};

  if (offsets.size() > 0) {
```
So, offsets.size() should be at least 0 + 1, so I guess we'd better assume offsets.size() > 0? (add assert(offsets.size() > 0)?)
There was, during my debugging, a case in MG where on some of the partitions the input was empty; I added this check to make sure that didn't result in a crash. I will change it to an assert.
Even if the input is empty (i.e. # subgraphs is 0), the offsets.size() should be 0 + 1 = 1 and > 0, right? In this case offsets should be [0] if I am not mistaken.
If it is specified correctly :-). I believe it was a bug in my test code which resulted in the error.
```cpp
offsets.begin() + 1,
offsets.end(),
[d_vertices = vertices.data(), n_vertices = vertices.size()] __device__(auto offset) {
  if (offset < n_vertices) { atomicAdd(&d_vertices[offset], vertex_t{1}); }
```
Shouldn't this (offset < n_vertices) always be true? I guess this check is unnecessary.
If the last subgraph list is empty we end up with multiple entries in the array where offset == n_vertices. This check is to catch that condition and not update out-of-bounds in the vertices array.
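The empty-trailing-list case can be reproduced with a small host sketch mirroring the atomicAdd lambda above (illustrative names, not cuGraph code):

```cpp
#include <cstddef>
#include <vector>

// Counts, per expanded slot, how many offset boundaries land on it, mirroring
// the device-side atomicAdd. With trailing empty lists, the last offsets equal
// the total size (offsets.back()), so the bounds check skips them rather than
// writing past the end of the counts array.
std::vector<int> boundary_counts(std::vector<std::size_t> const& offsets) {
  std::size_t total = offsets.empty() ? 0 : offsets.back();
  std::vector<int> counts(total, 0);
  for (std::size_t i = 1; i < offsets.size(); ++i) {
    if (offsets[i] < total) { counts[offsets[i]] += 1; }  // skip out-of-bounds
  }
  return counts;
}
```

With offsets {0, 2, 3, 3, 3} (the last two lists empty), the boundaries 3, 3, 3 all equal the total size and are skipped.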
rerun tests

rerun tests

1 similar comment

rerun tests
Codecov Report

Additional details and impacted files:

```
@@           Coverage Diff            @@
##           branch-22.12   #2854   +/- ##
==========================================
  Coverage              ?   62.61%
==========================================
  Files                 ?      114
  Lines                 ?     6395
  Branches              ?        0
==========================================
  Hits                  ?     4004
  Misses                ?     2391
  Partials              ?        0
==========================================
```

View full report at Codecov.
@gpucibot merge
Add capability to C API to create an SG graph from an existing CSR. Dependent on PR #2854. Closes #2508

Authors:
- Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
- Seunghwa Kang (https://github.com/seunghwak)
- Joseph Nke (https://github.com/jnke2016)
- Naim (https://github.com/naimnv)
- Rick Ratzel (https://github.com/rlratzel)

URL: #2856
Defines the C API and provides an implementation of induced subgraph.
Closes #2530
Closes #2532
Closes #2533