Merged
24 commits
fe5c3fd
Propose new API to improve efficient of MG sampling in end-to-end wor…
ChuckHastings Feb 14, 2023
e9fcd31
respond to PR review feedback
ChuckHastings Feb 14, 2023
ebb51a7
Update API based on PR comments
ChuckHastings Feb 16, 2023
fd81f64
implement new uniform neighborhood sampling API
ChuckHastings Feb 23, 2023
12de841
Merge branch 'branch-23.04' into uniform_neighbor_sampling_tuning
ChuckHastings Feb 23, 2023
a451397
missed a few formatting issues
ChuckHastings Feb 23, 2023
75e6acb
update pylibcugraph files to add new parameters to uniform neighborho…
ChuckHastings Feb 23, 2023
38ff456
some cleanup to make names consistent
ChuckHastings Feb 23, 2023
c4d5f0b
debug failing C API test
ChuckHastings Feb 23, 2023
47ed81d
improve tests, some debugging
ChuckHastings Feb 24, 2023
e1c8298
Merge branch 'branch-23.04' into uniform_neighbor_sampling_tuning
ChuckHastings Feb 24, 2023
a09713c
address some PR comments
ChuckHastings Mar 7, 2023
d04ac14
address PR comments with new structure
ChuckHastings Mar 13, 2023
994c52c
update names of parameters in PLC
ChuckHastings Mar 13, 2023
ff49397
Merge branch 'branch-23.04' into uniform_neighbor_sampling_tuning
ChuckHastings Mar 13, 2023
9461846
fix formatting errors
ChuckHastings Mar 13, 2023
5fd2d17
address PR comments
ChuckHastings Mar 15, 2023
cca68de
rename is_span_sorted to is_sorted
ChuckHastings Mar 16, 2023
ccbb3d2
add unit test to check shuffling, need to sort before shuffling
ChuckHastings Mar 17, 2023
27a5647
update and verify python tests
alexbarghi-nv Mar 17, 2023
7f39f16
refactor code to group test seeds into batches
ChuckHastings Mar 20, 2023
99a7a98
Merge branch 'branch-23.04' into uniform_neighbor_sampling_tuning
ChuckHastings Mar 20, 2023
b63f601
Merge branch 'branch-23.04' into uniform_neighbor_sampling_tuning
alexbarghi-nv Mar 21, 2023
b575a5d
Merge branch 'branch-23.04' into uniform_neighbor_sampling_tuning
ChuckHastings Mar 21, 2023
63 changes: 46 additions & 17 deletions cpp/include/cugraph/algorithms.hpp
@@ -1711,60 +1711,89 @@ k_core(raft::handle_t const& handle,
* randomly selects from these outgoing neighbors to extract a subgraph.
*
* Output from this function is a tuple of vectors (src, dst, weight, edge_id, edge_type, hop,
* label), identifying the randomly selected edges. src is the source vertex, dst is the
* label, offsets), identifying the randomly selected edges. src is the source vertex, dst is the
* destination vertex, weight (optional) is the edge weight, edge_id (optional) identifies the edge
* id, edge_type (optional) identifies the edge type, hop identifies which hop the edge was
* encountered in, label (optional) identifies which vertex label this edge was derived from.
 * encountered in. The label output (optional) identifies the vertex label. The offsets array
* (optional) will be described below and is dependent upon the input parameters.
*
*
 * If @p starting_vertex_labels is not specified, then no organization is applied to the output;
 * the label and offsets values in the return set will be std::nullopt.
*
* If @p starting_vertex_labels is specified and @p label_to_output_comm_rank is not specified then
 * the label output is populated. This will also result in the output being sorted by vertex label.
* The offsets array in the return will be a CSR-style offsets array to identify the beginning of
* each label range in the data. `labels.size() == (offsets.size() - 1)`.
*
* If @p starting_vertex_labels is specified and @p label_to_output_comm_rank is specified then the
 * label output is populated. This will also result in the output being sorted by vertex label. The
* offsets array in the return will be a CSR-style offsets array to identify the beginning of each
* label range in the data. `labels.size() == (offsets.size() - 1)`. Additionally, the data will
* be shuffled so that all data with a particular label will be on the specified rank.
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weights. Needs to be a floating point type.
* @tparam edge_type_t Type of edge type. Needs to be an integral type.
* @tparam label_t Type of label. Needs to be an integral type.
* @tparam store_transposed Flag indicating whether sources (if false) or destinations (if
* true) are major indices
* @tparam multi_gpu Flag indicating whether template instantiation should target single-GPU (false)
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param graph_view Graph View object to generate NBR Sampling on.
* @param edge_weight_view Optional view object holding edge weights for @p graph_view.
* @param edge_id_type_view Optional view object holding edge ids and types for @p graph_view.
* @param starting_vertices Device vector of starting vertex IDs for the sampling.
* @param starting_labels Optional device vector of starting vertex labels for the sampling.
* @param edge_id_view Optional view object holding edge ids for @p graph_view.
* @param edge_type_view Optional view object holding edge types for @p graph_view.
* @param starting_vertices Device span of starting vertex IDs for the sampling.
* In a multi-gpu context the starting vertices should be local to this GPU.
 * @param starting_vertex_labels Optional device span of labels associated with each starting vertex
* for the sampling.
* @param label_to_output_comm_rank Optional tuple of device spans mapping label to a particular
 * output rank. Element 0 of the tuple identifies the label, Element 1 of the tuple identifies the
* output rank. The label span must be sorted in ascending order.
* @param fan_out Host span defining branching out (fan-out) degree per source vertex for each
* level
* @param rng_state A pre-initialized raft::RngState object for generating random numbers
* @param return_hops boolean flag specifying if the hop information should be returned
* @param with_replacement boolean flag specifying if random sampling is done with replacement
* (true); or, without replacement (false); default = true;
* @param rng_state A pre-initialized raft::RngState object for generating random numbers
* @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
* @return tuple device vectors (vertex_t source_vertex, vertex_t destination_vertex,
* optional weight_t weight, optional edge_t edge id, optional edge_type_t edge type, int32_t hop,
* optional int32_t label)
* optional weight_t weight, optional edge_t edge id, optional edge_type_t edge type,
* optional int32_t hop, optional label_t label, optional size_t offsets)
*/
template <typename vertex_t,
typename edge_t,
typename weight_t,
typename edge_type_t,
typename label_t,
bool store_transposed,
bool multi_gpu>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<rmm::device_uvector<edge_t>>,
std::optional<rmm::device_uvector<edge_type_t>>,
rmm::device_uvector<int32_t>,
std::optional<rmm::device_uvector<int32_t>>>
std::optional<rmm::device_uvector<int32_t>>,
std::optional<rmm::device_uvector<label_t>>,
std::optional<rmm::device_uvector<size_t>>>
uniform_neighbor_sample(
raft::handle_t const& handle,
Contributor @seunghwak, Feb 14, 2023:
So, these are what this API should satisfy.

  • If starting_vertices includes seeds from multiple batches, the caller should be able to separate sampling outputs from different batches and the caller should be able to place sampling outputs on different GPUs based on mapping between batch IDs and GPUs. If the caller is invoking this function for a single batch, the caller may want the sampling outputs for the local sampling_vertices to be stored locally as well.
  • For the output stored on a single GPU, we should be able to sort them based on batch IDs.
  • Once we sort based on the batch ID, some users may want to sort based on hops as a primary key (if they don't need to create a separate tree per seed within a batch) or seed IDs as a primary key and hops as a secondary key (if they need to create a separate tree per seed).

Contributor:
I am thinking about the following,

This function optionally takes rmm::device_uvector<label_t>&& labels and rmm::device_uvector<int>&& dst_comm_ranks (labels.size() == dst_comm_ranks.size() == start_vertices.size()).
Here labels don't necessarily coincide with batch IDs, but users may set labels to batch_id * batch_size + seed index within a batch (if the result needs to be sorted within a batch using seed IDs as a primary key and hops as a secondary key) or just batch_id (if there is no need to create a tree per seed). The output results are shuffled based on dst_comm_ranks for each starting vertex (if dst_comm_ranks is valid); otherwise, all the results for local seeds will be stored locally.

We return optional labels and hops for each edge data (src, dst, optional (edge weight, ID, type)). Within each GPU, the output results will be sorted using labels as a primary key (if labels are provided in the input) and hops as a secondary key.

One concern is that computing global "seed index within a batch" might be challenging for the caller if they want to distribute seeds for a single batch to multiple GPUs, but I guess this is not a common use case. If there are 100 batches and 10 GPUs, users may assign batch [0-10) to GPU0 (so start_vertices for GPU0 holding all the seeds for batch [0-10)), batch [10-20) to GPU1, and so on.

We may be able to reduce the output size if we don't return label and hop for every edge and instead create offset arrays per label & hop, but I am not sure the benefit outweighs the increase in complexity. If the memory footprint is a major concern, we may just reduce the number of batches we process per call (as the main reason for processing multiple batches in a single call is to saturate GPUs, but if we're hitting the memory limit, we might be going too far in this direction).

Am I missing anything, or are there any other thoughts?

Collaborator (author):
This is a good point; there's a lot to unpack here. Let me create a few separate comments.

The original objective of processing multiple batches in a single call is to efficiently utilize the parallelism on the GPU. If we do individual calls with a small batch size then we don't get enough parallelism and the overhead of calling the function is large relative to the work accomplished. By processing multiple batches of seeds in the same call we increase the amount of parallel work to be done on the GPU and we reduce the number of times we face the overhead of calling the function, allocating memory, launching kernels, etc. This improves our overall efficiency.

I don't know where the threshold is, but at some point there are enough vertices in a call to get the efficiencies we are looking for. Doing larger numbers of batches in a single call should still drive down overhead, but the improvement should decrease dramatically once we reach saturation of the GPU parallelism. It seems (with the large graphs we are targeting) that this threshold is much less than "all seeds to be sampled in a training epoch".

In the current pipeline, the output from this processing is going to be moved out of GPU memory to be retrieved by the trainers as they work (currently written at the python layer to parquet files). As long as we process in the first call enough batches to keep the trainers busy while we do a second call, there would be no drop in pipeline throughput by making multiple calls to the sampling code with smaller numbers of batches. That would reduce the memory requirement, and we can overlap the sampling with the training a bit more. It might actually improve the overall latency for training an epoch (although perhaps only marginally, I don't really know the performance numbers you're dealing with).

This does suggest that the extra memory required to drive this function doesn't need to be a driving factor in the design of the API.

Collaborator (author):
Regarding the labels discussion. Part of the reason I went with the name "label" rather than "batch" was this notion that batches might not be the only reason for labeling the data.

There's absolutely no reason that the sampling code needs to understand anything about how the caller assigns labels and uses them. If we keep with the label as a vector (labels.size() == start_vertices.size()) rather than switching to a CSR-style representation, that label can literally be any arbitrary 32-bit integer value (it doesn't need to be contiguous values starting from 0). Either route for organizing the result (mapping label to GPU id, or providing a destination vector with dst_comm_ranks.size() == start_vertices.size()) allows us to shuffle the data to the correct GPU. We could talk about sorting options (beyond sorting by label) if that would be helpful, although that seems like an easy enough feature to add later when we need it.

It seems like many of the use cases you describe can be addressed by the caller creating the labels in a more sophisticated way and then grouping the results once they are returned. There's no reason the caller couldn't create labels such that the training batch is decomposed into k labels that all get mapped to the same GPU, with the results combined by the caller. That allows construction of the trees, and it allows the same seed to be repeated within a batch while still differentiating which tree came from which seed.

The beauty there is that the responsibility for managing the sophistication lies in the caller rather than in the library function.
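The caller-side labeling scheme sketched in the comments above (batch_id * batch_size + seed index) can be illustrated with a small host-side example. This is a hypothetical helper with made-up names, not part of the cugraph API; it only shows how a caller might encode one distinct label per (batch, seed) so that a separate sampling tree can be reconstructed per seed after the call:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical caller-side labeling: one distinct label per (batch, seed).
// batch_size is the caller's own constant, not a library parameter.
std::vector<int32_t> make_labels(int32_t num_batches, int32_t batch_size)
{
  std::vector<int32_t> labels;
  labels.reserve(static_cast<std::size_t>(num_batches) * batch_size);
  for (int32_t b = 0; b < num_batches; ++b)
    for (int32_t s = 0; s < batch_size; ++s)
      labels.push_back(b * batch_size + s);  // unique per (batch, seed)
  return labels;
}
```

After sampling, the caller can decode label / batch_size to recover the batch and label % batch_size to recover the seed index within the batch; the library itself never needs to know this encoding.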

Contributor @seunghwak, Feb 14, 2023:
> I don't know where the threshold is, but at some point there are enough vertices in a call to get the efficiencies we are looking for.

I guess if the memory footprint for the sampling is a significant percentage (say more than 10%) of the GPU's total memory, we may not be too far from saturating the GPU. And I agree that "This does suggest that the extra memory required to drive this function doesn't need to be a driving factor in the design of the API." We still need to be frugal with memory, but it should not be the #1 priority when we need to make trade-offs.

Contributor:

> Regarding the labels discussion. Part of the reason I went with the name "label" rather than "batch" was this notion that batches might not be the only reason for labeling the data.
> There's absolutely no reason that the sampling code needs to understand anything about how the caller assigns labels and uses them.

Yes, now I agree with this. And I just want to emphasize that this API should allow creating one tree per seed within a batch, so we need to provide a mechanism to distinguish sampled edges from different seeds, and labels can serve this purpose (32 bits might be sufficient for the foreseeable future, but in a few GPU generations, especially with Grace Hopper-like systems, we may exceed the 32-bit boundary, so we may keep it as label_t rather than int32_t).

Collaborator (author):
I will add a label_t and only instantiate int32_t for now.

graph_view_t<vertex_t, edge_t, store_transposed, multi_gpu> const& graph_view,
std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view,
std::optional<
edge_property_view_t<edge_t,
thrust::zip_iterator<thrust::tuple<edge_t const*, edge_type_t const*>>>>
edge_id_type_view,
rmm::device_uvector<vertex_t>&& starting_vertices,
std::optional<rmm::device_uvector<int32_t>>&& starting_labels,
std::optional<edge_property_view_t<edge_t, edge_t const*>> edge_id_view,
std::optional<edge_property_view_t<edge_t, edge_type_t const*>> edge_type_view,
raft::device_span<vertex_t const> starting_vertices,
std::optional<raft::device_span<label_t const>> starting_vertex_labels,
std::optional<std::tuple<raft::device_span<label_t const>, raft::device_span<int32_t const>>>
label_to_output_comm_rank,
raft::host_span<int32_t const> fan_out,
raft::random::RngState& rng_state,
bool with_replacement = true);
bool return_hops,
bool with_replacement = true,
bool do_expensive_check = false);
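The CSR-style offsets contract described in the doc comment above (labels.size() == offsets.size() - 1, with rows [offsets[i], offsets[i+1]) carrying labels[i]) can be sketched host-side. This is a hypothetical illustration in plain C++ with no cugraph or RAFT dependency, not the library's implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Given the per-edge label column of a label-sorted sampling result,
// produce the distinct labels and the CSR-style offsets array such that
// rows [offsets[i], offsets[i+1]) all carry labels[i].
std::pair<std::vector<int>, std::vector<std::size_t>> build_label_offsets(
  std::vector<int> const& row_labels)
{
  std::vector<int> labels;
  std::vector<std::size_t> offsets;
  for (std::size_t i = 0; i < row_labels.size(); ++i) {
    if (labels.empty() || labels.back() != row_labels[i]) {
      labels.push_back(row_labels[i]);
      offsets.push_back(i);  // start of this label's range
    }
  }
  offsets.push_back(row_labels.size());  // one-past-the-end sentinel
  return {labels, offsets};
}
```

For a label column {7, 7, 7, 9, 9, 12} this yields labels {7, 9, 12} and offsets {0, 3, 5, 6}, satisfying labels.size() == offsets.size() - 1 as the documentation requires.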

/*
* @brief Compute triangle counts.
49 changes: 27 additions & 22 deletions cpp/include/cugraph/detail/shuffle_wrappers.hpp
@@ -40,23 +40,25 @@ namespace detail {
* partitioning to determine the local GPU.
* @param[in] minors Vector of second elements in vertex pairs.
* @param[in] weights Optional vector of vertex pair weight values.
* @param[in] edge_id_type_tuple Optional tuple of vectors of edge id and edge type values
* @param[in] edge_ids Optional vector of vertex pair edge id values.
* @param[in] edge_types Optional vector of vertex pair edge type values.
*
* @return Tuple of vectors storing shuffled major vertices, minor vertices and optional weights.
* @return Tuple of vectors storing shuffled major vertices, minor vertices and optional weights,
* edge ids and edge types
*/
template <typename vertex_t, typename edge_t, typename weight_t, typename edge_type_id_t>
std::tuple<
rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<std::tuple<rmm::device_uvector<edge_t>, rmm::device_uvector<edge_type_id_t>>>>
shuffle_ext_vertex_pairs_to_local_gpu_by_edge_partitioning(
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<rmm::device_uvector<edge_t>>,
std::optional<rmm::device_uvector<edge_type_id_t>>>
shuffle_ext_vertex_pairs_with_values_to_local_gpu_by_edge_partitioning(
raft::handle_t const& handle,
rmm::device_uvector<vertex_t>&& majors,
rmm::device_uvector<vertex_t>&& minors,
std::optional<rmm::device_uvector<weight_t>>&& weights,
std::optional<std::tuple<rmm::device_uvector<edge_t>, rmm::device_uvector<edge_type_id_t>>>&&
edge_id_type_tuple);
std::optional<rmm::device_uvector<edge_t>>&& edge_ids,
std::optional<rmm::device_uvector<edge_type_id_t>>&& edge_types);

/**
* @brief Shuffle internal (i.e. renumbered) vertex pairs (which can be edge end points) to their
@@ -75,25 +77,28 @@ shuffle_ext_vertex_pairs_to_local_gpu_by_edge_partitioning(
* partitioning to determine the local GPU.
* @param[in] minors Vector of second elements in vertex pairs.
* @param[in] weights Optional vector of vertex pair weight values.
* @param[in] edge_id_type_tuple Optional tuple of vectors of edge id and edge type values
* @param[in] edge_ids Optional vector of vertex pair edge id values.
* @param[in] edge_types Optional vector of vertex pair edge type values.
*
* @param[in] vertex_partition_range_lasts Vector of each GPU's vertex partition range's last
* (exclusive) vertex ID.
*
* @return Tuple of vectors storing shuffled major vertices, minor vertices and optional weights.
* @return Tuple of vectors storing shuffled major vertices, minor vertices and optional weights,
* edge ids and edge types
*/
template <typename vertex_t, typename edge_t, typename weight_t, typename edge_type_id_t>
std::tuple<
rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<std::tuple<rmm::device_uvector<edge_t>, rmm::device_uvector<edge_type_id_t>>>>
shuffle_int_vertex_pairs_to_local_gpu_by_edge_partitioning(
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<rmm::device_uvector<edge_t>>,
std::optional<rmm::device_uvector<edge_type_id_t>>>
shuffle_int_vertex_pairs_with_values_to_local_gpu_by_edge_partitioning(
raft::handle_t const& handle,
rmm::device_uvector<vertex_t>&& majors,
rmm::device_uvector<vertex_t>&& minors,
std::optional<rmm::device_uvector<weight_t>>&& weights,
std::optional<std::tuple<rmm::device_uvector<edge_t>, rmm::device_uvector<edge_type_id_t>>>&&
edge_id_type_tuple,
std::optional<rmm::device_uvector<edge_t>>&& edge_ids,
std::optional<rmm::device_uvector<edge_type_id_t>>&& edge_types,
std::vector<vertex_t> const& vertex_partition_range_lasts);
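Splitting the old edge_id_type_tuple into independent optional columns lets each shuffle step treat every property column uniformly: apply the permutation if the column is present, and pass std::nullopt through otherwise. A minimal host-side sketch of that pattern (hypothetical helper, not a cugraph API; the real code permutes device buffers):

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Apply a shuffle permutation to a property column if it is present;
// an absent (nullopt) column is passed through untouched.
template <typename T>
std::optional<std::vector<T>> permute_column(
  std::optional<std::vector<T>>&& col, std::vector<std::size_t> const& perm)
{
  if (!col) return std::nullopt;
  std::vector<T> out(perm.size());
  for (std::size_t i = 0; i < perm.size(); ++i) out[i] = (*col)[perm[i]];
  return out;
}
```

With the old tuple-of-both representation, edge ids and edge types had to be present or absent together; as independent columns, a graph can carry ids without types (or vice versa) and each column is handled by the same generic path.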

/**
@@ -205,8 +210,8 @@ rmm::device_uvector<size_t> groupby_and_count_edgelist_by_local_partition_id(
rmm::device_uvector<vertex_t>& d_edgelist_majors,
rmm::device_uvector<vertex_t>& d_edgelist_minors,
std::optional<rmm::device_uvector<weight_t>>& d_edgelist_weights,
std::optional<std::tuple<rmm::device_uvector<edge_t>, rmm::device_uvector<edge_type_t>>>&
d_edgelist_id_type_pairs,
std::optional<rmm::device_uvector<edge_t>>& d_edgelist_edge_ids,
std::optional<rmm::device_uvector<edge_type_t>>& d_edgelist_edge_types,
bool groupby_and_count_local_partition_by_minor = false);

13 changes: 13 additions & 0 deletions cpp/include/cugraph/detail/utility_wrappers.hpp
@@ -15,6 +15,7 @@
*/
#pragma once

#include <raft/core/device_span.hpp>
#include <raft/core/handle.hpp>
#include <raft/random/rng_state.hpp>

@@ -145,5 +146,17 @@ std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<edge_t>> filter_de
rmm::device_uvector<vertex_t>&& d_vertices,
rmm::device_uvector<edge_t>&& d_out_degs);

/**
* @brief Check if device span is sorted
*
* @tparam data_t type of data in span
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param span The span of data to check
* @return true if sorted, false if not sorted
*/
template <typename data_t>
bool is_sorted(raft::handle_t const& handle, raft::device_span<data_t> span);

} // namespace detail
} // namespace cugraph
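A host-side analogue can illustrate the semantics of the new detail::is_sorted helper declared above. The cugraph version runs the equivalent check over a raft::device_span on the GPU; this plain-C++ sketch (a hypothetical stand-in, not the library code) only shows the expected behavior:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Host-side analogue of detail::is_sorted: true iff the elements are in
// non-decreasing order (duplicates allowed), matching std::is_sorted.
template <typename data_t>
bool is_sorted_host(std::vector<data_t> const& span)
{
  return std::is_sorted(span.begin(), span.end());
}
```

This is the check used to validate, for example, that the label span of label_to_output_comm_rank is sorted in ascending order as the uniform_neighbor_sample documentation requires.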
44 changes: 24 additions & 20 deletions cpp/include/cugraph/graph_functions.hpp
@@ -663,7 +663,10 @@ extract_induced_subgraphs(
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam edge_type_t Type of edge type identifiers. Needs to be an integral type.
 * @tparam weight_t Type of edge weight. Needs to be a floating point type
* @tparam edge_id_t Type of edge id. Needs to be an integral type
* @tparam edge_type_t Type of edge type. Needs to be an integral type, currently only int32_t is
* supported
* @tparam store_transposed Flag indicating whether to use sources (if false) or destinations (if
* true) as major indices in storing edges using a 2D sparse matrix. transposed.
* @tparam multi_gpu Flag indicating whether template instantiation should target single-GPU (false)
@@ -679,42 +682,43 @@ extract_induced_subgraphs(
* compute_gpu_id_from_ext_edge_endpoints_t to every edge should return the local GPU ID for this
* function to work (edges should be pre-shuffled).
* @param edgelist_dsts Vector of edge destination vertex IDs.
* @param edgelist_weights Vector of edge weights.
* @param edgelist_id_type_pairs Vector of edge ID and type pairs.
* @param edgelist_weights Vector of weight values for edges
* @param edgelist_edge_ids Vector of edge_id values for edges
* @param edgelist_edge_types Vector of edge_type values for edges
* @param graph_properties Properties of the graph represented by the input (optional vertex list
* and) edge list.
* @param renumber Flag indicating whether to renumber vertices or not (must be true if @p multi_gpu
* is true).
* @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
* @return Tuple of the generated graph and optional edge_property_t objects storing edge weights
* and edge IDs & types (valid if @p edgelist_weights.has_value() and @p
* edgelist_id_type_pairss.has_value() are true, respectively) and a renumber map (if @p renumber is
* true).
* @return Tuple of the generated graph and optional edge_property_t objects storing the provided
* edge properties and a renumber map (if @p renumber is true).
*/
template <typename vertex_t,
typename edge_t,
typename weight_t,
typename edge_id_t,
typename edge_type_t,
bool store_transposed,
bool multi_gpu>
std::tuple<
graph_t<vertex_t, edge_t, store_transposed, multi_gpu>,
std::optional<
edge_property_t<graph_view_t<vertex_t, edge_t, store_transposed, multi_gpu>, weight_t>>,
std::optional<edge_property_t<graph_view_t<vertex_t, edge_t, store_transposed, multi_gpu>,
thrust::tuple<edge_t, edge_type_t>>>,
std::optional<
edge_property_t<graph_view_t<vertex_t, edge_t, store_transposed, multi_gpu>, edge_id_t>>,
std::optional<
edge_property_t<graph_view_t<vertex_t, edge_t, store_transposed, multi_gpu>, edge_type_t>>,
std::optional<rmm::device_uvector<vertex_t>>>
create_graph_from_edgelist(
raft::handle_t const& handle,
std::optional<rmm::device_uvector<vertex_t>>&& vertices,
rmm::device_uvector<vertex_t>&& edgelist_srcs,
rmm::device_uvector<vertex_t>&& edgelist_dsts,
std::optional<rmm::device_uvector<weight_t>>&& edgelist_weights,
std::optional<std::tuple<rmm::device_uvector<edge_t>, rmm::device_uvector<edge_type_t>>>&&
edgelist_id_type_pairs,
graph_properties_t graph_properties,
bool renumber,
bool do_expensive_check = false);
create_graph_from_edgelist(raft::handle_t const& handle,
std::optional<rmm::device_uvector<vertex_t>>&& vertices,
rmm::device_uvector<vertex_t>&& edgelist_srcs,
rmm::device_uvector<vertex_t>&& edgelist_dsts,
std::optional<rmm::device_uvector<weight_t>>&& edgelist_weights,
std::optional<rmm::device_uvector<edge_id_t>>&& edgelist_edge_ids,
std::optional<rmm::device_uvector<edge_type_t>>&& edgelist_edge_types,
graph_properties_t graph_properties,
bool renumber,
bool do_expensive_check = false);

/**
* @brief Find all 2-hop neighbors in the graph