Some MTMG code cleanup and small optimizations #3894
rapids-bot[bot] merged 5 commits into rapidsai:branch-23.12
Conversation
```cpp
std::vector<std::tuple<vertex_t*, vertex_t const*, size_t>> dst_copies;
std::vector<std::tuple<weight_t*, weight_t const*, size_t>> wgt_copies;
std::vector<std::tuple<edge_t*, edge_t const*, size_t>> edge_id_copies;
std::vector<std::tuple<edge_type_t*, edge_type_t const*, size_t>> edge_type_copies;
```
Should we maintain 5 variables, or would one variable storing (input_start_offset, output_start_offset, size) triplets be sufficient?
Good suggestion, I'll look into that for next push.
```cpp
while (count > 0) {
  size_t copy_count = std::min(count, (src_.back().size() - current_pos_));

  src_copies.push_back(
    std::make_tuple(src_.back().begin() + current_pos_, src.begin() + pos, copy_count));
  dst_copies.push_back(
    std::make_tuple(dst_.back().begin() + current_pos_, dst.begin() + pos, copy_count));
  if (wgt)
    wgt_copies.push_back(
      std::make_tuple(wgt_->back().begin() + current_pos_, wgt->begin() + pos, copy_count));
  if (edge_id)
    edge_id_copies.push_back(std::make_tuple(
      edge_id_->back().begin() + current_pos_, edge_id->begin() + pos, copy_count));
  if (edge_type)
    edge_type_copies.push_back(std::make_tuple(
      edge_type_->back().begin() + current_pos_, edge_type->begin() + pos, copy_count));

  count -= copy_count;
  pos += copy_count;
  current_pos_ += copy_count;
}
```
What happens if count = 1000, src_.back().size() = 100, and current_pos_ = 0?
At the end of the first iteration, copy_count = 100, count = 900, pos = 100, and current_pos_ = 100. From the second iteration on, copy_count = 0, so this loop never terminates. Or am I missing something?
Shouldn't we allocate additional buffers and reset current_pos_ to 0 for this loop to finish?
Yes... not sure how I missed that. The original code had that logic; I imagine I accidentally deleted it. I'll add it back in.
```diff
  });

- handle.raft_handle().sync_stream();
+ handle.raft_handle().sync_stream(handle.get_stream());
```
If we add get_stream() to mtmg::handle, what about adding sync_stream to mtmg::handle as well?
```cpp
raft::handle_t tmp_handle;

size_t n_streams{16};
```
I needed that many for one of the tests I ran :-)
I'll make that a parameter. Any suggestion on a good default?
Maybe # of GPUs? (assuming that 1 stream per thread and # threads == # GPUs)
Each GPU will have its own pool of streams. The pool so far is used by different thread ranks copying data to the GPU independently.
I've added it as a parameter.
cpp/src/mtmg/vertex_result.cu (Outdated)
```diff
  });

- thrust::gather(handle.raft_handle().get_thrust_policy(),
+ thrust::gather(rmm::exec_policy(handle.get_stream()),
```
What about adding (mtmg::)handle.get_thrust_policy()?
seunghwak left a comment:
Looks good to me (besides the reason behind the 4 in the code; some documentation will be helpful).
cpp/tests/mtmg/threaded_test.cu (Outdated)
```diff
  auto instance_manager = resource_manager.create_instance_manager(
-   resource_manager.registered_ranks(), instance_manager_id);
+   resource_manager.registered_ranks(), instance_manager_id, 4);
```
Made this a constant, added a comment describing why it's 4 in the latest push.
/merge
Added some missing documentation.
A couple of optimizations:
Reworked the append logic to keep the mutex lock held only long enough to compute what needs to be copied and where.