Improve MG PageRank scalability #2038
…utation (currently with the temporary mechanism to support stream priorities, eventually, rmm should be updated to support this)
…e and also seems like having an issue with 2^31 or more elements)
…rtitions in parallel
…avoid malloc failure due to fragmentation with the pool allocator)
rerun tests
Codecov Report
@@ Coverage Diff @@
## branch-22.04 #2038 +/- ##
===============================================
Coverage ? 73.63%
===============================================
Files ? 154
Lines ? 10327
Branches ? 0
===============================================
Hits ? 7604
Misses ? 2723
Partials ? 0
major_tmp_buffers.reserve(num_concurrent_loops);
for (size_t i = 0; i < num_concurrent_loops; ++i) {
  size_t max_size{0};
  for (size_t j = i; j < graph_view.get_number_of_local_adj_matrix_partitions();
Why not do a max_element() here (lines 581-584)? It's more readable and possibly faster. I understand you have a stride, but you can pass a sequence and calculate the stride inside a lambda comparer, etc.
How can I pass a sequence?
Can we do this without using thrust/boost (i.e. without using counting_iterator/transform_iterator)?
We can create an additional vector storing a sequence, but then, I am not sure the code will be more readable.
Could you show me the code?
You can use counting_iterator/transform_iterator on std::vector<> with the thrust::host policy and, say, thrust::reduce() or thrust::max_element(), but the counter arithmetic can be messy. You're right, the resulting code would probably NOT be any more readable.
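For reference, the two options discussed in this thread can be compared in a minimal standard-library sketch that uses neither thrust nor boost. Everything here is illustrative rather than the PR's actual code: `sizes` is a hypothetical stand-in for whatever per-partition size the truncated inner loop reads, the function names are made up, and the stride is assumed to be `num_concurrent_loops` (the snippet above is cut off before the loop increment). `stride` is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Option 1: plain strided loop, in the style of the snippet under review.
// `sizes` is a hypothetical vector of per-partition sizes.
std::size_t strided_max_loop(std::vector<std::size_t> const& sizes,
                             std::size_t start,
                             std::size_t stride)
{
  std::size_t max_size{0};
  for (std::size_t j = start; j < sizes.size(); j += stride) {
    max_size = std::max(max_size, sizes[j]);
  }
  return max_size;
}

// Option 2: store the strided index sequence in an extra vector and call
// std::max_element with a comparator that looks up the size for each index.
std::size_t strided_max_element(std::vector<std::size_t> const& sizes,
                                std::size_t start,
                                std::size_t stride)
{
  if (start >= sizes.size()) { return 0; }  // no elements on this stride

  // Number of indices in {start, start + stride, ...} below sizes.size().
  std::vector<std::size_t> indices((sizes.size() - start + stride - 1) / stride);
  for (std::size_t k = 0; k < indices.size(); ++k) {
    indices[k] = start + k * stride;
  }

  auto it = std::max_element(
    indices.begin(), indices.end(),
    [&sizes](std::size_t lhs, std::size_t rhs) { return sizes[lhs] < sizes[rhs]; });
  return sizes[*it];
}
```

The second version needs an extra allocation, an index computation, and an empty-range check just to call `std::max_element`, which is consistent with the conclusion reached above that the plain loop is at least as readable.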
@gpucibot merge
Improve MG PageRank performance & scalability in multi-node many GPU systems