|
38 | 38 | "metadata": {}, |
39 | 39 | "source": [ |
40 | 40 | "- Understand advantages of using Subgroups in SYCL\n", |
41 | | - "- Take advantage of Subgroup collectives in ND-Range kernel implementation\n", |
| 41 | + "- Take advantage of Subgroup algorithms for performance and productivity\n", |
42 | 42 | "- Use Subgroup Shuffle operations to avoid explicit memory operations" |
43 | 43 | ] |
44 | 44 | }, |
|
158 | 158 | "cell_type": "markdown", |
159 | 159 | "metadata": {}, |
160 | 160 | "source": [ |
161 | | - "Once you have the subgroup handle, you can query for more information about the subgroup, do shuffle operations or use collective functions." |
| 161 | + "Once you have the subgroup handle, you can query for more information about the subgroup, perform shuffle operations, or use group algorithms." |
162 | 162 | ] |
163 | 163 | }, |
164 | 164 | { |
|
280 | 280 | "cell_type": "markdown", |
281 | 281 | "metadata": {}, |
282 | 282 | "source": [ |
283 | | - "For tuning applications for performance, sub-group size may have to be set a specific value. For example Intel(R) GPU supports sub-groups sizes of 8, 16 and 32; by default the compiler implimentation will pick optimal sub-group size, but it can also be forced to use a specific value.\n", |
| 283 | + "For tuning applications for performance, the sub-group size may have to be set to a specific value. For example, Intel(R) GPUs support sub-group sizes of 8, 16 and 32; by default the compiler implementation will pick the optimal sub-group size, but it can also be forced to use a specific value.\n", |
284 | 284 | "\n", |
285 | 285 | "The supported sub-group sizes for a GPU can be queried from device information as shown below:\n", |
286 | 286 | "\n", |
|
405 | 405 | "Providing these implementations as library functions instead __increases developer productivity__ and gives implementations the ability to __generate highly optimized \n", |
406 | 406 | "code__ for individual target devices.\n", |
407 | 407 | "\n", |
408 | | - "Below are some of the group algorithms available for sub-groups, they include useful fuctionalities to perform shuffles, reductions, scans and votes:\n", |
| 408 | + "Below are some of the group algorithms available for sub-groups; they include useful functionality to perform shuffles, reductions, scans and votes:\n", |
409 | 409 | "\n", |
410 | 410 | "- select_from_group\n", |
411 | 411 | "- shift_group_left\n", |
|
470 | 470 | "cell_type": "markdown", |
471 | 471 | "metadata": {}, |
472 | 472 | "source": [ |
473 | | - "The code below uses subgroup shuffle to swap items in a subgroup. You can try other shuffle operations or change the fixed constant in the shuffle function to express some common commuinication patterns using `permute_group_by_xor`.\n", |
| 473 | + "The code below uses subgroup shuffle to swap items in a subgroup. You can try other shuffle operations or change the fixed constant in the shuffle function to express some common communication patterns using `permute_group_by_xor`.\n", |
474 | 474 | "\n", |
475 | 475 | "The SYCL code below demonstrates sub-group shuffle operations: it shows how `permute_group_by_xor` can be used to swap adjacent elements in a sub-group, and you can change the code to reverse the order of elements in the sub-group by using a different mask.\n", |
476 | 476 | "\n", |
|
561 | 561 | " h.parallel_for(nd_range<1>(N,B), [=](nd_item<1> item){\n", |
562 | 562 | " auto sg = item.get_sub_group();\n", |
563 | 563 | " auto i = item.get_global_id(0);\n", |
564 | | - " /* Reduction Collective on Sub-group */\n", |
| 564 | + " /* Reduction algorithm on Sub-group */\n", |
565 | 565 | " int result = reduce_over_group(sg, data[i], plus<>());\n", |
566 | 566 | " //int result = reduce_over_group(sg, data[i], maximum<>());\n", |
567 | 567 | " //int result = reduce_over_group(sg, data[i], minimum<>());\n", |
568 | 568 | " });\n", |
569 | 569 | "\n", |
570 | 570 | "```\n", |
571 | 571 | "\n", |
572 | | - "The SYCL code below demonstrates sub-group collectives: Inspect code, you can change the operator \"_plus_\" to \"_maximum_\" or \"_minimum_\" and check output:\n", |
| 572 | + "The SYCL code below demonstrates a sub-group algorithm. Inspect the code; you can change the operator \"_plus_\" to \"_maximum_\" or \"_minimum_\" and check the output:\n", |
573 | 573 | "\n", |
574 | 574 | "1. Inspect the code cell below and click run ▶ to save the code to file.\n", |
575 | 575 | "\n", |
|
608 | 608 | " auto sg = item.get_sub_group();\n", |
609 | 609 | " auto i = item.get_global_id(0);\n", |
610 | 610 | "\n", |
611 | | - " //# Add all elements in sub_group using sub_group collectives\n", |
| 611 | + " //# Add all elements in sub_group using sub_group algorithm\n", |
612 | 612 | " int result = reduce_over_group(sg, data[i], plus<>());\n", |
613 | 613 | "\n", |
614 | 614 | " //# write sub_group sum in first location for each sub_group\n", |
|
655 | 655 | "cell_type": "markdown", |
656 | 656 | "metadata": {}, |
657 | 657 | "source": [ |
658 | | - "The code below uses subgroup collectives `group_broadcast` function, this enables one work-item in a group to share the value of a variable with all other work-items in the group.\n", |
| 658 | + "The code below uses the sub-group algorithm `group_broadcast`, which enables one work-item in a group to share the value of a variable with all other work-items in the group.\n", |
659 | 659 | "\n", |
660 | 660 | "The SYCL code below demonstrates the sub-group broadcast function. Inspect the code; no modifications are necessary:\n", |
661 | 661 | "\n", |
|
742 | 742 | "“vote” functions) enable work-items to compare the result of a Boolean\n", |
743 | 743 | "condition across their group.\n", |
744 | 744 | "\n", |
745 | | - "The SYCL code below demonstrates sub-group collectives `any_of_group`, `all_of_group` and `none_of_group` functions: Inspect code, there are no modifications necessary:\n", |
| 745 | + "The SYCL code below demonstrates the sub-group algorithms `any_of_group`, `all_of_group` and `none_of_group`. Inspect the code; no modifications are necessary:\n", |
746 | 746 | "\n", |
747 | 747 | "1. Inspect the code cell below and click run ▶ to save the code to file.\n", |
748 | 748 | "\n", |
|
839 | 839 | "cell_type": "markdown", |
840 | 840 | "metadata": {}, |
841 | 841 | "source": [ |
842 | | - "Complete the coding excercise below using Sub-Group concepts:\n", |
| 842 | + "Complete the coding exercise below using Sub-Group concepts:\n", |
843 | 843 | "- The code has an array `data` of size `N=1024` elements initialized\n", |
844 | 844 | "- We will offload a kernel task to compute the sum of all items in each sub-group and save it in a new array `sg_data`\n", |
845 | | - "- We will set a the sub-group size to `S=32`, which will make the `sg_data` array of size `N/S`\n", |
| 845 | + "- We will set the sub-group size to `S=32`, which will make the `sg_data` array of size `N/S`\n", |
846 | 846 | "- Create USM shared allocation for `data` and `sg_data`\n", |
847 | 847 | "- Create an nd-range kernel task with a fixed sub-group size of `S`\n", |
848 | 848 | "- In the kernel task, compute the sub-group sum using `reduce_over_group` function\n", |
|