Commit 56f93c6

Update README.md
Author: Chao Liu
1 parent f63a23a commit 56f93c6

README.md

Lines changed: 26 additions & 27 deletions
@@ -18,36 +18,35 @@ https://www.boost.org/doc/libs/1_66_0/more/getting_started/unix-variants.html#ea
 
 
 # Build
-Change target ID in source code, example below is gfx908
-https://github.com/asroy/modular_convolution/blob/aafb5eb18781f1ac9e06a17c3e53d968dd53dcc0/composable_kernel/include/utility/config.amd.hpp.in#L16-L23
-
 Add path of Boost
 ```
 export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
 ```
 
 ```
 mkdir build && cd build
+```
 
-# need to manually set target ID, example below is gfx908
+cmake cmd. Need to Specify target ID, example below is gfx908
+```
 cmake \
--D CMAKE_BUILD_TYPE=Release \
--D DEVICE_BACKEND=AMD \
--D CMAKE_CXX_FLAGS="-O3 --amdgpu-target=gfx908 -mllvm --amdgpu-spill-vgpr-to-agpr=0 -gline-tables-only -save-temps=$CWD" \
--D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
--D CMAKE_PREFIX_PATH=/opt/rocm \
--D CMAKE_VERBOSE_MAKEFILE:BOOL=ON \
+-D CMAKE_BUILD_TYPE=Release \
+-D CMAKE_CXX_FLAGS="-DCK_AMD_GPU_GFX908 -O3 --amdgpu-target=gfx908 -mllvm --amdgpu-spill-vgpr-to-agpr=0 -gline-tables-only -save-temps=$PWD" \
+-D HIP_ONLINE_COMPILER_FLAGS="-DCK_AMD_GPU_GFX908" \
+-D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
+-D CMAKE_PREFIX_PATH=/opt/rocm \
+-D CMAKE_VERBOSE_MAKEFILE:BOOL=ON \
 ..
 ```
 
 Build drivers: \
-``conv_driver_v2`` is (offline compilation) driver for forward convolution, \
-``conv_bwd_data_driver_v2`` is (offline compilation) driver for backward-data convolution \
-``conv_driver_v2_olc`` is (online compilation) driver for forward convolution
+``conv_fwd_driver_offline`` is (offline compilation) driver for forward convolution, \
+``conv_bwd_driver_offline`` is (offline compilation) driver for backward-data convolution \
+``conv_fwd_driver_online`` is (online compilation) driver for forward convolution
 ```
-make -j conv_driver_v2
-make -j conv_bwd_data_driver_v2
-make -j conv_driver_v2_olc
+make -j conv_fwd_driver_offline
+make -j conv_bwd_driver_offline
+make -j conv_fwd_driver_online
 ```
 
 # Run
@@ -60,18 +59,18 @@ Build drivers: \
 * log: 0 = no log; 1 = do log
 * repeat: number of time kernel being launched
 ```
-########################### layout algo verify init log repeat N__ K___ C___ Y X Hi_ Wi__ Strides Dilations LeftPads RightPads
-./conv_driver_v2 0 6 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
-./conv_driver_v2 0 6 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
-./conv_driver_v2 1 9 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
-./conv_driver_v2 1 9 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
-./conv_bwd_data_driver_v2 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
+######################################################## layout algo verify init log repeat N__ K___ C___ Y X Hi_ Wi__ Strides Dilations LeftPads RightPads
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_bwd_driver_offline 1 5 0 0 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
 ```
 
 # Result
 Forward convoltuion, FP16, NCHW
 ```
-./conv_driver_v2 0 6 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
 
 layout: 0
 in: dim 4, lengths {128, 192, 71, 71}, strides {967872, 5041, 71, 1}
@@ -93,7 +92,7 @@ Average time : 1.4155 ms, 103.686 TFlop/s
 
 Forward convoltuion, FP16, NCHW
 ```
-./conv_driver_v2 0 6 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 0 4 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
 
 layout: 0
 in: dim 4, lengths {256, 256, 14, 14}, strides {50176, 196, 14, 1}
@@ -115,7 +114,7 @@ Average time : 2.21357 ms, 106.959 TFlop/s
 
 Forward convolution, FP16, NHWC
 ```
-./conv_driver_v2 1 9 0 3 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 128 256 192 3 3 71 71 2 2 1 1 1 1 1 1
 
 layout: 1
 in: dim 4, lengths {128, 71, 71, 192}, strides {967872, 13632, 192, 1}
@@ -137,7 +136,7 @@ Average time : 1.12014 ms, 131.025 TFlop/s
 
 Forward convolution, FP16, NHWC
 ```
-./conv_driver_v2 1 9 0 3 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_fwd_driver_offline 1 5 0 0 0 1 256 1024 256 3 3 14 14 1 1 1 1 1 1 1 1
 
 layout: 1
 in: dim 4, lengths {256, 14, 14, 256}, strides {50176, 3584, 256, 1}
@@ -159,7 +158,7 @@ Average time : 1.86877 ms, 126.693 TFlop/s
 
 Backward data convolution, FP16, NHWC
 ```
-./conv_bwd_data_driver_v2 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
+./host/driver_offline/conv_bwd_driver_offline 1 1 0 3 0 1 256 256 1024 3 3 14 14 1 1 1 1 1 1 1 1
 
 layout: 1
 in: dim 4, lengths {256, 14, 14, 1024}, strides {200704, 14336, 1024, 1}
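
Reviewer note: the numbers quoted in this README can be sanity-checked from the driver arguments alone. The sketch below is not part of the repository. It assumes the usual convolution output-size formula, a FLOP count of `2 * N * K * C * Y * X * Ho * Wo`, and packed (contiguous) tensor strides. The function names are hypothetical, and the driver's actual bookkeeping may differ, although the logged values in the hunks above are consistent with these assumptions.

```python
# Sanity-check sketch for the README's logged values (helper names are hypothetical).

def conv_output_size(in_size, filter_size, stride, dilation, left_pad, right_pad):
    """Output length along one spatial dimension (standard convolution formula)."""
    effective_filter = dilation * (filter_size - 1) + 1
    return (in_size + left_pad + right_pad - effective_filter) // stride + 1


def conv_flops(n, k, c, y, x, ho, wo):
    """Assumed FLOP count: one multiply and one add per filter tap per output element."""
    return 2 * n * k * c * y * x * ho * wo


def packed_strides(lengths):
    """Strides of a contiguous tensor whose last dimension is the fastest-varying one."""
    strides = [1] * len(lengths)
    for i in range(len(lengths) - 2, -1, -1):
        strides[i] = strides[i + 1] * lengths[i + 1]
    return strides


if __name__ == "__main__":
    # First forward example: N=128 K=256 C=192 Y=X=3 Hi=Wi=71, stride 2, dilation 1, pads 1.
    n, k, c, y, x, hi, wi = 128, 256, 192, 3, 3, 71, 71
    ho = conv_output_size(hi, y, stride=2, dilation=1, left_pad=1, right_pad=1)
    wo = conv_output_size(wi, x, stride=2, dilation=1, left_pad=1, right_pad=1)

    avg_time_ms = 1.4155  # "Average time" taken from the README log
    tflops = conv_flops(n, k, c, y, x, ho, wo) / (avg_time_ms * 1e-3) / 1e12

    print("Ho x Wo:", ho, "x", wo)
    print("TFlop/s: %.3f" % tflops)
    print("NCHW input strides:", packed_strides([n, c, hi, wi]))
    print("NHWC input strides:", packed_strides([n, hi, wi, c]))
```

With the logged average time of 1.4155 ms, the computed throughput comes out close to the 103.686 TFlop/s shown in the log, and the packed strides reproduce the `in:` lines for both the NCHW and NHWC layouts.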
