1 file changed, +5 -2 lines changed

@@ -199,9 +199,12 @@ Always profile with Release mode builds and run without debugging.
   (with array size on the independent axis).
 * You should use CUDA events for timing GPU code. Be sure **not** to include
   any explicit memory operations (`cudaMalloc`, `cudaMemcpy`) in your
-  performance measurements, for comparability.
-* You should use the C++11 `std::chrono` API for timing CPU code. See this
+  performance measurements, for comparability. Note that CUDA events cannot
+  time CPU code.
+* You can use the C++11 `std::chrono` API for timing CPU code. See this
   [Stack Overflow answer](http://stackoverflow.com/a/23000049) for an example.
+  Note that `std::chrono` may not provide high-precision timing. If it does
+  not, you can either use it to time many iterations, or use another method.
 * To guess at what might be happening inside the Thrust implementation (e.g.
   allocation, memory copy), take a look at the Nsight timeline for its
   execution. Your analysis here doesn't have to be detailed, since you aren't
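
The added note about timing many iterations with `std::chrono` can be sketched roughly as follows. This is a minimal illustration, not code from the repository: `average_ms` and `cpu_exclusive_scan` are hypothetical names, the scan is only a stand-in for whatever CPU code is being measured, and `std::chrono::steady_clock` is an assumed choice of clock (the README does not name one).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` `iterations` times and returns the
// average wall-clock time of one call, in milliseconds. Averaging over
// many calls helps when a single call is shorter than the clock's tick.
template <typename F>
double average_ms(F&& work, std::size_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;  // monotonic; assumed choice
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::milli> total = stop - start;
    return total.count() / static_cast<double>(iterations);
}

// Hypothetical stand-in for the CPU code under test: a sequential
// exclusive prefix sum. `out` must be at least as large as `in`.
void cpu_exclusive_scan(const std::vector<int>& in, std::vector<int>& out) {
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;
        running += in[i];
    }
}
```

A call such as `average_ms([&] { cpu_exclusive_scan(in, out); }, 1000)` would then report the mean time per scan. Allocate the input and output vectors before the call, so that allocation is kept out of the measurement, much as the README excludes `cudaMalloc` and `cudaMemcpy` from GPU timings.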