George Bisbas edited this page Mar 13, 2025 · 1 revision
This document is a guide to profiling hotspots in the Devito compilation pipeline. The TTI example is used as a reference.
To expose the compiler's hotspots, the time spent executing the generated C code should be as close to zero as possible.
In other words, the share of the profile attributed to op.apply() should drop to nearly zero.
To achieve this, run only a few time steps and shrink the problem size as much as possible.
To put more stress on the compiler, it probably helps to increase the space order.
DEVITO_LOGGING=DEBUG DEVITO_LANGUAGE=openmp python -m cProfile -s tottime -o profile_results.prof examples/seismic/tti/tti_example.py -so 16 -d 10 10 10 --tn 5 | head -20
gprof2dot -f pstats profile_results.prof -o profile_results.dot
dot -Tpdf profile_results.dot -o profile_results.pdf
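Besides the call graph, the saved `profile_results.prof` can be inspected as a sorted table. According to the Python documentation, the `-s` option of `cProfile` only applies when `-o` is not supplied, so a table sorted by `tottime` has to be produced from the saved file. The following is a minimal sketch using the standard `pstats` module; the helper name `print_hotspots` is hypothetical and not part of Devito.

```python
import os
import pstats


def print_hotspots(prof_path, sort_key="tottime", limit=20):
    """Print the `limit` most expensive entries of a cProfile dump."""
    stats = pstats.Stats(prof_path)
    # strip_dirs() shortens file paths so the table is easier to read
    stats.strip_dirs().sort_stats(sort_key).print_stats(limit)
    return stats


# Only runs when the profile written by the cProfile command is present
if __name__ == "__main__" and os.path.exists("profile_results.prof"):
    print_hotspots("profile_results.prof")
```

Passing `sort_key="cumtime"` instead ranks functions by the time spent in them including their callees, which is often closer to what the call graph shows.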
# -n: eliminates nodes (functions) below the given percentage threshold.
# This reduces the graph size by excluding less significant functions.
gprof2dot -f pstats -n 0.5 profile_results.prof -o profile_results.dot
# Usually 5%, 7%, or 8% work well enough.
gprof2dot -f pstats -n 5 profile_results.prof -o profile_results.dot
gprof2dot -f pstats -n 7.5 profile_results.prof -o profile_results.dot
gprof2dot -f pstats -n 10 profile_results.prof -o profile_results.dot
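To check whether the share of `op.apply()` is indeed close to zero, the saved profile can be queried directly. The sketch below sums the cumulative time of every profiled function called `apply` whose file path contains a given hint, and divides it by the total profiled time. The helper `cumulative_share` and the default hint `"operator"` are assumptions made for illustration; adjust the hint to wherever `apply` is defined in your Devito checkout.

```python
import pstats


def cumulative_share(prof_path, func_name="apply", file_hint="operator"):
    """Return the cumulative time of matching functions as a fraction of total time."""
    stats = pstats.Stats(prof_path)
    total = stats.total_tt
    if total == 0:
        return 0.0
    matched = 0.0
    # stats.stats maps (filename, lineno, funcname) -> (cc, nc, tt, ct, callers)
    for (filename, _lineno, name), (_cc, _nc, _tt, ct, _callers) in stats.stats.items():
        # Assumption: the target is identified by its name and a path substring.
        # If several matches call each other, their times are counted more than once.
        if name == func_name and file_hint in filename:
            matched += ct
    return matched / total
```

For example, `cumulative_share("profile_results.prof")` returning a value near zero would suggest that the profile is dominated by the compilation pipeline rather than by operator execution.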