[epoch1] Enable C++-only builds on a system which does not have nvcc #133
Merged
…ilds) Fix conflicts in epoch1/cuda/ee_mumu/src/Makefile
Fix conflicts in epoch1/cuda/ee_mumu/src/Makefile

This commit was originally in the klas PR, but bring it forward to build on gcc10

Note that the gcc10 throughput is 1.15E6 like that of gcc9 at this point

time ./check.exe -p 2048 256 12
***********************************************************************
NumBlocksPerGrid = 2048
NumThreadsPerBlock = 256
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = STD::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Random number generation = COMMON RANDOM (C++ code)
OMP threads / `nproc --all` = 1 / 4
MatrixElements compiler = gcc (GCC) 10.1.0
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123) = ( 7.579283e+00 ) sec
TotalTime[Rambo+ME] (23) = ( 7.421322e+00 ) sec
TotalTime[RndNumGen] (1) = ( 1.579617e-01 ) sec
TotalTime[Rambo] (2) = ( 1.944444e+00 ) sec
TotalTime[MatrixElems] (3) = ( 5.476877e+00 ) sec
MeanTimeInMatrixElems = ( 4.564064e-01 ) sec
[Min,Max]TimeInMatrixElems = [ 4.562069e-01 , 4.566677e-01 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123) = ( 8.300859e+05 ) sec^-1
EvtsPerSec[Rmb+ME] (23) = ( 8.477541e+05 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 1.148730e+06 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.371988e-02 +- 3.269530e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200888e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
0a ProcInit : 0.000309 sec
0b MemAlloc : 0.072731 sec
0c GenCreat : 0.000390 sec
1b GenRnGen : 0.157962 sec
2a RamboIni : 0.097856 sec
2b RamboFin : 1.846588 sec
3a SigmaKin : 5.476878 sec
4a DumpLoop : 0.101170 sec
8a CompStat : 0.026309 sec
9a GenDestr : 0.000003 sec
9b DumpScrn : 0.012090 sec
9c DumpJson : 0.000002 sec
TOTAL : 7.792288 sec
TOTAL (123) : 7.579284 sec
TOTAL (23) : 7.421322 sec
TOTAL (1) : 0.157962 sec
TOTAL (2) : 1.944444 sec
TOTAL (3) : 5.476878 sec
***********************************************************************
real 0m7.816s
user 0m8.194s
sys 0m0.351s

time ./check.exe -p 2048 256 12
***********************************************************************
NumBlocksPerGrid = 2048
NumThreadsPerBlock = 256
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = STD::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Random number generation = COMMON RANDOM (C++ code)
OMP threads / `nproc --all` = 1 / 4
MatrixElements compiler = gcc (GCC) 9.2.0
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123) = ( 7.569729e+00 ) sec
TotalTime[Rambo+ME] (23) = ( 7.399933e+00 ) sec
TotalTime[RndNumGen] (1) = ( 1.697960e-01 ) sec
TotalTime[Rambo] (2) = ( 1.943793e+00 ) sec
TotalTime[MatrixElems] (3) = ( 5.456140e+00 ) sec
MeanTimeInMatrixElems = ( 4.546783e-01 ) sec
[Min,Max]TimeInMatrixElems = [ 4.542839e-01 , 4.554264e-01 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123) = ( 8.311336e+05 ) sec^-1
EvtsPerSec[Rmb+ME] (23) = ( 8.502044e+05 ) sec^-1
EvtsPerSec[MatrixElems] (3) = ( 1.153096e+06 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.371988e-02 +- 3.269530e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200888e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
0a ProcInit : 0.000370 sec
0b MemAlloc : 0.072272 sec
0c GenCreat : 0.000338 sec
1b GenRnGen : 0.169796 sec
2a RamboIni : 0.095956 sec
2b RamboFin : 1.847837 sec
3a SigmaKin : 5.456140 sec
4a DumpLoop : 0.106385 sec
8a CompStat : 0.026501 sec
9a GenDestr : 0.000006 sec
9b DumpScrn : 0.011202 sec
9c DumpJson : 0.000006 sec
TOTAL : 7.786809 sec
TOTAL (123) : 7.569729 sec
TOTAL (23) : 7.399933 sec
TOTAL (1) : 0.169796 sec
TOTAL (2) : 1.943793 sec
TOTAL (3) : 5.456140 sec
***********************************************************************
real 0m7.810s
user 0m8.158s
sys 0m0.374s

Note also that the runTest.exe runs successfully on gcc10

This is how it builds now (in the C++-only build)

/cvmfs/sft.cern.ch/lcg/releases/gcc/10.1.0-6f386/x86_64-centos7/bin/g++ -O3 -std=c++11 -I. -I../../src -I../../../../../tools -I../../../../../test/googletest/googletest/include/ -I../../../../../test/include/ -Wall -Wshadow -Wextra -fopenmp -DMGONGPU_COMMONRAND_ONHOST -ffast-math -c runTest.cc -o runTest.o

/cvmfs/sft.cern.ch/lcg/releases/gcc/10.1.0-6f386/x86_64-centos7/bin/g++ -o runTest.exe CPPProcess.o runTest.o ../../../../../test/src/MadgraphTest.o -O3 -std=c++11 -I. -I../../src -I../../../../../tools -I../../../../../test/googletest/googletest/include/ -I../../../../../test/include/ -Wall -Wshadow -Wextra -fopenmp -DMGONGPU_COMMONRAND_ONHOST -ffast-math -ldl -pthread -L../../lib -lmodel_sm -L../../../../../test/googletest/build/lib// -lgtest -lgtest_main

Note that -lgomp is added only for the cuda build of the test
I will self-merge. This is also part of klas PR #72, but I want to test gcc10 before merging vectorization: I have an issue in gcc10 to debug.
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Mar 28, 2021
This is the same for epoch2 as PR madgraph5#133 was for epoch1
Enable C++-only builds on a system which does have nvcc
It is enough to "export CUDA_HOME=invalid" and only the C++ will be built
This enables emulating the CI tests
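
As a rough illustration of why an invalid CUDA_HOME leads to a C++-only build, here is a minimal shell sketch. It assumes the build decides by looking for nvcc under $CUDA_HOME/bin; the actual test in the Makefile may differ, and the BUILD_MODE variable is hypothetical, used only for this example.

```shell
#!/bin/sh
# Point CUDA_HOME at a location that does not exist, as suggested above.
export CUDA_HOME=invalid

# Assumed detection logic (hypothetical, not copied from the Makefile):
# the CUDA part is built only if an executable nvcc is found under CUDA_HOME.
if [ -x "${CUDA_HOME}/bin/nvcc" ]; then
  BUILD_MODE="cuda+cpp"
else
  BUILD_MODE="cpp-only"
fi

echo "build mode: ${BUILD_MODE}"
```

In practice it should be enough to export the variable in the shell and then run make in the usual build directory.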