Clean up and rename files in epoch2/eemumu#140

Merged
valassi merged 11 commits into madgraph5:master from valassi:ep2to2ep1 on Mar 28, 2021

valassi commented Mar 28, 2021

Clean up and rename files in epoch2/eemumu. This is the first batch of changes for issue #139

These changes are all in epoch2, essentially:

  • cleanup: remove duplicate rambo.cc and grambo.cu (were identical)
  • cleanup: remove the duplicate gcheck_sa.cu and keep only ../check.cc (the two files differed and only the former was in use, but the latter is more recent and closer to epoch1)
  • use the same naming strategy as in epoch1: keep code in .cc files, define .cu files as symlinks (NB eventually we could get rid of symlinks and use the nvcc option to treat cc as cu, used by @hageboeck in the tests... more generally we should get rid of these gXXX filenames that I introduced)
  • (temporarily?) rename check_sa.cc to check.cc, to ease comparisons with epoch1 (NB I agree that we could eventually go back to check_sa as a name, but for now it just complicates things)
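
As a rough illustration of the naming strategy in the list above (code lives in .cc files, .cu files are only symlinks to them), here is a minimal Python sketch. It works in a temporary directory, and the `link_cuda_names` helper and the exact file-name mapping are assumptions made for the example, not the script used in this PR. The alternative mentioned above would be to drop the symlinks and pass the .cc files to nvcc with `-x cu`.

```python
import os
import tempfile


def link_cuda_names(directory, mapping):
    """Create .cu symlinks that point at the .cc files holding the code.

    `mapping` maps a .cc file name to the .cu name that should link to it.
    This helper is hypothetical; it only illustrates the naming strategy.
    """
    created = []
    for cc_name, cu_name in mapping.items():
        link_path = os.path.join(directory, cu_name)
        if os.path.lexists(link_path):
            os.remove(link_path)  # replace a stale file or link
        # Relative target, so the link stays valid if the directory moves.
        os.symlink(cc_name, link_path)
        created.append(link_path)
    return created


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in source files; the names are illustrative.
    for name in ("check.cc", "rambo.cc"):
        with open(os.path.join(workdir, name), "w") as handle:
            handle.write("// code for " + name + "\n")

    links = link_cuda_names(
        workdir, {"check.cc": "gcheck.cu", "rambo.cc": "grambo.cu"}
    )
    # Record where each link points and confirm it reads the .cc content.
    targets = {os.path.basename(p): os.readlink(p) for p in links}
    with open(os.path.join(workdir, "gcheck.cu")) as handle:
        gcheck_first_line = handle.readline().strip()

print(targets)
```

With this layout there is a single copy of each source file, so the C++ and CUDA builds cannot drift apart the way the duplicated files did.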

valassi added 7 commits March 28, 2021 11:50
It is broken as the file moved from epoch1 to the common test directory.
It is useless because runTest.exe no longer looks for it in the local directory
(it looks for it directly in the common test directory).
This is the same strategy used in epoch1.
Currently P1_Sigma_sm_epem_mupmum in epoch2 uses the opposite strategy,
all .cc files are symlinks to .cu files: will reverse in the next commit.
Implement general strategy: put code in .cc files, use .cu files as symlinks.
Implement general strategy: put code in .cc files, use .cu files as symlinks.
This is consistent with what is used in epoch1.
Implement general strategy: put code in .cc files, use .cu files as symlinks
(and correspondingly, use the cc names of headers for .h files).
…by the latter.

The latter is closer to the version in epoch1.

This is the current epoch2 performance:

[avalassi@itscrd70 gcc9.2/cvmfs] ~/GPU2020/madgraph4gpuTer/epoch2/cuda/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum> time ./check.exe -p 2048 256 12
***********************************************************************
NumBlocksPerGrid           = 2048
NumThreadsPerBlock         = 256
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = STD::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Random number generation   = CURAND (C++ code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 9.799493e+00                 )  sec
TotalTime[Rambo+ME]    (23)= ( 9.447625e+00                 )  sec
TotalTime[RndNumGen]    (1)= ( 3.518679e-01                 )  sec
TotalTime[Rambo]        (2)= ( 2.036167e+00                 )  sec
TotalTime[MatrixElems]  (3)= ( 7.411458e+00                 )  sec
MeanTimeInMatrixElems      = ( 6.176215e-01                 )  sec
[Min,Max]TimeInMatrixElems = [ 6.168755e-01 ,  6.209477e-01 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 6.420185e+05                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 6.659299e+05                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 8.488823e+05                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.372152e-02 +- 3.269516e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071581e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200854e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
0a ProcInit :     0.000389 sec
0b MemAlloc :     0.000045 sec
0c GenCreat :     0.000978 sec
1a GenSeed  :     0.000030 sec
1b GenRnGen :     0.351838 sec
2a RamboIni :     0.150477 sec
2b RamboFin :     1.885690 sec
3a SigmaKin :     7.411459 sec
4a DumpLoop :     0.084440 sec
8a CompStat :     0.045208 sec
9a GenDestr :     0.000115 sec
9b DumpScrn :     0.000220 sec
9c DumpJson :     0.000001 sec
TOTAL       :     9.930889 sec
TOTAL (123) :     9.799494 sec
TOTAL  (23) :     9.447626 sec
TOTAL   (1) :     0.351868 sec
TOTAL   (2) :     2.036167 sec
TOTAL   (3) :     7.411459 sec
***********************************************************************
real    0m9.961s
user    0m9.833s
sys     0m0.126s

[avalassi@itscrd70 gcc9.2/cvmfs] ~/GPU2020/madgraph4gpuTer/epoch2/cuda/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum> time ./gcheck.exe -p 2048 256 12
***********************************************************************
NumBlocksPerGrid           = 2048
NumThreadsPerBlock         = 256
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = THRUST::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Wavefunction GPU memory    = LOCAL
Random number generation   = CURAND DEVICE (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 1.212344e-01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 1.136280e-01                 )  sec
TotalTime[RndNumGen]    (1)= ( 7.606422e-03                 )  sec
TotalTime[Rambo]        (2)= ( 1.033857e-01                 )  sec
TotalTime[MatrixElems]  (3)= ( 1.024222e-02                 )  sec
MeanTimeInMatrixElems      = ( 8.535181e-04                 )  sec
[Min,Max]TimeInMatrixElems = [ 8.233450e-04 ,  8.671930e-04 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 5.189499e+07                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 5.536891e+07                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 6.142670e+08                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.372152e-02 +- 3.269516e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071581e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200854e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
00 CudaFree :     1.218477 sec
0a ProcInit :     0.000557 sec
0b MemAlloc :     0.037328 sec
0c GenCreat :     0.010041 sec
0d SGoodHel :     0.001599 sec
1a GenSeed  :     0.000020 sec
1b GenRnGen :     0.007587 sec
2a RamboIni :     0.000105 sec
2b RamboFin :     0.000047 sec
2c CpDTHwgt :     0.008336 sec
2d CpDTHmom :     0.094898 sec
3a SigmaKin :     0.000092 sec
3b CpDTHmes :     0.010150 sec
4a DumpLoop :     0.090539 sec
8a CompStat :     0.045862 sec
9a GenDestr :     0.000092 sec
9b DumpScrn :     0.000230 sec
9c DumpJson :     0.000002 sec
TOTAL       :     1.525961 sec
TOTAL (123) :     0.121234 sec
TOTAL  (23) :     0.113628 sec
TOTAL   (1) :     0.007606 sec
TOTAL   (2) :     0.103386 sec
TOTAL   (3) :     0.010242 sec
***********************************************************************
real    0m1.844s
user    0m0.335s
sys     0m0.866s
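
The throughput figures in the two logs above can be cross-checked from the command-line arguments (`-p 2048 256 12`) and the reported matrix-element times. The short Python sketch below does that arithmetic; the function and variable names are made up for the example, and the only inputs are numbers copied from the logs.

```python
def total_events(blocks_per_grid, threads_per_block, iterations):
    """Events processed: one event per GPU thread per iteration."""
    return blocks_per_grid * threads_per_block * iterations


def events_per_second(n_events, seconds):
    """Throughput as reported in the EvtsPerSec lines."""
    return n_events / seconds


# Arguments of "-p 2048 256 12" in both runs.
N_EVENTS = total_events(2048, 256, 12)

# TotalTime[MatrixElems] (3) copied from the two logs, in seconds.
ME_SECONDS = {
    "check.exe (C++)": 7.411458e+00,
    "gcheck.exe (CUDA)": 1.024222e-02,
}

ME_RATES = {
    name: events_per_second(N_EVENTS, seconds)
    for name, seconds in ME_SECONDS.items()
}
ME_SPEEDUP = ME_SECONDS["check.exe (C++)"] / ME_SECONDS["gcheck.exe (CUDA)"]

print("events:", N_EVENTS)
for name, rate in ME_RATES.items():
    print(f"{name}: {rate:.6e} events/sec")
print(f"matrix-element speedup: {ME_SPEEDUP:.0f}x")
```

This only compares the matrix-element step (3). The totals for steps (123) in the CUDA log also include the device-to-host copies (CpDTHwgt, CpDTHmom, CpDTHmes), so the end-to-end ratio is smaller.
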
Eventually this change may be reverted (and _sa added also in epoch1).

valassi commented Mar 28, 2021

OK, all issues fixed. Self-merging.
