Improvements over PR #45. More consistent #ifdefs and hstRnarray usage. #48

Merged
valassi merged 6 commits into madgraph5:master from valassi:master, Nov 10, 2020
@valassi commented on Nov 10, 2020

No description provided.

…rray usage.

./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = THRUST::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Wavefunction GPU memory    = LOCAL
Random number generation   = CURAND DEVICE (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 8.692473e-02                 )  sec
TotalTime[Rambo+ME]    (23)= ( 7.913799e-02                 )  sec
TotalTime[RndNumGen]    (1)= ( 7.786739e-03                 )  sec
TotalTime[Rambo]        (2)= ( 7.001756e-02                 )  sec
TotalTime[MatrixElems]  (3)= ( 9.120431e-03                 )  sec
MeanTimeInMatrixElems      = ( 7.600359e-04                 )  sec
[Min,Max]TimeInMatrixElems = [ 7.568000e-04 ,  7.671370e-04 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 7.237821e+07                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 7.949982e+07                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 6.898200e+08                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.372152e-02 +- 3.269516e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200854e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
00 CudaFree :     0.877303 sec
0a ProcInit :     0.000488 sec
0b MemAlloc :     0.061239 sec
0c GenCreat :     0.009764 sec
0d SGoodHel :     0.001760 sec
1a GenSeed  :     0.000101 sec
1b GenRnGen :     0.007686 sec
2a RamboIni :     0.000178 sec
2b RamboFin :     0.000138 sec
2c CpDTHwgt :     0.006028 sec
2d CpDTHmom :     0.063673 sec
3a SigmaKin :     0.000165 sec
3b CpDTHmes :     0.008956 sec
4a DumpLoop :     0.023916 sec
8a CompStat :     0.045670 sec
9a GenDestr :     0.000054 sec
9b MemFree  :     0.012793 sec
9c CudReset :     0.049363 sec
9d DumpScrn :     0.000221 sec
9e DumpJson :     0.000008 sec
TOTAL       :     1.169505 sec
TOTAL (123) :     0.086925 sec
TOTAL  (23) :     0.079138 sec
TOTAL   (1) :     0.007787 sec
TOTAL   (2) :     0.070018 sec
TOTAL   (3) :     0.009120 sec
***********************************************************************

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = STD::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Random number generation   = CURAND (C++ code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 1.787162e+01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 1.753950e+01                 )  sec
TotalTime[RndNumGen]    (1)= ( 3.321243e-01                 )  sec
TotalTime[Rambo]        (2)= ( 1.204920e+00                 )  sec
TotalTime[MatrixElems]  (3)= ( 1.633458e+01                 )  sec
MeanTimeInMatrixElems      = ( 1.361215e+00                 )  sec
[Min,Max]TimeInMatrixElems = [ 1.360797e+00 ,  1.361576e+00 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.520361e+05                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 3.587021e+05                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.851618e+05                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.372152e-02 +- 3.269516e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200854e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
0a ProcInit :     0.000319 sec
0b MemAlloc :     0.050883 sec
0c GenCreat :     0.000824 sec
1a GenSeed  :     0.000105 sec
1b GenRnGen :     0.332019 sec
2a RamboIni :     0.083518 sec
2b RamboFin :     1.121402 sec
3a SigmaKin :    16.334579 sec
4a DumpLoop :     0.020090 sec
8a CompStat :     0.037017 sec
9a GenDestr :     0.000081 sec
9b MemFree  :     0.001111 sec
9d DumpScrn :     0.000187 sec
9e DumpJson :     0.000007 sec
TOTAL       :    17.982145 sec
TOTAL (123) :    17.871624 sec
TOTAL  (23) :    17.539499 sec
TOTAL   (1) :     0.332124 sec
TOTAL   (2) :     1.204920 sec
TOTAL   (3) :    16.334579 sec
***********************************************************************
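
The derived figures in the two CURAND logs above can be cross-checked against the raw ones. The following is only a sketch: it assumes (inferred from the numbers, not stated in this PR) that TotalEventsComputed is NumBlocksPerGrid x NumThreadsPerBlock x NumIterations and that each EvtsPerSec value is that event count divided by the matching TotalTime. All variable names are illustrative.

```python
# Values copied from the CURAND gcheck.exe and check.exe logs above.
blocks, threads, iterations = 16384, 32, 12

# Assumed definition of TotalEventsComputed (inferred, not documented here).
total_events = blocks * threads * iterations

me_time_gpu = 9.120431e-03  # TotalTime[MatrixElems] (3), gcheck.exe
me_time_cpu = 1.633458e+01  # TotalTime[MatrixElems] (3), check.exe

# Assumed definition of EvtsPerSec[MatrixElems] (3): events / time.
throughput_gpu = total_events / me_time_gpu
throughput_cpu = total_events / me_time_cpu

print(f"events         = {total_events}")
print(f"GPU ME evts/s  = {throughput_gpu:.6e}")
print(f"CPU ME evts/s  = {throughput_cpu:.6e}")
print(f"GPU/CPU ratio  = {throughput_gpu / throughput_cpu:.1f}")
```

Under these assumptions the computed values agree with the logged TotalEventsComputed and EvtsPerSec[MatrixElems] lines to the printed precision. The ratio covers the matrix-element step only and excludes random number generation, Rambo and device-to-host copies.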
Also fix a rnGen destructor.

./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=5505024)
Complex type               = THRUST::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Wavefunction GPU memory    = LOCAL
Random number generation   = COMMON RANDOM HOST (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 2.160973e-01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 8.944656e-02                 )  sec
TotalTime[RndNumGen]    (1)= ( 1.266507e-01                 )  sec
TotalTime[Rambo]        (2)= ( 6.896938e-02                 )  sec
TotalTime[MatrixElems]  (3)= ( 2.047718e-02                 )  sec
MeanTimeInMatrixElems      = ( 1.706432e-03                 )  sec
[Min,Max]TimeInMatrixElems = [ 8.084830e-04 ,  1.144112e-02 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 2.911400e+07                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 7.033759e+07                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.072422e+08                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 786432
MeanMatrixElemValue        = ( 1.372249e-02 +- 9.248484e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374920e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.201648e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=5505024)
Complex type               = STD::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Random number generation   = COMMON RANDOM HOST (C++ code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 2.000985e+01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 1.994450e+01                 )  sec
TotalTime[RndNumGen]    (1)= ( 6.534211e-02                 )  sec
TotalTime[Rambo]        (2)= ( 6.525773e-01                 )  sec
TotalTime[MatrixElems]  (3)= ( 1.929193e+01                 )  sec
MeanTimeInMatrixElems      = ( 1.607661e+00                 )  sec
[Min,Max]TimeInMatrixElems = [ 1.606176e+00 ,  1.616634e+00 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.144180e+05                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 3.154481e+05                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.261186e+05                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 786432
MeanMatrixElemValue        = ( 1.372249e-02 +- 9.248484e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374920e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.201648e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
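
Both COMMON RANDOM HOST logs above report a non-zero NaN count. A quick accounting check, sketched below with values copied from those logs, shows that the `nan=` count and `NumMatrixElements(notNan)` add up to the total number of events. The names are illustrative, and the sketch says nothing about why the NaNs occur.

```python
# Values copied from the two COMMON RANDOM HOST logs above.
total_events = 16384 * 32 * 12  # matches TotalEventsComputed
nan_count = 5505024             # "nan=" on the FP precision line
not_nan_count = 786432          # NumMatrixElements(notNan)

# Every event is counted either as NaN or as not-NaN.
assert nan_count + not_nan_count == total_events

valid_fraction = not_nan_count / total_events
print(valid_fraction)  # 0.125
```

Only one event in eight yields a valid matrix element in these two runs, so their mean and standard deviation are computed over a smaller sample than in the runs that report `nan=0`.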
./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = THRUST::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Wavefunction GPU memory    = LOCAL
Random number generation   = COMMON RANDOM HOST (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 2.852055e-01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 8.309519e-02                 )  sec
TotalTime[RndNumGen]    (1)= ( 2.021103e-01                 )  sec
TotalTime[Rambo]        (2)= ( 7.403231e-02                 )  sec
TotalTime[MatrixElems]  (3)= ( 9.062881e-03                 )  sec
MeanTimeInMatrixElems      = ( 7.552401e-04                 )  sec
[Min,Max]TimeInMatrixElems = [ 7.227170e-04 ,  7.788700e-04 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 2.205938e+07                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 7.571384e+07                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 6.942004e+08                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.371988e-02 +- 3.269530e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200888e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
00 CudaFree :     0.883399 sec
0a ProcInit :     0.000441 sec
0b MemAlloc :     0.075924 sec
0c GenCreat :     0.002128 sec
0d SGoodHel :     0.001766 sec
1b GenRnGen :     0.148745 sec
1c CpHTDrnd :     0.053365 sec
2a RamboIni :     0.000316 sec
2b RamboFin :     0.000199 sec
2c CpDTHwgt :     0.005506 sec
2d CpDTHmom :     0.068012 sec
3a SigmaKin :     0.000203 sec
3b CpDTHmes :     0.008860 sec
4a DumpLoop :     0.036731 sec
8a CompStat :     0.045455 sec
9a GenDestr :     0.000011 sec
9b MemFree  :     0.017908 sec
9c CudReset :     0.048424 sec
9d DumpScrn :     0.000239 sec
9e DumpJson :     0.000007 sec
TOTAL       :     1.397640 sec
TOTAL (123) :     0.285205 sec
TOTAL  (23) :     0.083095 sec
TOTAL   (1) :     0.202110 sec
TOTAL   (2) :     0.074032 sec
TOTAL   (3) :     0.009063 sec
***********************************************************************
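
The grouped totals in the breakdown above appear to be sums of the individual steps that share a leading digit, so `TOTAL (1)` would include the host-to-device copy `1c CpHTDrnd`. This is an inference from the numbers, not something stated in the PR. The sketch below, with illustrative names, reproduces the three group totals from the per-step timings of this gcheck.exe run.

```python
from collections import defaultdict

# Per-step timings (seconds) copied from the gcheck.exe breakdown above.
steps = {
    "1b GenRnGen": 0.148745,
    "1c CpHTDrnd": 0.053365,
    "2a RamboIni": 0.000316,
    "2b RamboFin": 0.000199,
    "2c CpDTHwgt": 0.005506,
    "2d CpDTHmom": 0.068012,
    "3a SigmaKin": 0.000203,
    "3b CpDTHmes": 0.008860,
}

# Assumption: a step belongs to the group named by its first character.
groups = defaultdict(float)
for label, seconds in steps.items():
    groups[label[0]] += seconds

# Group totals as printed in the log.
reported = {"1": 0.202110, "2": 0.074032, "3": 0.009063}

for key, logged in reported.items():
    print(f"TOTAL ({key}): summed={groups[key]:.6f} logged={logged:.6f}")
```

The sums agree with the logged totals to within the rounding of the six-digit step values. Read this way, most of group (2) in this run is the `2d CpDTHmom` device-to-host copy of the momenta, and most of group (3) is the `3b CpDTHmes` copy of the matrix elements.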

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = STD::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Random number generation   = COMMON RANDOM HOST (C++ code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 1.771678e+01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 1.757863e+01                 )  sec
TotalTime[RndNumGen]    (1)= ( 1.381516e-01                 )  sec
TotalTime[Rambo]        (2)= ( 1.219389e+00                 )  sec
TotalTime[MatrixElems]  (3)= ( 1.635924e+01                 )  sec
MeanTimeInMatrixElems      = ( 1.363270e+00                 )  sec
[Min,Max]TimeInMatrixElems = [ 1.363001e+00 ,  1.363693e+00 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.551128e+05                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 3.579037e+05                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.845812e+05                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.371988e-02 +- 3.269530e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200888e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
0a ProcInit :     0.000393 sec
0b MemAlloc :     0.051142 sec
0c GenCreat :     0.000304 sec
1b GenRnGen :     0.138152 sec
2a RamboIni :     0.083532 sec
2b RamboFin :     1.135857 sec
3a SigmaKin :    16.359241 sec
4a DumpLoop :     0.022470 sec
8a CompStat :     0.036933 sec
9a GenDestr :     0.000009 sec
9b MemFree  :     0.001136 sec
9d DumpScrn :     0.000218 sec
9e DumpJson :     0.000009 sec
TOTAL       :    17.829397 sec
TOTAL (123) :    17.716782 sec
TOTAL  (23) :    17.578630 sec
TOTAL   (1) :     0.138152 sec
TOTAL   (2) :     1.219389 sec
TOTAL   (3) :    16.359241 sec
***********************************************************************
./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = THRUST::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Wavefunction GPU memory    = LOCAL
Random number generation   = CURAND HOST (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 4.464168e-01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 7.717266e-02                 )  sec
TotalTime[RndNumGen]    (1)= ( 3.692441e-01                 )  sec
TotalTime[Rambo]        (2)= ( 6.808826e-02                 )  sec
TotalTime[MatrixElems]  (3)= ( 9.084397e-03                 )  sec
MeanTimeInMatrixElems      = ( 7.570331e-04                 )  sec
[Min,Max]TimeInMatrixElems = [ 7.526100e-04 ,  7.626670e-04 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 1.409323e+07                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 8.152442e+07                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 6.925563e+08                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.372152e-02 +- 3.269516e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200854e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
00 CudaFree :     1.163048 sec
0a ProcInit :     0.000442 sec
0b MemAlloc :     0.075970 sec
0c GenCreat :     0.000463 sec
0d SGoodHel :     0.001750 sec
1a GenSeed  :     0.000097 sec
1b GenRnGen :     0.334314 sec
1c CpHTDrnd :     0.034833 sec
2a RamboIni :     0.000263 sec
2b RamboFin :     0.000162 sec
2c CpDTHwgt :     0.005505 sec
2d CpDTHmom :     0.062159 sec
3a SigmaKin :     0.000177 sec
3b CpDTHmes :     0.008908 sec
4a DumpLoop :     0.023940 sec
8a CompStat :     0.045298 sec
9a GenDestr :     0.000018 sec
9b MemFree  :     0.017843 sec
9c CudReset :     0.047554 sec
9d DumpScrn :     0.000197 sec
9e DumpJson :     0.000027 sec
TOTAL       :     1.822965 sec
TOTAL (123) :     0.446417 sec
TOTAL  (23) :     0.077173 sec
TOTAL   (1) :     0.369244 sec
TOTAL   (2) :     0.068088 sec
TOTAL   (3) :     0.009084 sec
***********************************************************************

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid           = 16384
NumThreadsPerBlock         = 32
NumIterations              = 12
-----------------------------------------------------------------------
FP precision               = DOUBLE (nan=0)
Complex type               = STD::COMPLEX
RanNumb memory layout      = AOSOA[4]
Momenta memory layout      = AOSOA[4]
Random number generation   = CURAND (C++ code)
-----------------------------------------------------------------------
NumberOfEntries            = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 1.789395e+01                 )  sec
TotalTime[Rambo+ME]    (23)= ( 1.756132e+01                 )  sec
TotalTime[RndNumGen]    (1)= ( 3.326294e-01                 )  sec
TotalTime[Rambo]        (2)= ( 1.197999e+00                 )  sec
TotalTime[MatrixElems]  (3)= ( 1.636332e+01                 )  sec
MeanTimeInMatrixElems      = ( 1.363610e+00                 )  sec
[Min,Max]TimeInMatrixElems = [ 1.363146e+00 ,  1.364147e+00 ]  sec
-----------------------------------------------------------------------
TotalEventsComputed        = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.515968e+05                 )  sec^-1
EvtsPerSec[Rmb+ME]     (23)= ( 3.582564e+05                 )  sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.844852e+05                 )  sec^-1
***********************************************************************
NumMatrixElements(notNan)  = 6291456
MeanMatrixElemValue        = ( 1.372152e-02 +- 3.269516e-06 )  GeV^0
[Min,Max]MatrixElemValue   = [ 6.071582e-03 ,  3.374925e-02 ]  GeV^0
StdDevMatrixElemValue      = ( 8.200854e-03                 )  GeV^0
MeanWeight                 = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight            = [ 4.515827e-01 ,  4.515827e-01 ]
StdDevWeight               = ( 0.000000e+00                 )
***********************************************************************
0a ProcInit :     0.001699 sec
0b MemAlloc :     0.051144 sec
0c GenCreat :     0.000865 sec
1a GenSeed  :     0.000099 sec
1b GenRnGen :     0.332530 sec
2a RamboIni :     0.083874 sec
2b RamboFin :     1.114125 sec
3a SigmaKin :    16.363321 sec
4a DumpLoop :     0.020444 sec
8a CompStat :     0.037162 sec
9a GenDestr :     0.000081 sec
9b MemFree  :     0.001136 sec
9d DumpScrn :     0.000213 sec
9e DumpJson :     0.000007 sec
TOTAL       :    18.006704 sec
TOTAL (123) :    17.893950 sec
TOTAL  (23) :    17.561319 sec
TOTAL   (1) :     0.332629 sec
TOTAL   (2) :     1.197999 sec
TOTAL   (3) :    16.363321 sec
***********************************************************************