Improvements over PR #45. More consistent #ifdefs and hstRnarray usage. #48
Merged
…rray usage.

./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = THRUST::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Wavefunction GPU memory = LOCAL
Random number generation = CURAND DEVICE (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 8.692473e-02 ) sec
TotalTime[Rambo+ME] (23)= ( 7.913799e-02 ) sec
TotalTime[RndNumGen] (1)= ( 7.786739e-03 ) sec
TotalTime[Rambo] (2)= ( 7.001756e-02 ) sec
TotalTime[MatrixElems] (3)= ( 9.120431e-03 ) sec
MeanTimeInMatrixElems = ( 7.600359e-04 ) sec
[Min,Max]TimeInMatrixElems = [ 7.568000e-04 , 7.671370e-04 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 7.237821e+07 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 7.949982e+07 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 6.898200e+08 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.372152e-02 +- 3.269516e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200854e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
00 CudaFree : 0.877303 sec
0a ProcInit : 0.000488 sec
0b MemAlloc : 0.061239 sec
0c GenCreat : 0.009764 sec
0d SGoodHel : 0.001760 sec
1a GenSeed : 0.000101 sec
1b GenRnGen : 0.007686 sec
2a RamboIni : 0.000178 sec
2b RamboFin : 0.000138 sec
2c CpDTHwgt : 0.006028 sec
2d CpDTHmom : 0.063673 sec
3a SigmaKin : 0.000165 sec
3b CpDTHmes : 0.008956 sec
4a DumpLoop : 0.023916 sec
8a CompStat : 0.045670 sec
9a GenDestr : 0.000054 sec
9b MemFree : 0.012793 sec
9c CudReset : 0.049363 sec
9d DumpScrn : 0.000221 sec
9e DumpJson : 0.000008 sec
TOTAL : 1.169505 sec
TOTAL (123) : 0.086925 sec
TOTAL (23) : 0.079138 sec
TOTAL (1) : 0.007787 sec
TOTAL (2) : 0.070018 sec
TOTAL (3) : 0.009120 sec
***********************************************************************

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = STD::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Random number generation = CURAND (C++ code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 1.787162e+01 ) sec
TotalTime[Rambo+ME] (23)= ( 1.753950e+01 ) sec
TotalTime[RndNumGen] (1)= ( 3.321243e-01 ) sec
TotalTime[Rambo] (2)= ( 1.204920e+00 ) sec
TotalTime[MatrixElems] (3)= ( 1.633458e+01 ) sec
MeanTimeInMatrixElems = ( 1.361215e+00 ) sec
[Min,Max]TimeInMatrixElems = [ 1.360797e+00 , 1.361576e+00 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.520361e+05 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 3.587021e+05 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.851618e+05 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.372152e-02 +- 3.269516e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200854e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
0a ProcInit : 0.000319 sec
0b MemAlloc : 0.050883 sec
0c GenCreat : 0.000824 sec
1a GenSeed : 0.000105 sec
1b GenRnGen : 0.332019 sec
2a RamboIni : 0.083518 sec
2b RamboFin : 1.121402 sec
3a SigmaKin : 16.334579 sec
4a DumpLoop : 0.020090 sec
8a CompStat : 0.037017 sec
9a GenDestr : 0.000081 sec
9b MemFree : 0.001111 sec
9d DumpScrn : 0.000187 sec
9e DumpJson : 0.000007 sec
TOTAL : 17.982145 sec
TOTAL (123) : 17.871624 sec
TOTAL (23) : 17.539499 sec
TOTAL (1) : 0.332124 sec
TOTAL (2) : 1.204920 sec
TOTAL (3) : 16.334579 sec
***********************************************************************
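The throughput figures in the logs above appear to follow from two simple relations: TotalEventsComputed equals NumBlocksPerGrid × NumThreadsPerBlock × NumIterations, and each EvtsPerSec value equals TotalEventsComputed divided by the matching TotalTime. The following Python sketch checks this against numbers copied from the CURAND DEVICE gcheck.exe run and the CURAND check.exe run; the function names are illustrative and are not taken from the repository.

```python
def total_events(blocks, threads, iterations):
    """Events processed: one event per GPU thread per iteration."""
    return blocks * threads * iterations


def events_per_sec(n_events, seconds):
    """Throughput as events divided by elapsed time (assumed definition)."""
    return n_events / seconds


# Command-line arguments of both runs: -p 16384 32 12
n = total_events(16384, 32, 12)

# TotalTime values copied from the logs (seconds).
gpu_rate_123 = events_per_sec(n, 8.692473e-02)  # gcheck.exe, Rnd+Rmb+ME
gpu_rate_3 = events_per_sec(n, 9.120431e-03)    # gcheck.exe, MatrixElems only
cpu_rate_3 = events_per_sec(n, 1.633458e+01)    # check.exe, MatrixElems only

print(f"TotalEventsComputed      = {n}")
print(f"GPU EvtsPerSec (123)     = {gpu_rate_123:.6e}")
print(f"GPU EvtsPerSec (3)       = {gpu_rate_3:.6e}")
print(f"CPU EvtsPerSec (3)       = {cpu_rate_3:.6e}")
print(f"GPU/CPU ratio for step 3 = {gpu_rate_3 / cpu_rate_3:.0f}")
```

The recomputed rates agree with the logged values to within the rounding of the printed times, which suggests the logged EvtsPerSec lines are derived quantities rather than independent measurements.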
Also fix a rnGen destructor.

./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=5505024)
Complex type = THRUST::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Wavefunction GPU memory = LOCAL
Random number generation = COMMON RANDOM HOST (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 2.160973e-01 ) sec
TotalTime[Rambo+ME] (23)= ( 8.944656e-02 ) sec
TotalTime[RndNumGen] (1)= ( 1.266507e-01 ) sec
TotalTime[Rambo] (2)= ( 6.896938e-02 ) sec
TotalTime[MatrixElems] (3)= ( 2.047718e-02 ) sec
MeanTimeInMatrixElems = ( 1.706432e-03 ) sec
[Min,Max]TimeInMatrixElems = [ 8.084830e-04 , 1.144112e-02 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 2.911400e+07 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 7.033759e+07 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.072422e+08 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 786432
MeanMatrixElemValue = ( 1.372249e-02 +- 9.248484e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374920e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.201648e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=5505024)
Complex type = STD::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Random number generation = COMMON RANDOM HOST (C++ code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 2.000985e+01 ) sec
TotalTime[Rambo+ME] (23)= ( 1.994450e+01 ) sec
TotalTime[RndNumGen] (1)= ( 6.534211e-02 ) sec
TotalTime[Rambo] (2)= ( 6.525773e-01 ) sec
TotalTime[MatrixElems] (3)= ( 1.929193e+01 ) sec
MeanTimeInMatrixElems = ( 1.607661e+00 ) sec
[Min,Max]TimeInMatrixElems = [ 1.606176e+00 , 1.616634e+00 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.144180e+05 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 3.154481e+05 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.261186e+05 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 786432
MeanMatrixElemValue = ( 1.372249e-02 +- 9.248484e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374920e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.201648e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
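Both runs in this commit report nan=5505024 alongside NumMatrixElements(notNan) = 786432, and the two counts add up to TotalEventsComputed. The quoted uncertainty on MeanMatrixElemValue also looks consistent with StdDevMatrixElemValue divided by the square root of the non-NaN count. The Python sketch below checks both relations with numbers copied from the log; the standard-error formula is an assumption about how the statistics are computed, not something the log states, and the variable names are illustrative.

```python
import math

# Values copied from the COMMON RANDOM HOST runs in this commit.
total_events = 6291456   # TotalEventsComputed
nan_count = 5505024      # "FP precision = DOUBLE (nan=...)"
not_nan = 786432         # NumMatrixElements(notNan)
std_dev = 8.201648e-03   # StdDevMatrixElemValue, GeV^0
logged_error = 9.248484e-06  # the "+-" term on MeanMatrixElemValue

# NaN accounting: every computed event is either NaN or counted as notNan.
accounted = nan_count + not_nan
nan_fraction = nan_count / total_events

# Assumed relation: uncertainty on the mean = std_dev / sqrt(N_notNan).
standard_error = std_dev / math.sqrt(not_nan)

print(f"NaN + notNan   = {accounted} (TotalEventsComputed = {total_events})")
print(f"NaN fraction   = {nan_fraction:.3f}")
print(f"standard error = {standard_error:.6e} (logged {logged_error:.6e})")
```

With these inputs, seven out of every eight computed matrix elements are NaN, so the mean and its uncertainty are computed from only one eighth of the events.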
./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = THRUST::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Wavefunction GPU memory = LOCAL
Random number generation = COMMON RANDOM HOST (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 2.852055e-01 ) sec
TotalTime[Rambo+ME] (23)= ( 8.309519e-02 ) sec
TotalTime[RndNumGen] (1)= ( 2.021103e-01 ) sec
TotalTime[Rambo] (2)= ( 7.403231e-02 ) sec
TotalTime[MatrixElems] (3)= ( 9.062881e-03 ) sec
MeanTimeInMatrixElems = ( 7.552401e-04 ) sec
[Min,Max]TimeInMatrixElems = [ 7.227170e-04 , 7.788700e-04 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 2.205938e+07 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 7.571384e+07 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 6.942004e+08 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.371988e-02 +- 3.269530e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200888e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
00 CudaFree : 0.883399 sec
0a ProcInit : 0.000441 sec
0b MemAlloc : 0.075924 sec
0c GenCreat : 0.002128 sec
0d SGoodHel : 0.001766 sec
1b GenRnGen : 0.148745 sec
1c CpHTDrnd : 0.053365 sec
2a RamboIni : 0.000316 sec
2b RamboFin : 0.000199 sec
2c CpDTHwgt : 0.005506 sec
2d CpDTHmom : 0.068012 sec
3a SigmaKin : 0.000203 sec
3b CpDTHmes : 0.008860 sec
4a DumpLoop : 0.036731 sec
8a CompStat : 0.045455 sec
9a GenDestr : 0.000011 sec
9b MemFree : 0.017908 sec
9c CudReset : 0.048424 sec
9d DumpScrn : 0.000239 sec
9e DumpJson : 0.000007 sec
TOTAL : 1.397640 sec
TOTAL (123) : 0.285205 sec
TOTAL (23) : 0.083095 sec
TOTAL (1) : 0.202110 sec
TOTAL (2) : 0.074032 sec
TOTAL (3) : 0.009063 sec
***********************************************************************

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = STD::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Random number generation = COMMON RANDOM HOST (C++ code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 1.771678e+01 ) sec
TotalTime[Rambo+ME] (23)= ( 1.757863e+01 ) sec
TotalTime[RndNumGen] (1)= ( 1.381516e-01 ) sec
TotalTime[Rambo] (2)= ( 1.219389e+00 ) sec
TotalTime[MatrixElems] (3)= ( 1.635924e+01 ) sec
MeanTimeInMatrixElems = ( 1.363270e+00 ) sec
[Min,Max]TimeInMatrixElems = [ 1.363001e+00 , 1.363693e+00 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.551128e+05 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 3.579037e+05 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.845812e+05 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.371988e-02 +- 3.269530e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200888e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
0a ProcInit : 0.000393 sec
0b MemAlloc : 0.051142 sec
0c GenCreat : 0.000304 sec
1b GenRnGen : 0.138152 sec
2a RamboIni : 0.083532 sec
2b RamboFin : 1.135857 sec
3a SigmaKin : 16.359241 sec
4a DumpLoop : 0.022470 sec
8a CompStat : 0.036933 sec
9a GenDestr : 0.000009 sec
9b MemFree : 0.001136 sec
9d DumpScrn : 0.000218 sec
9e DumpJson : 0.000009 sec
TOTAL : 17.829397 sec
TOTAL (123) : 17.716782 sec
TOTAL (23) : 17.578630 sec
TOTAL (1) : 0.138152 sec
TOTAL (2) : 1.219389 sec
TOTAL (3) : 16.359241 sec
***********************************************************************
./gcheck.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = THRUST::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Wavefunction GPU memory = LOCAL
Random number generation = CURAND HOST (CUDA code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 4.464168e-01 ) sec
TotalTime[Rambo+ME] (23)= ( 7.717266e-02 ) sec
TotalTime[RndNumGen] (1)= ( 3.692441e-01 ) sec
TotalTime[Rambo] (2)= ( 6.808826e-02 ) sec
TotalTime[MatrixElems] (3)= ( 9.084397e-03 ) sec
MeanTimeInMatrixElems = ( 7.570331e-04 ) sec
[Min,Max]TimeInMatrixElems = [ 7.526100e-04 , 7.626670e-04 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 1.409323e+07 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 8.152442e+07 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 6.925563e+08 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.372152e-02 +- 3.269516e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200854e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
00 CudaFree : 1.163048 sec
0a ProcInit : 0.000442 sec
0b MemAlloc : 0.075970 sec
0c GenCreat : 0.000463 sec
0d SGoodHel : 0.001750 sec
1a GenSeed : 0.000097 sec
1b GenRnGen : 0.334314 sec
1c CpHTDrnd : 0.034833 sec
2a RamboIni : 0.000263 sec
2b RamboFin : 0.000162 sec
2c CpDTHwgt : 0.005505 sec
2d CpDTHmom : 0.062159 sec
3a SigmaKin : 0.000177 sec
3b CpDTHmes : 0.008908 sec
4a DumpLoop : 0.023940 sec
8a CompStat : 0.045298 sec
9a GenDestr : 0.000018 sec
9b MemFree : 0.017843 sec
9c CudReset : 0.047554 sec
9d DumpScrn : 0.000197 sec
9e DumpJson : 0.000027 sec
TOTAL : 1.822965 sec
TOTAL (123) : 0.446417 sec
TOTAL (23) : 0.077173 sec
TOTAL (1) : 0.369244 sec
TOTAL (2) : 0.068088 sec
TOTAL (3) : 0.009084 sec
***********************************************************************

./check.exe -p 16384 32 12
***********************************************************************
NumBlocksPerGrid = 16384
NumThreadsPerBlock = 32
NumIterations = 12
-----------------------------------------------------------------------
FP precision = DOUBLE (nan=0)
Complex type = STD::COMPLEX
RanNumb memory layout = AOSOA[4]
Momenta memory layout = AOSOA[4]
Random number generation = CURAND (C++ code)
-----------------------------------------------------------------------
NumberOfEntries = 12
TotalTime[Rnd+Rmb+ME] (123)= ( 1.789395e+01 ) sec
TotalTime[Rambo+ME] (23)= ( 1.756132e+01 ) sec
TotalTime[RndNumGen] (1)= ( 3.326294e-01 ) sec
TotalTime[Rambo] (2)= ( 1.197999e+00 ) sec
TotalTime[MatrixElems] (3)= ( 1.636332e+01 ) sec
MeanTimeInMatrixElems = ( 1.363610e+00 ) sec
[Min,Max]TimeInMatrixElems = [ 1.363146e+00 , 1.364147e+00 ] sec
-----------------------------------------------------------------------
TotalEventsComputed = 6291456
EvtsPerSec[Rnd+Rmb+ME](123)= ( 3.515968e+05 ) sec^-1
EvtsPerSec[Rmb+ME] (23)= ( 3.582564e+05 ) sec^-1
EvtsPerSec[MatrixElems] (3)= ( 3.844852e+05 ) sec^-1
***********************************************************************
NumMatrixElements(notNan) = 6291456
MeanMatrixElemValue = ( 1.372152e-02 +- 3.269516e-06 ) GeV^0
[Min,Max]MatrixElemValue = [ 6.071582e-03 , 3.374925e-02 ] GeV^0
StdDevMatrixElemValue = ( 8.200854e-03 ) GeV^0
MeanWeight = ( 4.515827e-01 +- 0.000000e+00 )
[Min,Max]Weight = [ 4.515827e-01 , 4.515827e-01 ]
StdDevWeight = ( 0.000000e+00 )
***********************************************************************
0a ProcInit : 0.001699 sec
0b MemAlloc : 0.051144 sec
0c GenCreat : 0.000865 sec
1a GenSeed : 0.000099 sec
1b GenRnGen : 0.332530 sec
2a RamboIni : 0.083874 sec
2b RamboFin : 1.114125 sec
3a SigmaKin : 16.363321 sec
4a DumpLoop : 0.020444 sec
8a CompStat : 0.037162 sec
9a GenDestr : 0.000081 sec
9b MemFree : 0.001136 sec
9d DumpScrn : 0.000213 sec
9e DumpJson : 0.000007 sec
TOTAL : 18.006704 sec
TOTAL (123) : 17.893950 sec
TOTAL (23) : 17.561319 sec
TOTAL (1) : 0.332629 sec
TOTAL (2) : 1.197999 sec
TOTAL (3) : 16.363321 sec
***********************************************************************
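In the timer tables, the TOTAL (1), TOTAL (2) and TOTAL (3) rows appear to be the sums of the individual timers whose two-character key starts with that digit. The Python sketch below regroups the step timers from the CURAND HOST gcheck.exe run above and compares the sums with the logged totals; the grouping rule is inferred from the numbers, and the helper name is hypothetical.

```python
from collections import defaultdict

# Step timers (seconds) copied from the CURAND HOST gcheck.exe run.
timers = {
    "1a GenSeed": 0.000097,
    "1b GenRnGen": 0.334314,
    "1c CpHTDrnd": 0.034833,
    "2a RamboIni": 0.000263,
    "2b RamboFin": 0.000162,
    "2c CpDTHwgt": 0.005505,
    "2d CpDTHmom": 0.062159,
    "3a SigmaKin": 0.000177,
    "3b CpDTHmes": 0.008908,
}

# Totals as printed in the same log.
logged_totals = {"1": 0.369244, "2": 0.068088, "3": 0.009084}


def group_totals(step_timers):
    """Sum timers by the leading digit of their key (assumed grouping rule)."""
    sums = defaultdict(float)
    for key, seconds in step_timers.items():
        sums[key[0]] += seconds
    return dict(sums)


totals = group_totals(timers)
for group in sorted(totals):
    diff = totals[group] - logged_totals[group]
    print(f"TOTAL ({group}): summed {totals[group]:.6f} s, "
          f"logged {logged_totals[group]:.6f} s, diff {diff:+.6f} s")
```

Differences of about one microsecond are expected, because each printed timer is rounded to six decimals. The breakdown also shows where the time goes in this run: group 1 is dominated by host-side random number generation (1b) plus the host-to-device copy (1c), and group 2 by the momenta copy back to the host (2d).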
This was referenced Nov 10, 2020
No description provided.