Hi, I'm trying to run a batch job using Slurm's --array option, and I'm wondering whether this is possible with drmaa-python. I know there is runBulkJobs(...), but it doesn't seem to run the jobs as a Slurm array: there doesn't seem to be any $SLURM_ARRAY_TASK_ID (or the like) in the job's environment.
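To make clear what I expect from an array run: each task should be able to read its own index from the environment. Here is a minimal sketch of that check, assuming Slurm exports SLURM_ARRAY_TASK_ID to array tasks. The helper name array_task_index is my own and is not part of drmaa-python.

```python
import os


def array_task_index(environ=None):
    """Return this task's Slurm array index as an int, or None if unset.

    array_task_index is a hypothetical helper, not part of drmaa-python.
    """
    # Allow a plain dict to be passed in so the check can be tried off-cluster.
    env = os.environ if environ is None else environ
    raw = env.get("SLURM_ARRAY_TASK_ID")
    # Treat a missing, empty, or non-numeric value as "not an array task".
    if raw is None or not raw.strip().isdigit():
        return None
    return int(raw)


if __name__ == "__main__":
    print("array task index:", array_task_index())
```

From what I can tell, the variable is not set in jobs started through runBulkJobs, so a check like this would return None there.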
with drmaa.Session() as s:
    try:
        # create job template
        jt = s.createJobTemplate()
        jt.nativeSpecification = '--mem-per-cpu=' + self.memory + ' --array=1-3' + ' --time=' + self.time
        jt.remoteCommand = command
        print(jt.nativeSpecification)
        # run job
        joblist = s.runBulkJobs(jt, 1, 3, 1)
        # wait for the return value
        s.synchronize(joblist, self.convertToSeconds(), False)
        for curjob in joblist:
            print('Collecting job ' + curjob)
            retval = s.wait(curjob, drmaa.Session.TIMEOUT_WAIT_FOREVER)
            print('Job: {0} finished with status {1} and was aborted {2}'.format(retval.jobId, retval.exitStatus, retval.wasAborted))
            if retval.wasAborted:
                # assumed to be a memory problem (see the aside at the end)
                print("Ran out of memory using: " + self.memory)
                self.increaseMemory(6000)
            # if zero exit code then break and job is over
            elif retval.exitStatus == 0:
                break
    except drmaa.ExitTimeoutException:
        print("Ran out of time using: " + self.time)
        self.increaseTime(6)
    except drmaa.OutOfMemoryException:
        print("Ran out of memory using: " + self.memory)
        self.increaseMemory(6000)
When I try to run this, I get a segmentation fault.
(gdb) run run.py "echo $SLURM_ARRAY_TASK_ID" 01:00:00 100
Starting program: /apps/python/2.7.6/bin/python run.py "echo $SLURM_ARRAY_TASK_ID" 01:00:00 100
[Thread debugging using libthread_db enabled]
--mem-per-cpu=100 -a=1-3 --time=01:00:00
Program received signal SIGSEGV, Segmentation fault.
drmaa_release_job_ids (values=0x0) at drmaa_base.c:297
297 drmaa_base.c: No such file or directory.
in drmaa_base.c
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.6.x86_64 libcom_err-1.41.12-18.el6.x86_64 libselinux-2.0.94-5.3.el6.x86_64 openssl-1.0.1e-16.el6_5.14.x86_64
(gdb) backtrace
#0 drmaa_release_job_ids (values=0x0) at drmaa_base.c:297
#1 0x00002aaab15cecbc in ffi_call_unix64 () at /apps/python/Python-2.7.6/Modules/_ctypes/libffi/src/x86/unix64.S:76
#2 0x00002aaab15ce393 in ffi_call (cif=<value optimized out>, fn=0x2aaab301b290 <drmaa_release_job_ids>, rvalue=<value optimized out>, avalue=0x7fffffffb0d0) at /apps/python/Python-2.7.6/Modules/_ctypes/libffi/src/x86/ffi64.c:522
#3 0x00002aaab15c6006 in _call_function_pointer (pProc=0x2aaab301b290 <drmaa_release_job_ids>, argtuple=0x7fffffffb1a0, flags=4353, argtypes=<value optimized out>, restype=0x2aaaaae5ebd0, checker=0x0) at /apps/python/Python-2.7.6/Modules/_ctypes/callproc.c:836
#4 _ctypes_callproc (pProc=0x2aaab301b290 <drmaa_release_job_ids>, argtuple=0x7fffffffb1a0, flags=4353, argtypes=<value optimized out>, restype=0x2aaaaae5ebd0, checker=0x0) at /apps/python/Python-2.7.6/Modules/_ctypes/callproc.c:1183
#5 0x00002aaab15bdcf3 in PyCFuncPtr_call (self=<value optimized out>, inargs=<value optimized out>, kwds=0x0) at /apps/python/Python-2.7.6/Modules/_ctypes/_ctypes.c:3929
#6 0x00002aaaaaaf79b3 in PyObject_Call (func=0x93b530, arg=<value optimized out>, kw=<value optimized out>) at Objects/abstract.c:2529
#7 0x00002aaaaaba6ad9 in do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4239
#8 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4044
#9 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2666
#10 0x00002aaaaab1b887 in gen_send_ex (gen=0x971e60, arg=0x0, exc=<value optimized out>) at Objects/genobject.c:84
#11 0x00002aaaaab2f656 in listextend (self=0x981200, b=<value optimized out>) at Objects/listobject.c:872
#12 0x00002aaaaab2fae0 in list_init (self=0x981200, args=<value optimized out>, kw=<value optimized out>) at Objects/listobject.c:2458
#13 0x00002aaaaab5a8a8 in type_call (type=<value optimized out>, args=0x97bf90, kwds=0x0) at Objects/typeobject.c:745
#14 0x00002aaaaaaf79b3 in PyObject_Call (func=0x2aaaaae5b0a0, arg=<value optimized out>, kw=<value optimized out>) at Objects/abstract.c:2529
#15 0x00002aaaaaba6ad9 in do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4239
#16 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4044
#17 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2666
#18 0x00002aaaaaba917e in PyEval_EvalCodeEx (co=0x78f4b0, globals=<value optimized out>, locals=<value optimized out>, args=<value optimized out>, argcount=4, kws=0x9addb8, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#19 0x00002aaaaaba7332 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4117
#20 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4042
#21 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2666
#22 0x00002aaaaaba807e in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4107
#23 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4042
#24 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2666
#25 0x00002aaaaaba807e in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4107
#26 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4042
#27 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2666
#28 0x00002aaaaaba917e in PyEval_EvalCodeEx (co=0x7743b0, globals=<value optimized out>, locals=<value optimized out>, args=<value optimized out>, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#29 0x00002aaaaaba9292 in PyEval_EvalCode (co=<value optimized out>, globals=<value optimized out>, locals=<value optimized out>) at Python/ceval.c:667
#30 0x00002aaaaabc8e40 in run_mod (fp=0x82a030, filename=<value optimized out>, start=<value optimized out>, globals=0x67e510, locals=0x67e510, closeit=1, flags=0x7fffffffbe20) at Python/pythonrun.c:1370
#31 PyRun_FileExFlags (fp=0x82a030, filename=<value optimized out>, start=<value optimized out>, globals=0x67e510, locals=0x67e510, closeit=1, flags=0x7fffffffbe20) at Python/pythonrun.c:1356
#32 0x00002aaaaabc901f in PyRun_SimpleFileExFlags (fp=0x82a030, filename=0x7fffffffc4bd "run.py", closeit=1, flags=0x7fffffffbe20) at Python/pythonrun.c:948
#33 0x00002aaaaabdeb34 in Py_Main (argc=<value optimized out>, argv=<value optimized out>) at Modules/main.c:640
#34 0x00000039ad21ecdd in __libc_start_main () from /lib64/libc.so.6
#35 0x0000000000400669 in _start ()
Aside: I'm also having trouble getting it to throw an OutOfMemoryException, so I'm forced to assume that an aborted job was aborted due to memory (not preferable). Any advice on what's happening there would be great.
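What I'd like to end up with is something like the following, instead of guessing from wasAborted. This is only a sketch: describe_outcome is a made-up helper, and the JobInfo namedtuple is a local stand-in mirroring the fields I believe wait() returns (my code above already uses jobId, exitStatus and wasAborted). It still can't tell an out-of-memory abort from any other abort, which is the part I'm asking about.

```python
from collections import namedtuple

# Local stand-in with the field names I believe drmaa-python's
# Session.wait() result carries; defined here so the sketch runs
# without a cluster or the drmaa library.
JobInfo = namedtuple(
    "JobInfo",
    "jobId hasExited hasSignal terminatedSignal hasCoreDump "
    "wasAborted exitStatus resourceUsage",
)


def describe_outcome(info):
    """Summarise how a job ended (hypothetical helper, not a drmaa API)."""
    if info.wasAborted:
        # wasAborted is only a boolean; on its own it does not say why
        # the job was aborted (memory, time limit, cancellation, ...).
        return "aborted"
    if info.hasSignal:
        return "killed by signal {0}".format(info.terminatedSignal)
    if info.hasExited:
        if info.exitStatus == 0:
            return "ok"
        return "failed with exit status {0}".format(info.exitStatus)
    return "unknown"
```

In the loop above this would replace the wasAborted branch, e.g. print(describe_outcome(retval)) right after s.wait(...).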
Thanks!