Difficulty with ONT data #90

@robert-a-forsyth

I'm running into errors specifically with ONT data: make_examples and postprocess_variants seem to fail silently and keep running until the Slurm job hits its walltime.

/opt/deepvariant/bin/postprocess_variants \
    --ref "Homo_sapiens_assembly38_masked_noALT.fasta" \
    --infile "HG008-ONT.call.tfrecord.gz" \
    --outfile "HG008-ONT.vcf.gz" \
    --nonvariant_site_tfrecord_path "HG008-ONT.gvcf.tfrecord@00024.gz" \
    --gvcf_outfile "HG008-ONT.g.vcf.gz" \
    --sample_name HG008-ONT \
    --cpus 12

This leads to the following log:

2026-04-04 01:19:17.166772: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-04-04 01:19:18.142723: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2026-04-04 01:19:19.133450: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2026-04-04 01:19:19.137058: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-04-04 01:19:20.763585: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
I0404 01:19:29.745429 22949602054784 postprocess_variants.py:1480] Using sample name from call_variants output. Sample name: HG008-ONT
I0404 01:19:29.745574 22949602054784 postprocess_variants.py:1485] --sample_name is set but was not used.
I0404 01:19:30.980240 22949602054784 postprocess_variants.py:1735] Running postprocess_variants with parallelism using 12 CPUs over 12 partitions.
I0404 01:21:41.124814 22949602054784 postprocess_variants.py:1346] Processing region chr4:0-chr4:190214555
I0404 01:21:41.125960 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.1677200396855674 minutes
I0404 01:21:41.126940 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:21:41.126992 22949602054784 postprocess_variants.py:1365] Processed 2979906 variants.
I0404 01:21:41.127068 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.1677387317021686 minutes
I0404 01:21:41.714890 22949602054784 postprocess_variants.py:1346] Processing region chr3:0-chr3:198295559
I0404 01:21:41.715251 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.1775914231936135 minutes
I0404 01:21:41.716210 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:21:41.716259 22949602054784 postprocess_variants.py:1365] Processed 3088142 variants.
I0404 01:21:41.716333 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.1776097575823465 minutes
I0404 01:21:47.385473 22949602054784 postprocess_variants.py:1346] Processing region chr1:0-chr1:248956422
I0404 01:21:47.385832 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.272206179300944 minutes
I0404 01:21:47.389170 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:21:47.389217 22949602054784 postprocess_variants.py:1365] Processed 3821722 variants.
I0404 01:21:47.389294 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.272264071305593 minutes
I0404 01:21:50.422237 22949602054784 postprocess_variants.py:1346] Processing region chr2:0-chr2:242193529
I0404 01:21:50.422600 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.32276877562205 minutes
I0404 01:21:50.423541 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:21:50.423591 22949602054784 postprocess_variants.py:1365] Processed 3863892 variants.
I0404 01:21:50.423666 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.322786756356557 minutes
I0404 01:21:54.706866 22949602054784 postprocess_variants.py:1346] Processing region chr9:0-chr10:133797422
I0404 01:21:54.707232 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.3939231952031452 minutes
I0404 01:21:54.708181 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:21:54.708232 22949602054784 postprocess_variants.py:1365] Processed 4243004 variants.
I0404 01:21:54.708307 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.3939413189888 minutes
I0404 01:21:56.652044 22949602054784 postprocess_variants.py:1346] Processing region chr13:0-chr15:101991189
I0404 01:21:56.652399 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.4262409011522927 minutes
I0404 01:21:56.653347 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:21:56.653398 22949602054784 postprocess_variants.py:1365] Processed 4434845 variants.
I0404 01:21:56.653472 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.426258965333303 minutes
I0404 01:21:56.838805 22949602054784 postprocess_variants.py:1346] Processing region chr11:0-chr12:133275309
I0404 01:21:56.839159 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.4294020970662435 minutes
I0404 01:21:56.840102 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:21:56.840152 22949602054784 postprocess_variants.py:1365] Processed 4351241 variants.
I0404 01:21:56.840232 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.4294201771418256 minutes
I0404 01:22:00.139049 22949602054784 postprocess_variants.py:1346] Processing region chr7:0-chr8:145138636
I0404 01:22:00.139398 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.4845094402631123 minutes
I0404 01:22:00.140315 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:22:00.140364 22949602054784 postprocess_variants.py:1365] Processed 4870138 variants.
I0404 01:22:00.140439 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.484527015686035 minutes
I0404 01:22:02.539152 22949602054784 postprocess_variants.py:1346] Processing region chr20:0-chrX:156040895
I0404 01:22:02.539509 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.5242565234502155 minutes
I0404 01:22:02.540465 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:22:02.540513 22949602054784 postprocess_variants.py:1365] Processed 4782710 variants.
I0404 01:22:02.540586 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.5242746671040854 minutes
I0404 01:22:03.098630 22949602054784 postprocess_variants.py:1346] Processing region chr16:0-chr19:58617616
I0404 01:22:03.098975 22949602054784 postprocess_variants.py:1353] CVO sorting took 2.533632683753967 minutes
I0404 01:22:03.099862 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:22:03.099909 22949602054784 postprocess_variants.py:1365] Processed 4986589 variants.
I0404 01:22:03.099981 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 2.5336498181025187 minutes
I0404 01:22:48.051768 22949602054784 postprocess_variants.py:1346] Processing region chrY:0-chrUn_JTFH01001998v1_decoy:2001
I0404 01:22:48.052109 22949602054784 postprocess_variants.py:1353] CVO sorting took 3.282629195849101 minutes
I0404 01:22:48.053007 22949602054784 postprocess_variants.py:1357] Transforming call_variants_output to variants.
I0404 01:22:48.053056 22949602054784 postprocess_variants.py:1365] Processed 31043 variants.
I0404 01:22:48.053124 22949602054784 postprocess_variants.py:1568] Processing variants (and writing to temporary files) took 3.282646294434865 minutes
I0404 02:01:37.957194 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 39.93733240365982 minutes.
I0404 02:01:38.266677 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 39.95232543547948 minutes.
I0404 02:04:05.298418 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 42.29848404725393 minutes.
I0404 02:04:06.535706 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 42.26853257020314 minutes.
I0404 02:06:28.027679 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 44.55532149473826 minutes.
I0404 02:06:55.130117 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 44.97149666547775 minutes.
I0404 02:07:17.104206 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 45.3408441901207 minutes.
I0404 02:08:06.580466 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 46.058006676038104 minutes.
I0404 02:08:19.705912 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 46.28608732620875 minutes.
I0404 02:08:27.090184 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 46.44916099309921 minutes.
I0404 02:32:27.794966 22949602054784 postprocess_variants.py:1603] VCF and gVCF creation took 69.66236266295115 minutes.
[2026-04-04T17:18:06.019] error: Detected 1 oom_kill event in StepId=66364513.batch. Some of the step tasks have been OOM Killed.
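To make the log above easier to read, here is a minimal sketch of a summary script. It is not part of DeepVariant: the function name `summarize` and the two regular expressions are my own assumptions, based only on the line formats visible in this log. The glog-style lines carry no year, so the year is passed in by hand.

```python
import re
from datetime import datetime

# Assumed line formats, inferred from the log in this issue (not a DeepVariant API):
#   "I0404 01:21:41.124814 <thread id> postprocess_variants.py:1346] message"
#   "[2026-04-04T17:18:06.019] error: message"
GLOG = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2})\.\d+ \d+ \S+\] (.*)$")
SLURM = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+\] error: (.*)$")


def summarize(log_text, year):
    """Summarize partitions, variant counts, and the gap before the OOM kill."""
    regions = []      # regions named in "Processing region ..." lines
    variants = 0      # sum of "Processed N variants." counts
    finished = 0      # number of "VCF and gVCF creation took ..." lines
    last_glog = None  # timestamp of the last glog-style line
    oom_at = None     # timestamp of the Slurm oom_kill message, if any

    for line in log_text.splitlines():
        m = GLOG.match(line)
        if m:
            mmdd, hms, msg = m.groups()
            last_glog = datetime.strptime(f"{year}{mmdd} {hms}", "%Y%m%d %H:%M:%S")
            if msg.startswith("Processing region "):
                regions.append(msg[len("Processing region "):])
            elif msg.startswith("Processed ") and msg.endswith(" variants."):
                variants += int(msg.split()[1])
            elif msg.startswith("VCF and gVCF creation took"):
                finished += 1
            continue
        s = SLURM.match(line)
        if s and "oom_kill" in s.group(2):
            oom_at = datetime.strptime(s.group(1), "%Y-%m-%dT%H:%M:%S")

    # Time between the last postprocess_variants log line and the OOM kill.
    idle = oom_at - last_glog if (oom_at and last_glog) else None
    return {"regions": regions, "variants": variants,
            "finished": finished, "idle": idle}
```

Calling `summarize(open("postprocess.log").read(), year=2026)` (the file name is hypothetical) on the log above should list 11 regions and 11 "VCF and gVCF creation" lines, even though the run reports 12 partitions. No region covering chr5 or chr6 appears. The gap between the last postprocess_variants line (02:32:27) and the oom_kill event (17:18:06) is roughly 14 hours 45 minutes.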
