<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<meta name="author" content="Nicolas Chan, Wei Feinstein, Oliver Muellerklein, and Chris Paciorek" />
<title>Introduction to Containers: Creating Reproducible, Scalable, and Portable Workflows (tinyurl.com/brc-apr21)</title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
div.sourceCode { overflow-x: auto; }
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; } /* Keyword */
code > span.dt { color: #902000; } /* DataType */
code > span.dv { color: #40a070; } /* DecVal */
code > span.bn { color: #40a070; } /* BaseN */
code > span.fl { color: #40a070; } /* Float */
code > span.ch { color: #4070a0; } /* Char */
code > span.st { color: #4070a0; } /* String */
code > span.co { color: #60a0b0; font-style: italic; } /* Comment */
code > span.ot { color: #007020; } /* Other */
code > span.al { color: #ff0000; font-weight: bold; } /* Alert */
code > span.fu { color: #06287e; } /* Function */
code > span.er { color: #ff0000; font-weight: bold; } /* Error */
code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
code > span.cn { color: #880000; } /* Constant */
code > span.sc { color: #4070a0; } /* SpecialChar */
code > span.vs { color: #4070a0; } /* VerbatimString */
code > span.ss { color: #bb6688; } /* SpecialString */
code > span.im { } /* Import */
code > span.va { color: #19177c; } /* Variable */
code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code > span.op { color: #666666; } /* Operator */
code > span.bu { } /* BuiltIn */
code > span.ex { } /* Extension */
code > span.pp { color: #bc7a00; } /* Preprocessor */
code > span.at { color: #7d9029; } /* Attribute */
code > span.do { color: #ba2121; font-style: italic; } /* Documentation */
code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
</style>
</head>
<body>
<div id="header">
<h1 class="title">Introduction to Containers: Creating Reproducible, Scalable, and Portable Workflows (tinyurl.com/brc-apr21)</h1>
<h2 class="author">Nicolas Chan, Wei Feinstein, Oliver Muellerklein, and Chris Paciorek</h2>
<h3 class="date">April 21, 2021</h3>
</div>
<h1 id="upcoming-events-and-hiring">Upcoming events and hiring</h1>
<ul>
<li><p>Research IT is looking for researchers working with sensitive data, as we are building tools and services to support that work. Please email research-it@berkeley.edu for more information.</p></li>
<li><p>Research IT is hiring graduate and undergraduate students for a variety of positions. Please talk to Amy Neeser to get more information.</p></li>
<li><p><a href="https://www.meetup.com/ucberkeley_cloudmeetup/">Cloud Computing Meetup</a> (monthly, with next meeting April 29 at 1 pm)</p></li>
<li><p><a href="https://dlab.berkeley.edu/working-groups/securing-research-data-working-group">Securing Research Data Working Group</a> (monthly, with next meeting May 10 at 2 pm)</p></li>
</ul>
<h1 id="how-to-get-additional-help">How to get additional help</h1>
<ul>
<li>Check the Status and Announcements page:
<ul>
<li><a href="https://research-it.berkeley.edu/services/high-performance-computing/status-and-announcements" class="uri">https://research-it.berkeley.edu/services/high-performance-computing/status-and-announcements</a></li>
</ul></li>
<li>For technical issues and questions about using Savio:
<ul>
<li>brc-hpc-help@berkeley.edu</li>
</ul></li>
<li>For questions about computing resources in general, including cloud computing:
<ul>
<li>brc@berkeley.edu</li>
<li>office hours: Wed. 1:30-3:00 and Thur. 9:30-11:00 <a href="https://research-it.berkeley.edu/programs/berkeley-research-computing/research-computing-consulting">on Zoom</a></li>
</ul></li>
<li>For questions about data management (including HIPAA-protected data):
<ul>
<li>researchdata@berkeley.edu</li>
<li>office hours: Wed. 1:30-3:00 and Thur. 9:30-11:00 <a href="https://research-it.berkeley.edu/programs/berkeley-research-computing/research-computing-consulting">on Zoom</a></li>
</ul></li>
</ul>
<h1 id="introduction">Introduction</h1>
<p>We'll do this mostly as a demonstration. We encourage you to log in to your account and try out the various examples yourself as we go through them.</p>
<p>The materials for this tutorial are available using git at the short URL (<a href="https://tinyurl.com/brc-apr21">tinyurl.com/brc-apr21</a>), the GitHub URL (<a href="https://github.com/ucb-rit/savio-training-containers-2021" class="uri">https://github.com/ucb-rit/savio-training-containers-2021</a>), or simply as a <a href="https://github.com/ucb-rit/savio-training-containers-2021/archive/main.zip">zip file</a>.</p>
<h1 id="outline">Outline</h1>
<p>This training session will cover the following topics:</p>
<ul>
<li>Introduction to containers (Chris)
<ul>
<li>Comparison with VMs</li>
<li>Docker and Singularity</li>
<li>Advantages of containers</li>
</ul></li>
<li>Basic usage of containers (Chris)
<ul>
<li>Demo of running a Singularity container</li>
<li>More details on running a container</li>
<li>Use with Slurm</li>
<li>Sources of images</li>
</ul></li>
<li>Building containers (Nicolas)
<ul>
<li>Various ways to build Singularity containers</li>
<li>Using definition files</li>
<li>Using registries</li>
<li>Rewritable/sandbox images</li>
</ul></li>
<li>Specialized uses (Wei)
<ul>
<li>MPI</li>
<li>GPUs</li>
</ul></li>
<li>Containerizing scientific workflows (Oliver)</li>
<li>Other resources</li>
</ul>
<h1 id="what-is-a-container">What is a container?</h1>
<ul>
<li>Containerization provides "lightweight, standalone, executable packages of software that include everything needed to run an application: code, runtime, system tools, system libraries and settings".</li>
<li>A container provides a self-contained (isolated) filesystem.</li>
<li>Containers are similar to virtual machines in some ways, but much lighter-weight.</li>
<li>Containers are portable, shareable, and reproducible.</li>
</ul>
<h1 id="terminologyoverview">Terminology/Overview</h1>
<ul>
<li><em>image</em>: a bundle of files, including the operating system, system libraries, software, and possibly data and files associated with the software
<ul>
<li>may be stored as a single file (e.g., Singularity) or a group of files (e.g., Docker)</li>
</ul></li>
<li><em>container</em>: a virtual environment based on an image (i.e., a running instance of an image)
<ul>
<li>software running in the container sees this environment</li>
</ul></li>
<li><em>registry</em>: a source of images</li>
<li><em>host</em>: the actual machine on which the container runs</li>
</ul>
<h1 id="terminologyoverview-take-2">Terminology/Overview, take 2</h1>
<center>
<img src="taxonomy-of-docker-terms-and-concepts.png">
</center>
<p>(Image from docs.microsoft.com)</p>
<h1 id="containers-versus-vms">Containers versus VMs</h1>
<p>Let's see a schematic of what is going on with a container.</p>
<center>
<img src="vm_vs_container.png">
</center>
<p>(Image from Tin Ho, github.com/tin6150)</p>
<p>VMs have a copy of the entire operating system and need to be booted up, while containers use the Linux kernel of the host machine and processes running in the container can be seen as individual processes on the host.</p>
<h1 id="why-use-containers">Why use containers?</h1>
<ul>
<li>Portability - install once, run "anywhere".</li>
<li>Control your environment/software on systems (such as Savio, XSEDE) you don't own.</li>
<li>Manage complex dependencies/installations by using containers for modular computational workflows/pipelines, one workflow per container.</li>
<li>Provide a reproducible environment:
<ul>
<li>for yourself in the future,</li>
<li>for others (e.g., your collaborators),</li>
<li>for software you develop and want to distribute,</li>
<li>for a publication.</li>
</ul></li>
<li>Flexibility in using various OS, software, or application versions:
<ul>
<li>use outdated or updated versions of software or OS,</li>
<li>use an OS you don't have,</li>
<li>test your code on various configurations or OSes.</li>
</ul></li>
<li>High performance compared to VMs.</li>
</ul>
<p>Much of this comes down to the fact that your workflow can depend in a fragile way on one or more pieces of software that may be difficult to install or keep operational.</p>
<h1 id="limitations-of-containers">Limitations of containers</h1>
<ul>
<li>Another level of abstraction/indirection can be confusing
<ul>
<li>'Where' am I?</li>
<li>Where are my files?</li>
</ul></li>
<li>Can run into host-container incompatibilities (e.g., MPI, GPUs)</li>
<li>Limitations in going between CPU architectures (e.g., x86_64 versus ARM)</li>
</ul>
<h1 id="docker-vs.-singularity-1">Docker vs. Singularity (1)</h1>
<p>What is Docker?</p>
<ul>
<li>Open-source computer software that encapsulates an application and all its dependencies into a single image, as a series of layers</li>
<li>Brings containerization to the individuals on their own machines</li>
<li>Rich image repository</li>
<li>Widely used by scientific communities</li>
<li>Security concerns make it unsuitable for the HPC environment</li>
<li>By default you are root in the container</li>
</ul>
<h1 id="docker-vs.-singularity-2">Docker vs. Singularity (2)</h1>
<p>What is Singularity?</p>
<ul>
<li>Open-source computer software that encapsulates an application and all its dependencies into a single image, as a single file</li>
<li>Brings containerization to Linux clusters and HPC</li>
<li>Developed at LBL by Greg Kurtzer</li>
<li>Typically users have a machine on which they have admin privileges and can build images but <em>don't</em> have admin privileges where the containers are run</li>
<li>You are yourself (from the host machine) in the container</li>
</ul>
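<p>A quick way to see the user-model difference between the two: in Docker you are root inside the container by default, while in Singularity you keep your host identity. A minimal sketch (it assumes Docker and Singularity are installed and that an <code>ubuntu_18.04.sif</code> image, as pulled later in this tutorial, is on hand):</p>

```shell
# In Docker you are root inside the container by default:
docker run --rm ubuntu:18.04 whoami        # prints: root

# In Singularity you remain yourself, with your host UID:
singularity exec ubuntu_18.04.sif whoami   # prints your own username
singularity exec ubuntu_18.04.sif id -u    # same numeric UID as on the host
```

<p>This is the key reason Singularity is acceptable on shared HPC systems: a container process cannot do anything on the host filesystem that you couldn't already do as yourself.</p>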
<h1 id="docker-vs.-singularity-3">Docker vs. Singularity (3)</h1>
<p>How can Singularity leverage Docker?</p>
<ul>
<li>Create and run a Singularity container based on a Docker image
<ul>
<li>From DockerHub</li>
<li>By archiving a Docker image (and transferring to Savio)</li>
</ul></li>
<li>Create Singularity images by running Singularity in a Docker container</li>
</ul>
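<p>The "archiving a Docker image" route above can be sketched as follows. The image and user names are placeholders; <code>docker-archive://</code> is the Singularity 3.x bootstrap URI for a local Docker tar archive:</p>

```shell
# On a machine where you have Docker:
docker save mycontainer:latest -o mycontainer.tar   # write the image layers to a tar file
scp mycontainer.tar myuser@dtn.brc.berkeley.edu:.   # transfer the archive to Savio

# On Savio, build a Singularity image directly from the archive
# (no registry involved, and no root needed for this conversion):
singularity build mycontainer.sif docker-archive://mycontainer.tar
```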
<h1 id="examples-of-where-containers-are-used">Examples of where containers are used</h1>
<ul>
<li>Kubernetes runs pods based on Docker images</li>
<li>MyBinder creates an executable environment by building a Docker image</li>
<li>You can have GitHub Actions and Bitbucket Pipelines in a Docker container</li>
<li>CodeOcean capsules are built on Docker images</li>
</ul>
<h1 id="running-pre-existing-containers-using-singularity-pulling">Running pre-existing containers using Singularity: Pulling</h1>
<ul>
<li>No root/sudo privilege is needed</li>
<li>Download or build immutable squashfs images/containers</li>
</ul>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> pull --help</code></pre></div>
<ul>
<li>Pull a container from DockerHub.</li>
</ul>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> pull docker://ubuntu:18.04
<span class="ex">singularity</span> pull docker://rocker/r-base:latest
<span class="ex">singularity</span> pull docker://postgres
<span class="fu">ls</span> -lrt <span class="kw">|</span> <span class="fu">tail</span> -n 10 # careful of your quota!
<span class="fu">ls</span> -l ~/.singularity
<span class="ex">singularity</span> cache list</code></pre></div>
<h1 id="running-pre-existing-containers-using-singularity-running">Running pre-existing containers using Singularity: Running</h1>
<ul>
<li>Now run the container</li>
</ul>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> run ubuntu_18.04.sif # use downloaded image file
<span class="co">## alternatively, use ~/.singularity/cache</span>
<span class="ex">singularity</span> run docker://ubuntu:18.04 </code></pre></div>
<p>Note the change in prompt.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="fu">cat</span> /etc/issue # not the Savio OS!
<span class="fu">which</span> python # not much here!
<span class="bu">pwd</span>
<span class="bu">echo</span> <span class="st">"written from the container"</span> <span class="op">></span> junk.txt
<span class="fu">ls</span> -l /global/home/users/paciorek <span class="kw">|</span> <span class="fu">head</span> -n 5
<span class="bu">exit</span>
<span class="fu">cat</span> /global/home/users/paciorek/junk.txt</code></pre></div>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> run docker://rocker/r-base # easy!
<span class="ex">singularity</span> run docker://postgres # sometimes things are complicated!</code></pre></div>
<h1 id="other-ways-of-running-a-container">Other ways of running a container</h1>
<ul>
<li>Singularity Hub: If no tag is specified, the master branch of the repository is used</li>
</ul>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="ex">singularity</span> pull hello-world.sif shub://singularityhub/hello-world
$ <span class="ex">singularity</span> run hello-world.sif</code></pre></div>
<p>Here's how one runs a Docker container (on a system where you have admin access and Docker installed):</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="bu">echo</span> <span class="va">$HOME</span>
<span class="ex">docker</span> run -it --rm rocker/r-base bash
<span class="bu">pwd</span>
<span class="fu">ls</span> /accounts/gen/vis/paciorek # no automatic mount of my host home directory</code></pre></div>
<h1 id="different-ways-of-using-a-singularity-container">Different ways of using a Singularity container</h1>
<ul>
<li><p><strong>shell</strong> sub-command: invokes an interactive shell within a container</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> shell hello-world.sif</code></pre></div></li>
<li><p><strong>run</strong> sub-command: executes the container’s runscript</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> run hello-world.sif</code></pre></div></li>
<li><p><strong>exec</strong> sub-command: execute an arbitrary command within container</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> exec hello-world.sif cat /etc/os-release</code></pre></div></li>
</ul>
<p>Let's see what we can find out about this image:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> inspect -r hello-world.sif</code></pre></div>
<h1 id="container-processes-on-the-host-system">Container processes on the host system</h1>
<p>Let's see how the container processes show up from the perspective of the host OS.</p>
<p>We'll run some intensive linear algebra in R.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> run docker://rocker/r-base:latest</code></pre></div>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">a <-<span class="st"> </span><span class="kw">matrix</span>(<span class="kw">rnorm</span>(<span class="dv">10000</span><span class="op">^</span><span class="dv">2</span>), <span class="dv">10000</span>)
<span class="kw">system.time</span>(<span class="kw">chol</span>(<span class="kw">crossprod</span>(a)))</code></pre></div>
<p>We see in <code>top</code> that the R process running in the container shows up as an R process on the host.</p>
<h1 id="accessing-the-savio-filesystems-and-bind-paths">Accessing the Savio filesystems and bind paths</h1>
<ul>
<li>Singularity allows mapping directories on host to directories within container via bind paths</li>
<li>This enables easy data access within containers</li>
<li>System-defined (i.e., automatic) bind paths on Savio
<ul>
<li><code>/global/home/users/</code></li>
<li><code>/global/scratch/</code></li>
<li><code>/tmp</code></li>
</ul></li>
<li>User can define own bind paths:
<ul>
<li>mount /host/path/ on the host to /container/path inside the container</li>
<li><code>-B /host/path/:/container/path</code></li>
</ul></li>
</ul>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="fu">ls</span> /global/scratch/paciorek/wikistats_small
<span class="ex">singularity</span> shell -B /global/scratch/paciorek/wikistats_small:/data hello-world.sif
<span class="fu">ls</span> /data
<span class="fu">touch</span> /data/erase-me
<span class="bu">exit</span>
<span class="fu">ls</span> -l /global/scratch/paciorek/wikistats_small</code></pre></div>
<p>In general one would do I/O to files on the host system rather than writing into the container.</p>
<p>It is possible to create writeable containers.</p>
<h1 id="running-containers-on-savio-via-slurm">Running containers on Savio via Slurm</h1>
<p>You can run Singularity within an <code>sbatch</code> or <code>srun</code> session.</p>
<p>Here's a basic job script.</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="co">#!/bin/bash </span>
<span class="co">#SBATCH --job-name=container-test </span>
<span class="co">#SBATCH --partition=savio2 </span>
<span class="co">#SBATCH --account=co_stat </span>
<span class="co">#SBATCH --time=5:00 </span>
<span class="ex">singularity</span> exec hello-world.sif cat /etc/os-release</code></pre></div>
<h1 id="sources-of-container-images-registries">Sources of container images (registries)</h1>
<ul>
<li><a href="https://hub.docker.com">DockerHub</a></li>
<li><a href="https://singularity-hub.org">SingularityHub (future is up in the air)</a></li>
<li><a href="https://cloud.sylabs.io/library">Sylabs container registry</a></li>
</ul>
<p>DockerHub images are named like this: OWNER/CONTAINERNAME:TAG.</p>
<p>Let's see an <a href="https://hub.docker.com/u/continuumio">example of the Continuum images</a>. Here's a specific example with <a href="https://hub.docker.com/r/continuumio/miniconda3/tags">various tags</a>.</p>
<p>(For images provided directly by Docker, you don't specify the OWNER.)</p>
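<p>To make the naming concrete, here is a pull using the full OWNER/CONTAINERNAME:TAG form (the <code>4.9.2</code> tag is just an example; check the tags page for current ones), followed by a purely illustrative shell snippet that splits such a reference into its parts — the parsing is plain shell, not part of Singularity:</p>

```shell
# Pull using the full OWNER/CONTAINERNAME:TAG form:
#   singularity pull docker://continuumio/miniconda3:4.9.2
# For official images provided directly by Docker, there is no OWNER:
#   singularity pull docker://python:3.9-slim

# The parts of a reference, pulled apart with shell parameter expansion:
ref="continuumio/miniconda3:4.9.2"
owner=${ref%%/*}      # text before the first "/"
rest=${ref#*/}
name=${rest%%:*}      # text before the ":"
tag=${rest#*:}        # text after the ":"
echo "owner=$owner name=$name tag=$tag"
# prints: owner=continuumio name=miniconda3 tag=4.9.2
```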
<h1 id="approaches-to-building">Approaches to Building</h1>
<ul>
<li>Build a Docker image and convert
<ul>
<li>Convenient if you already have a Docker build file</li>
</ul></li>
<li>Build from Singularity definition file
<ul>
<li>Bootstrap from another Singularity container, Docker image, or supported base OS</li>
<li>Allows extra customization with directives</li>
</ul></li>
</ul>
<h1 id="building-a-docker-container-demo">Building a Docker Container (Demo)</h1>
<pre class="docker"><code>FROM centos:7
RUN yum install -y epel-release && yum install -y cowsay
ENTRYPOINT ["/usr/bin/cowsay"]</code></pre>
<p>See <code>cowsay-entrypoint</code> and <code>cowsay-cmd</code> in this repository.</p>
<p>Build the container:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">docker</span> build -t ghcr.io/nicolaschan/cowsay-entrypoint:latest -f cowsay-entrypoint .
<span class="ex">docker</span> build -t ghcr.io/nicolaschan/cowsay-cmd:latest -f cowsay-cmd .</code></pre></div>
<h1 id="running-docker-container-demo">Running Docker Container (Demo)</h1>
<p>ENTRYPOINT Docker container:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">docker</span> run ghcr.io/nicolaschan/cowsay-entrypoint hi</code></pre></div>
<p>CMD Docker container (the following do the same thing):</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">docker</span> run ghcr.io/nicolaschan/cowsay-cmd
<span class="ex">docker</span> run ghcr.io/nicolaschan/cowsay-cmd cowsay hi</code></pre></div>
<h1 id="pushing-to-docker-registry">Pushing to Docker Registry</h1>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="co"># use docker login to login according to the registry you are using</span>
<span class="ex">docker</span> push ghcr.io/nicolaschan/cowsay-entrypoint:latest
<span class="ex">docker</span> push ghcr.io/nicolaschan/cowsay-cmd:latest</code></pre></div>
<p>You can use the Docker container registry of your choice or deploy your own registry: <a href="https://docs.docker.com/registry/deploying/" class="uri">https://docs.docker.com/registry/deploying/</a></p>
<h1 id="converting-docker-to-singularity">Converting Docker to Singularity</h1>
<ul>
<li><code>singularity run docker://ghcr.io/nicolaschan/cowsay-entrypoint hi</code></li>
<li><code>singularity build cowsay.simg docker://ghcr.io/nicolaschan/cowsay-entrypoint</code></li>
<li><code>build</code> behaves similarly to <code>pull</code> in this context</li>
</ul>
<p>Reference: <a href="https://github.com/ucb-rit/savio-singularity-template/blob/master/build_examples.md" class="uri">https://github.com/ucb-rit/savio-singularity-template/blob/master/build_examples.md</a></p>
<h1 id="singularity-build-strategies">Singularity Build Strategies</h1>
<ul>
<li>Install Singularity locally (demo today)
<ul>
<li>Requires root access on your system</li>
</ul></li>
<li>Install Docker locally and use singularity-docker
<ul>
<li>https://github.com/singularityhub/singularity-docker</li>
<li>Note: You must specify the version/tag such as <code>:3.7.1</code> for these images</li>
</ul></li>
<li>Build an image on a cloud service or continuous integration host
<ul>
<li>Sylabs Remote Builder: https://cloud.sylabs.io/builder</li>
</ul></li>
</ul>
<h1 id="suggestionspitfalls">Suggestions/Pitfalls</h1>
<ul>
<li>Match the CPU architecture of the host where the image is built to that of Savio (Savio is x86_64)</li>
<li>Savio will try to bind mount the following paths by default (from <code>/etc/singularity/singularity.conf</code>):</li>
</ul>
<pre><code>/etc/passwd
/etc/resolv.conf
/proc
/sys
/dev
/tmp
/etc/localtime
/etc/hosts
/global/scratch
/global/home/users</code></pre>
<h1 id="singularity-definition-file">Singularity Definition File</h1>
<ol style="list-style-type: decimal">
<li><strong>Header</strong>: Base to build the container off of, such as an existing Docker/Singularity/OS image</li>
<li><strong>Sections</strong>: Denoted by a <code>%</code> which are executed once the container is built to configure it</li>
</ol>
<p>Reference: <a href="https://sylabs.io/guides/3.0/user-guide/definition_files.html" class="uri">https://sylabs.io/guides/3.0/user-guide/definition_files.html</a></p>
<h1 id="singularity-build-example">Singularity Build Example</h1>
<ul>
<li>Building simple alpine asciiquarium image: <code>alpine-example.def</code></li>
<li><code>%setup</code>: Executed on host system before container is built</li>
<li><code>%environment</code>: Set environment variables in the container</li>
<li><code>%post</code>: Executed within the container at build time</li>
<li><code>%runscript</code>: Executed with <code>singularity run alpine-example.simg</code> or <code>./alpine-example.simg</code></li>
</ul>
<p>Reference: <a href="https://github.com/ucb-rit/savio-singularity-template" class="uri">https://github.com/ucb-rit/savio-singularity-template</a></p>
<h1 id="singularity-definition-file-example">Singularity Definition File Example</h1>
<pre><code>Bootstrap: docker
From: alpine:latest
%setup
# Executed on host system before container is built
echo "Hello from setup"
%environment
export MY_VAR=my_var_value
%post
# Executed within the container at build time
echo "Post starting"
apk add asciiquarium
mkdir -p /app
echo 'echo $MY_VAR' >> /app/hello.sh
echo "Post finished"
%runscript
# Executed with `singularity run alpine-example.simg` or `./alpine-example.simg`
asciiquarium</code></pre>
<h1 id="singularity-build-example-demo">Singularity Build Example (Demo)</h1>
<p>On local machine, using files from this repository: <a href="https://github.com/ucb-rit/savio-training-containers-2021" class="uri">https://github.com/ucb-rit/savio-training-containers-2021</a></p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="fu">sudo</span> singularity build alpine-example.simg alpine-example.def
<span class="ex">singularity</span> run alpine-example.simg
<span class="fu">scp</span> alpine-example.simg nicolaschan@dtn.brc.berkeley.edu:.</code></pre></div>
<p>On Savio:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> run alpine-example.simg
<span class="ex">singularity</span> exec alpine-example.simg sh
<span class="bu">echo</span> <span class="va">$MY_VAR</span></code></pre></div>
<p>If using singularity-docker:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">docker</span> run --privileged -t --rm -v <span class="va">$PWD</span>:/app quay.io/singularity/singularity:v3.7.1 \
build /app/alpine-example.simg /app/alpine-example.def</code></pre></div>
<h1 id="rewritablesandbox-singularity-images">Rewritable/Sandbox Singularity Images</h1>
<ul>
<li>Advantages:
<ul>
<li>Easier debugging (software installs, existing images, etc.)</li>
<li>Container can be built on Savio</li>
</ul></li>
<li>Disadvantages:
<ul>
<li>Harder to rebuild reproducibly</li>
<li>Some processes still require root (not just permissions for files)</li>
</ul></li>
</ul>
<h1 id="rewritablesandbox-images-demo">Rewritable/Sandbox Images (Demo)</h1>
<p>On Savio:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="ex">singularity</span> build --sandbox alpine-sandbox docker://alpine
<span class="ex">singularity</span> shell --writable alpine-sandbox
<span class="bu">echo</span> <span class="st">"echo hi"</span> <span class="op">></span> /bin/hi
<span class="fu">chmod</span> +x /bin/hi
<span class="bu">exit</span>
<span class="ex">singularity</span> build alpine-sandbox.sif alpine-sandbox/
<span class="ex">./alpine-sandbox.sif</span>
<span class="ex">hi</span></code></pre></div>
<p>Reference: <a href="https://sylabs.io/guides/3.0/user-guide/build_a_container.html#creating-writable-sandbox-directories" class="uri">https://sylabs.io/guides/3.0/user-guide/build_a_container.html#creating-writable-sandbox-directories</a></p>
<h1 id="pushing-to-singularity-registry">Pushing to Singularity Registry</h1>
<p>As with Docker registries, you can use the Singularity registry of your choice. This can be a convenient way to manage your images and transfer them to/from Savio (though normal file transfers also work). For details, see <a href="https://sylabs.io/guides/3.1/user-guide/cli/singularity_push.html" class="uri">https://sylabs.io/guides/3.1/user-guide/cli/singularity_push.html</a></p>
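<p>A sketch of the push workflow using the Sylabs library as the registry (the user, collection, and container names are placeholders, and <code>singularity remote login</code> expects an access token generated at cloud.sylabs.io):</p>

```shell
# Authenticate to the registry (prompts for a Sylabs access token):
singularity remote login

# Optionally sign the image so that others can verify its provenance:
singularity sign mycontainer.sif

# Push to the library, naming it USER/COLLECTION/CONTAINER:TAG:
singularity push mycontainer.sif library://myuser/default/mycontainer:latest

# Later, e.g. on Savio, pull it back down:
singularity pull library://myuser/default/mycontainer:latest
```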
<h1 id="outline-of-mpi-and-gpu-containers">Outline of MPI and GPU Containers</h1>
<ul>
<li>Build singularity containers from a definition file</li>
<li>Run MPI containers
<ul>
<li>Rely solely on the MPI library within the container: single node only</li>
<li>Rely on the MPI library available on the host, multiple nodes possible</li>
</ul></li>
<li><p>MPI library version compatibility on the host and within containers</p></li>
<li>Build GPU singularity containers from a docker image</li>
<li>Run GPU containers</li>
<li><p>NVIDIA driver and CUDA library version compatibility</p></li>
</ul>
<h1 id="mpi-application">MPI Application</h1>
<ul>
<li>MPI application example: <a href="samples/mpitest.c">mpitest.c</a></li>
</ul>
<pre><code>[wfeinstein@n0000 singularity-test]$ salloc -p lr6 -A account_xxx -N 2 -q lr_normal -t 2:0:0
salloc: Pending job allocation 30456801
salloc: job 30456801 queued and waiting for resources
...
salloc: Nodes n0098.lr6,n0099.lr6 are ready for job
[wfeinstein@n0000 singularity-test]$ echo $SLURM_NODELIST | tr ',' '\n' | tee host
n0098.lr6
n0099.lr6
[wfeinstein@n0000 singularity-test]$ mpirun -np 4 --hostfile host -npernode 2 mpitest
Hello, I am on n0098.lr6 rank 0/4
Hello, I am on n0099.lr6 rank 2/4
Hello, I am on n0098.lr6 rank 1/4
Hello, I am on n0099.lr6 rank 3/4
</code></pre>
<h1 id="build-mpi-singularity-containers">Build MPI singularity containers</h1>
<ul>
<li><p>Definition file<br />
<a href="samples/SINGULARITY-mpi3.1.0.def">SINGULARITY-mpi3.1.0.def</a></p></li>
<li><p>Build MPI container locally</p>
<pre><code>sudo singularity build mpi3.1.0.sif SINGULARITY-mpi3.1.0.def</code></pre></li>
<li>Transfer mpi3.1.0.sif to your preferred cluster</li>
<li><p>Check out the container</p></li>
</ul>
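<p>The linked definition file follows this general shape. This is a hedged sketch: the base image, package list, and the <code>/opt/ompi</code> install prefix are assumptions modeled on the container contents shown in this section, not the exact file.</p>

```
Bootstrap: docker
From: ubuntu:18.04

%files
    # Copied into the image before %post runs
    mpitest.c /opt/mpitest.c

%post
    apt-get update && apt-get install -y wget gcc g++ make file
    # Build Open MPI 3.1.0 into /opt/ompi
    cd /tmp
    wget https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.0.tar.gz
    tar xzf openmpi-3.1.0.tar.gz && cd openmpi-3.1.0
    ./configure --prefix=/opt/ompi && make -j4 install
    # Compile the test program against the freshly built MPI
    /opt/ompi/bin/mpicc -o /opt/mpitest /opt/mpitest.c

%environment
    export PATH=/opt/ompi/bin:$PATH
    export LD_LIBRARY_PATH=/opt/ompi/lib:$LD_LIBRARY_PATH
```

<p>The key design point is that the MPI version pinned here (3.1.0) must be compatible with the MPI module you later load on the host, as the compatibility examples below show.</p>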
<pre><code>[wfeinstein@n0000 singularity-test]$ singularity shell mpi3.1.0.sif
Singularity mpi3.1.0.sif:/global/scratch/wfeinstein/singularity-test> ls /opt/
mpitest mpitest.c ompi
Singularity mpi3.1.0.sif:/global/scratch/wfeinstein/singularity-test> /opt/ompi/bin/mpirun --version
mpirun (Open MPI) 3.1.0
Singularity mpi3.1.0.sif:/global/scratch/wfeinstein/singularity-test> /opt/ompi/bin/mpirun -np 2 /opt/mpitest
Hello, I am on n0000.scs00 rank 0/2
Hello, I am on n0000.scs00 rank 1/2</code></pre>
<h1 id="run-mpi-containers">Run MPI containers</h1>
<ul>
<li>Approach 1
<ul>
<li>Launch MPI tasks within the container, with no dependency on the host</li>
<li>However, this cannot scale beyond a single node</li>
</ul></li>
</ul>
<pre><code>[wfeinstein@n0000 singularity-test]$ singularity exec mpi3.1.0.sif /opt/ompi/bin/mpirun -np 2 /opt/mpitest
Hello, I am on n0000.scs00 rank 0/2
Hello, I am on n0000.scs00 rank 1/2
[wfeinstein@n0000 singularity-test]$ module list
No Modulefiles Currently Loaded.</code></pre>
<ul>
<li>Approach 2
<ul>
<li>Launch MPI processes from the host</li>
<li>Rely on the MPI implementation provided on the host</li>
<li>Can expand to multiple nodes</li>
</ul></li>
</ul>
<pre><code>[wfeinstein@n0000 singularity-test]$ mpirun -np 64 --hostfile host singularity exec mpi3.1.0.sif /opt/mpitest
...
Hello, I am on n0099.lr6 rank 34/64
Hello, I am on n0098.lr6 rank 27/64
Hello, I am on n0099.lr6 rank 41/64
Hello, I am on n0098.lr6 rank 31/64
...
[wfeinstein@n0000 singularity-test]$ module list
Currently Loaded Modulefiles:
1) gcc/9.2.0 2) openmpi-gcc/3.1.0</code></pre>
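<p>In batch mode, Approach 2 is typically wrapped in a job script like the following. This is a hedged sketch: the partition, account, module names, and task counts are placeholders modeled on the interactive session above; adjust them for your cluster.</p>

```shell
#!/bin/bash
#SBATCH --job-name=sing-mpi
#SBATCH --partition=lr6          # hypothetical partition
#SBATCH --account=account_xxx    # placeholder account
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
#SBATCH --time=00:10:00

# The host MPI must be compatible with the MPI inside the container
module load gcc/9.2.0 openmpi-gcc/3.1.0

# mpirun launches one container instance per MPI rank across both nodes
mpirun singularity exec mpi3.1.0.sif /opt/mpitest
```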
<h1 id="mpi-version-compatibility-1">MPI version compatibility (1)</h1>
<ul>
<li>A mismatch between the MPI libraries on the host and within the container breaks the container</li>
<li>Take extra care to ensure MPI library compatibility</li>
</ul>
<pre><code>[wfeinstein@n0000 singularity-test]$ ls *.sif
mpi2.0.4.sif mpi3.0.1.sif mpi3.1.0.sif mpi4.0.1.sif
[wfeinstein@n0000 singularity-test]$ module list
Currently Loaded Modulefiles:
1) gcc/9.2.0 2) openmpi/4.0.1-gcc
[wfeinstein@n0000 singularity-test]$ mpirun -np 4 --hostfile host singularity exec mpi3.0.1.sif /opt/mpitest
Hello, I am on n0098.lr6 rank 2/4
Hello, I am on n0098.lr6 rank 1/4
Hello, I am on n0098.lr6 rank 3/4
Hello, I am on n0098.lr6 rank 0/4
[wfeinstein@n0000 singularity-test]$ mpirun -np 4 --hostfile host singularity exec mpi4.0.1.sif /opt/mpitest
Hello, I am on n0098.lr6 rank 2/4
Hello, I am on n0098.lr6 rank 1/4
Hello, I am on n0098.lr6 rank 3/4
Hello, I am on n0098.lr6 rank 0/4
[wfeinstein@n0000 singularity-test]$ mpirun -np 4 --hostfile host singularity exec mpi2.0.4.sif /opt/mpitest
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
....</code></pre>
<h1 id="mpi-version-compatibility-2">MPI version compatibility (2)</h1>
<pre><code>[wfeinstein@n0000 singularity-test]$ module list
Currently Loaded Modulefiles:
1) gcc/6.3.0 2) openmpi/3.0.1-gcc
[wfeinstein@n0000 singularity-test]$ mpirun -np 4 --hostfile host singularity exec mpi3.0.1.sif /opt/mpitest
Hello, I am on n0098.lr6 rank 2/4
Hello, I am on n0098.lr6 rank 3/4
Hello, I am on n0098.lr6 rank 0/4
Hello, I am on n0098.lr6 rank 1/4
[wfeinstein@n0000 singularity-test]$ mpirun -np 4 --hostfile host singularity exec mpi4.0.1.sif /opt/mpitest
[n0098.lr6:115220] PMIX ERROR: OUT-OF-RESOURCE in file client/pmix_client.c at line 225
[n0098.lr6:115220] OPAL ERROR: Error in file pmix3x_client.c at line 112
...</code></pre>
<h1 id="gpu-containers">GPU containers</h1>
<ul>
<li>Singularity supports NVIDIA’s CUDA GPU compute framework or AMD’s ROCm solution</li>
<li>Userspace NVIDIA driver components from the host are dynamically mounted in the container at runtime
<ul>
<li>--nv enables NVIDIA GPU support by binding the host's driver libraries into the container</li>
</ul></li>
<li>NVIDIA driver not present in the container image itself</li>
<li>An application built against a CUDA toolkit has a minimum host NVIDIA driver requirement
<ul>
<li>e.g., CUDA/11.2 requires NVIDIA driver >= R450</li>
</ul></li>
<li>For the minimum driver requirement of a specific CUDA runtime/toolkit version, <a href="https://docs.nvidia.com/deploy/cuda-compatibility/index.html">see NVIDIA's CUDA compatibility documentation</a></li>
</ul>
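<p>The driver check can be sketched as a small shell comparison. This is a minimal sketch under stated assumptions: the driver and minimum versions here are illustrative; on a real GPU node you would query the driver with <code>nvidia-smi --query-gpu=driver_version --format=csv,noheader</code>.</p>

```shell
# Illustrative values: a 440-series driver vs. a CUDA 11.x runtime,
# which needs driver >= R450.
host_driver="440.44"
required=450

# Compare the driver's major version against the required release branch.
if [ "${host_driver%%.*}" -ge "$required" ]; then
    echo "driver OK for this CUDA runtime"
else
    echo "driver too old: need >= R${required}, have ${host_driver}"
fi
```

<p>With these example values the check reports the driver as too old, which matches the failure mode you would see running a CUDA 11.2 container on a 440-series host driver.</p>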
<h1 id="gpu-container-examples">GPU container examples</h1>
<ul>
<li><p>Build PyTorch GPU containers from the <a href="https://ngc.nvidia.com/catalog/containers/nvidia:pytorch">NGC registry</a></p>
<pre><code>docker pull nvcr.io/nvidia/pytorch:21.03-py3
singularity build pytorch_21.03_py3.sif docker-daemon://nvcr.io/nvidia/pytorch:21.03-py3</code></pre></li>
<li><p>Run PyTorch on a GPU node</p></li>
</ul>
<pre><code>[wfeinstein@n0043 pytorch]$ nvidia-smi -L
GPU 0: Tesla V100-SXM2-32GB (UUID: GPU-df6fb04c-b0a4-69cc-98a2-763783d4e152)
GPU 1: Tesla V100-SXM2-32GB (UUID: GPU-3c41dc39-df54-1582-4d27-6c8454ed96c6)
[wfeinstein@n0043 singularity-test]$ nvidia-smi
Mon Apr 19 00:32:43 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44 Driver Version: 440.44 CUDA Version: 10.2 |
...
[wfeinstein@n0043 pytorch]$ cat pytorch_test.py
import torch;
print(torch.__version__);
print(torch.cuda.get_device_name(0));
print(torch.version.cuda)
[wfeinstein@n0043 pytorch]$ singularity exec --nv pytorch_21.03_py3.sif python pytorch_test.py
1.9.0a0+df837d0
Tesla V100-SXM2-32GB
11.2
[wfeinstein@n0043 pytorch]$ singularity exec --nv pytorch_19_12_py3.sif python pytorch_test.py
1.4.0a0+a5b4d78
Tesla V100-SXM2-32GB
10.2</code></pre>
<h1 id="high-level-view-containerization-of-scientific-workflows">High-level View: Containerization of Scientific Workflows</h1>
<p>Benefits of Singularity</p>
<ul>
<li>Single image file = easily shareable</li>
<li>Close to the hardware <strong>(runs directly on the host kernel)</strong></li>
<li>Bring Your Own Software (BYOS): <strong>install whatever you want inside</strong></li>
<li>Simultaneous use of multiple HPC clusters</li>
<li>Get more citations for your software!</li>
</ul>
<h1 id="non-resource-intensive">Non-resource intensive</h1>
<p><strong>Especially compared to VMs</strong></p>
<p>Unlike VMs, which replicate an entire OS and all its dependencies, containers run as lightweight processes on the host kernel.</p>
<p>Singularity was developed to run "close to the hardware".</p>
<p>E.g., Singularity containers run directly on the host kernel and thus suffer minimal performance loss.</p>
<h1 id="shareable">Shareable</h1>
<p>Further, distributing an image to multiple systems is only a small step beyond running it on one.</p>
<p>Nearly all NSF-funded XSEDE clusters have Singularity installed. Deploying your encapsulated workflow from a single cluster, <strong>e.g. Savio</strong>, to several clusters simultaneously only requires transferring a single <em>FILE.sif</em> from your machine or the Savio cluster filesystem to the various XSEDE clusters.</p>
<p>FILE.sif files can be transferred with either CLI, <em>e.g. scp</em>, or GUI, <em>e.g. Globus</em>, interfaces.</p>
<p><em>Copy .tar to one of several XSEDE clusters</em></p>
<p>The following example demonstrates using the CLI <em>(e.g. a terminal)</em> to copy a .tar from a cluster or personal machine to XSEDE's Bridges cluster.</p>
<pre><code> $ scp MYFILE.tar USERNAME@bridges.psc.edu:
$ ssh USERNAME@bridges.psc.edu
$ singularity build myimage.sif docker-archive://MYFILE.tar</code></pre>
<h1 id="portability">Portability</h1>
<p><em>Scientific and HPC use-case</em> = researchers need to run jobs across whatever resources they can get to obtain results.</p>
<p>Singularity images can run on any system with the same architecture as the one the image was built on.</p>
<p>E.g., an image built on an x86-64 machine can run on any other x86-64 system.</p>
<h1 id="reproducible">Reproducible</h1>
<p>Singularity images encapsulate your code, software dependencies, data, documentation, licenses, etc.</p>
<p>A Singularity image can have a <em>DOI</em> and thus be cited in publications.</p>
<p>This approach can dramatically increase your citation count!</p>
<p>Interest in citing and publishing software is growing rapidly.</p>
<h1 id="other-container-resources">Other container resources</h1>
<ul>
<li><a href="https://education.sdsc.edu/training/interactive/202101_intro_to_singularity/">San Diego supercomputing center training (big picture presentation)</a></li>
<li><a href="https://github.com/XSEDE/Container_Tutorial">XSEDE tutorial</a></li>
<li><a href="https://carpentries-incubator.github.io/docker-introduction/index.html">Software Carpentries Docker training</a></li>
<li><a href="https://carpentries-incubator.github.io/singularity-introduction/">Software Carpentries Singularity training</a></li>
</ul>
</body>
</html>