inrae/jupyterhub-vm

Purpose

Automated build of a virtual machine (VM) from an Ubuntu server install image (ISO file):

  • based on Ubuntu 22.04 LTS
  • containing R version 4.5 with useful packages and The Littlest JupyterHub (TLJH)
  • built with the help of Packer, Vagrant and Ansible
  • for use with VirtualBox or OpenStack.

For more details on the whole process, see https://inrae.github.io/jupyterhub-vm/

Creation and configuration of a virtual machine

VirtualBox, Packer and Vagrant must be installed beforehand.

  • VirtualBox: this is what we call the provider. If the goal is to use the VM on your desktop computer, the VM will run in VirtualBox. If the goal is to use the VM in the cloud (OpenStack, for example), VirtualBox only serves as an intermediary to build the VM.

  • Packer: creates a virtual machine from an ISO, with very precise control over its characteristics. Here it builds a VM compatible with Vagrant, called a box.

  • Vagrant: builds virtual machines from basic building blocks called boxes, runs them on Providers, and configures them through Provisioners such as Ansible.

  • Ansible: a powerful tool that describes tasks in Playbooks, turning tough tasks into repeatable ones. Ansible does not need to be installed beforehand: it is installed temporarily on the virtual machine to carry out the provisioning and removed at the end of the VM creation.
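Before starting, it can help to confirm that the three required tools are reachable from your shell. The following is a minimal sketch, not part of this repository: the function name missing_tools is hypothetical, and it assumes the VirtualBox command-line tool VBoxManage is on your PATH (on Windows it may have to be added manually).

```shell
#!/bin/sh
# Print the name of every command given as argument that is not found on PATH.
# (Hypothetical helper, not shipped with this repository.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call (assumes VBoxManage is the VirtualBox CLI name on your system):
# missing_tools VBoxManage packer vagrant
```

An empty output means all the listed commands were found.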

The entire process is summarized below in diagram form:

Overview


  • Configuration : The set of configuration files and scripts included in this repository, used for the automatic generation of the virtual machine.
  • Creation : The set of applications to install on your local machine (see above).
  • Storage and Instantiation : External infrastructure dedicated to hosting images and/or instances.
  • Input: an ISO file corresponding to the chosen operating system, downloaded from the Internet
  • Output: an instance of the operational virtual machine on an OpenStack Cloud (e.g. Genouest OpenStack cloud).

  • Before proceeding with each of the steps described below, you must first retrieve all configuration files and scripts.
git clone https://github.com/inrae/jupyterhub-vm.git

Implementation : The workflow described below was carried out with Packer v1.15.0, Vagrant 2.4.9 and VirtualBox 7.2.6. The test environment was Windows 11 25H2 64-bit with Cygwin 3.4.10. However, using a recent version of Ubuntu would be highly advantageous.


1 - Get the ISO file

ISO : https://releases.ubuntu.com/22.04/ubuntu-22.04.5-live-server-amd64.iso
CHECKSUM : sha256:9bc6028870aef3f74f4e16b900008179e78b130e6b0b9a140635434a46aa98b0
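If you download the ISO by hand, you may want to compare it against the checksum above before building. The sketch below is only an illustration: the function name verify_sha256 is hypothetical, it assumes the sha256sum command from GNU coreutils is available (it is under Cygwin and Ubuntu), and it makes no assumption about where box-config.json expects the ISO to be.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 if the SHA-256 digest of FILE equals EXPECTED_HEX, non-zero otherwise.
# (Hypothetical helper, not shipped with this repository.)
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Example call, using the ISO name and checksum given above:
# curl -L -O https://releases.ubuntu.com/22.04/ubuntu-22.04.5-live-server-amd64.iso
# verify_sha256 ubuntu-22.04.5-live-server-amd64.iso \
#     9bc6028870aef3f74f4e16b900008179e78b130e6b0b9a140635434a46aa98b0 && echo "checksum OK"
```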

2 - Create the Base Box

cd  ./jupyterhub-vm
time packer build box-config.json | tee ./logs/packer.log
  • The base box should now be located in the ./builds directory and be named virtualbox-ubuntu2204.box.

  • You can now delete the ISO file as it will no longer be needed in the following steps.


3 - Store Base Box in Vagrant Cloud

  • To be able to use this base box in several projects, the best option is to store it in the Vagrant Cloud. To do this, use the web interface. Before uploading your base box you must complete the following tasks (if not yet done): 1) create an account, 2) create a project, 3) create a registry within the project. Then 4) create a base box. See https://developer.hashicorp.com/vagrant/vagrant-cloud/boxes/create.

  • Here we have created the base box referenced as djreg/small-ubuntu2204


4 - Create Final VM

  • The tested version is Vagrant 2.4.9

  • Based on :

    • Base Box : the base box stored in the Vagrant Cloud (see previous step)
    • Vagrantfile : describes the type of the machine and how to configure and provision it.
    • ansible : configures the installation of the VM and the packages, modules, etc.
  • You must first install the plugin corresponding to the provider (VirtualBox), if not yet done
vagrant plugin install virtualbox
  • You also have to create a new VirtualBox Host-Only Ethernet Adapter

  • You can now build the final VM
time vagrant up | tee logs/vagrant.log
  • At this stage, the final VM is running on the provider (VirtualBox) and can be used. You can connect to it with the ssh command (login=vagrant, password=vagrant):
ssh -p 2222 vagrant@127.0.0.1
  • You can also access the JupyterHub web interface at http://192.168.99.1/ (or another IP address depending on the one specified in the Vagrantfile and ansible/vars/all.yml files).

  • Note 1 : The vagrant password can be changed in the http/user-data file with the help of the mkpasswd command. See more details.

  • Note 2 : If you wish, you can add one or more SSH keys to the scripts/ssh_keys file, which will then be associated with the root account. This will allow you to log in directly as root. This is very practical in development but should be avoided in production, given that the vagrant account already has full rights through the sudo mechanism.

  • Note 3 : A shell script (/usr/local/bin/install_R_pkgs) has been created to install a set of R packages from various sources (CRAN, bioconductor, github, ...). This script can be edited either before building the VM or afterward within the VM itself. However, in both cases, it must be executed from within the VM. This allows for a more generic and smaller VM, and enables the creation of multiple instances from the same image for different uses, i.e., for different application domains. Once connected to the virtual machine, you can run the following command to install all R packages :

    time sudo install_R_pkgs | tee /var/log/install_R_pkgs.log
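To check from the host that the JupyterHub web interface mentioned above is answering, a small polling loop around curl can be used. This is a sketch under assumptions: the function name wait_for_url and its default values are hypothetical, curl is assumed to be installed, and the URL is the example address given above (yours depends on the Vagrantfile and ansible/vars/all.yml).

```shell
#!/bin/sh
# wait_for_url URL [TRIES] [DELAY_SECONDS]
# Polls URL with curl until it answers successfully or the tries are exhausted.
# (Hypothetical helper, not shipped with this repository.)
wait_for_url() {
    url=$1
    tries=${2:-30}
    delay=${3:-10}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -s: silent, -f: fail on HTTP errors, -L: follow redirects,
        # -o /dev/null: discard the body, --max-time: per-request timeout.
        if curl -sfL -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example call with the address used above:
# wait_for_url http://192.168.99.1/ && echo "JupyterHub is up"
```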
    

5 - Export Final VM

  • Export the final VM as a TAR archive (tar.gz format). It will include the VMDK disk file of the VM (ubuntu2204-disk001.vmdk)
time vagrant package --output ./builds/ubuntu2204-box.tar.gz | tee -a ./logs/vagrant.log
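The next step needs the VMDK file contained in this archive. A minimal sketch of the extraction follows; the function name extract_vmdk is hypothetical, and it assumes the archive is a gzip-compressed TAR whose member names contain no whitespace.

```shell
#!/bin/sh
# extract_vmdk ARCHIVE DESTDIR
# Extracts every *.vmdk member of a gzip-compressed TAR archive into DESTDIR.
# (Hypothetical helper, not shipped with this repository.)
extract_vmdk() {
    archive=$1
    dest=$2
    # List the archive members and keep those ending in .vmdk;
    # fail if the archive cannot be read or contains no such member.
    members=$(tar -tzf "$archive" | grep '\.vmdk$') || return 1
    # Word splitting is intended here: one argument per member name
    # (assumes member names contain no whitespace).
    tar -xzf "$archive" -C "$dest" $members
}

# Example call with the paths used in this README:
# extract_vmdk ./builds/ubuntu2204-box.tar.gz ./builds
```

Listing the members first avoids relying on wildcard options that differ between tar implementations.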


6 - Upload Final VM on an OpenStack cloud

  • First you must extract the VMDK file of the virtual machine (ubuntu2204-disk001.vmdk) from the TAR archive. Put it under the same directory (i.e. ./builds)

  • Upload the final VM to an OpenStack cloud, based on :

  • Note : Depending on your network connection, this may take a long time (from 2 min. up to 30 min.).

time sh ./openstack/push_cloud.sh -c genostack | tee ./logs/genostack.log
  • You will be asked for a password
Please enter your OpenStack password, then [shift][Enter] :
  • Note 1 : Once the VM image has been placed in the cloud space and an instance created, you will need to edit the /usr/local/bin/get-hostname file to indicate either the full name of the instance or the IP address depending on what is needed to access it on the Internet. By default, the local IP address is provided. However, this may not work if the VM is behind a proxy.

  • Note 2 : You can go further and automate the creation of a functional instance on the cloud. See more details


7 - Clean up your local disk

  • Stop the VM if not yet done
vagrant halt -f default
  • Remove the final VM from the provider (VirtualBox)
vagrant destroy -f default
  • Remove the current virtual environment
rm -rf ./.vagrant
  • Delete the files corresponding to the base box and the final virtual machine (under ./builds)
rm -f ./builds/*
  • Optionally remove the base box from the local vagrant registry
rm -rf $HOME/.vagrant.d/boxes/djreg-*

Acknowledgements

We would like to thank the IFB GenOuest bioinformatics platform for providing storage and computing resources on its national life science Cloud.


Funded by:

  • 2020 : GAEV project via a SAPI INRAE call for projects - See INRAE Forge
  • 2026 : INRAE UR BIA-BIBS, Biopolymères Interactions Assemblage, plate-forme BIBS

License

Copyright (C) 2026 Daniel Jacob - INRAE

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.
