Software install

Opening and running the Jupyter Julia notebook

Course slides and lecture material

All the course slides are Jupyter notebooks, i.e., browser-based computational notebooks.

Code cells are executed by placing the cursor in the cell and pressing Shift + Enter. For more info see the documentation.

Exercises and homework

The homework assignments of the first two lectures will be Jupyter notebooks. You can import the notebooks from Moodle into your JupyterHub space. You can execute them on the JupyterHub or download them and run them locally if you're already set up.

For submission, you can directly submit the folder containing all notebooks of a lecture from within the JupyterHub/Moodle integration. From the homework task on Moodle, you should be able to launch the notebooks in your JupyterHub. Once the homework is completed, you should be able to see the folders you have worked on from your JupyterHub within the submission steps on Moodle. See Logistics and Homework for details.

Starting from lecture 3, exercise scripts will be mostly standalone regular Julia scripts that have to be uploaded to your private GitHub repo (shared with the teaching staff only). Details in Logistics.

JupyterHub

You can access the JupyterHub from the General section in Moodle by clicking on JupyterHub.

Upon login to the server, you should see the following launcher environment, including a notebook (file) browser, ability to create a notebook, launch a Julia console (REPL), or a regular terminal.

JupyterHub

⚠️ Warning!
It is recommended to download your work as back-up before leaving the session.

Installing Julia v1.9 (or later)

There are two recommended ways to install Julia v1.9:

  1. Using the Juliaup Julia installer (preferred approach).

  2. Downloading the binaries for your platform from the Julia website (following the install directions provided under [help]).

💡 Note
For Windows users: When installing Julia 1.9 on Windows, make sure to check the "Add PATH" tick or ensure Julia is on PATH (see [help]). Julia's REPL has a built-in shell mode, accessed by typing ;, that works natively on Unix-based systems. On Windows, you can access the Windows shell by typing Powershell within the shell mode, and exit it by typing exit, as described here.

Terminal + external editor

Ensure you have a text editor with syntax highlighting support for Julia. Sublime Text and Atom can be recommended.

From within the terminal, type

julia

to make sure that the Julia REPL (aka terminal) starts. Then you should be able to add 1+1 and verify you get the expected result. Exit with Ctrl-d.

Julia from Terminal

VS Code

If you prefer a more IDE-like environment, check out VS Code. Follow the installation directions for the Julia VS Code extension.

VS Code Remote - SSH setup

VS Code's Remote-SSH extension allows you to connect and open a remote folder on any remote machine with a running SSH server. Once connected to a server, you can interact with files and folders anywhere on the remote filesystem (more).

  1. To get started, follow the install steps.

  2. Then, you can connect to a remote host, using ssh user@hostname and your password (selecting Remote-SSH: Connect to Host... from the Command Palette).

  3. Advanced options permit you to access a remote compute node from within VS Code.

💡 Note
This remote configuration supports Julia graphics to render within VS Code's plot pane. However, this "remote" visualisation option is only functional when plotting from a Julia instance launched as Julia: Start REPL from the Command Palette. Displaying a plot from a Julia instance launched from the remote terminal (which allows you, e.g., to include custom options such as ENV variables or to load modules) will fail. To work around this limitation, select Julia: Connect external REPL from the Command Palette and follow the prompted instructions.
⚠️ Warning!
The Remote-SSH setup is limited on Piz Daint because of a security issue: it allows neither direct node execution nor remote command execution, which would be needed to correctly launch the Julia extension and allow for, e.g., graphics redirection (more here).

Running Julia

First steps

Now that you have a running Julia install, launch Julia (e.g. by typing julia in the shell, since it should be on PATH):

julia

Welcome to the Julia REPL (command window). There, you have 3 "modes": the standard Julia mode,

[user@comp ~]$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.3 (2023-08-24)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia>

the shell mode by hitting ;, where you can enter Unix commands,

shell>

and the Pkg mode (package manager) by hitting ], that will be used to add and manage packages, and environments,

(@v1.9) pkg>

You can interactively execute commands in the REPL, like adding two numbers

julia> 2+2
4

julia>

Within this class, we will mainly work with Julia scripts. You can run them using the include() function in the REPL

julia> include("my_script.jl")

Alternatively, you can also execute a Julia script from the shell

julia -O3 my_script.jl

here passing the -O3 optimisation flag.
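As a minimal sketch of what such a standalone script might contain (the file contents, the function name main and the toy computation are hypothetical and not taken from the course material), the work can be wrapped in a function, which is a common Julia idiom because it avoids slow global variables:

```julia
# my_script.jl (hypothetical contents, for illustration only)

# Toy computation: sum of squares of n evenly spaced points in [0, 1].
function main(n=5)
    x = range(0, 1; length=n)
    return sum(x .^ 2)
end

println("result = ", main())
```

Saved as my_script.jl, this should run either with include("my_script.jl") in the REPL or with julia -O3 my_script.jl from the shell.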

Package manager

The Pkg mode permits you to install and manage Julia packages, and control the project's environment.

Environments (or projects) are an efficient way to enable portability and reproducibility. A local environment consists of a Project.toml file that stores the packages and versions you are using within a specific project (codes), and a Manifest.toml file that keeps track locally of the exact state of the environment. Both files are written to the project folder once you add the first package to the activated environment.

To activate a project-specific environment, navigate to your target project folder, launch Julia

mkdir my_cool_project
cd my_cool_project
julia

and activate it

julia> ]

(@v1.9) pkg>

(@v1.9) pkg> activate .
  Activating new environment at `~/my_cool_project/Project.toml`

(my_cool_project) pkg>

Then, let's install the Plots.jl package

(my_cool_project) pkg> add Plots

and check the status

(my_cool_project) pkg> st
      Status `~/my_cool_project/Project.toml`
  [91a5bcdd] Plots v1.22.3

as well as the .toml files

julia> ;

shell> ls
Manifest.toml Project.toml

We can now load Plots.jl and plot some random noise

julia> using Plots

julia> heatmap(rand(10,10))

Let's assume you hand your my_cool_project folder to someone who wants to reproduce your cool random plot. To do so, they can open julia from the my_cool_project folder with the --project option

cd my_cool_project
julia --project

Alternatively, you can activate it afterwards

cd my_cool_project
julia

and then,

julia> ]

(@v1.9) pkg> activate .
  Activating environment at `~/my_cool_project/Project.toml`

(my_cool_project) pkg>

(my_cool_project) pkg> st
      Status `~/my_cool_project/Project.toml`
  [91a5bcdd] Plots v1.22.3

Here we go, you can now share that folder with colleagues or with yourself on another machine and have a reproducible environment 🙂
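The same activation can also be scripted through the functional Pkg API instead of the Pkg mode. The following is only a sketch: a temporary directory stands in for my_cool_project (an assumption made so the snippet runs anywhere), and the Pkg.instantiate() call, which installs the package versions recorded in Manifest.toml on a machine that received the folder, is left commented out because it needs network access.

```julia
using Pkg

# Hypothetical stand-in for the my_cool_project folder (a temporary directory).
project_dir = mktempdir()

# Same effect as `pkg> activate .` when run from inside the folder.
Pkg.activate(project_dir)

# Project file the active environment reads from and writes to.
println("Active project: ", Base.active_project())

# On a machine that received the folder, the following call would download the
# package versions recorded in Manifest.toml (needs network access):
# Pkg.instantiate()
```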

Multi-threading on CPUs

On the CPU, multi-threading is made accessible via Base.Threads. To make use of threads, Julia needs to be launched with

julia --project -t auto

which will launch Julia with as many threads as there are cores on your machine (including hyper-threaded cores). Alternatively, set the environment variable JULIA_NUM_THREADS, e.g. export JULIA_NUM_THREADS=2 to enable 2 threads.
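To check that threads are actually available, a small sketch like the following can be run in the REPL. The array name squares and the loop body are illustrative assumptions; nthreads and @threads come from Base.Threads.

```julia
using Base.Threads

println("Julia is running with ", nthreads(), " thread(s)")

# Fill an array in parallel; each iteration writes to its own element,
# so no synchronisation is needed.
n = 1_000
squares = zeros(Int, n)
@threads for i in 1:n
    squares[i] = i^2
end

println("sum of squares = ", sum(squares))
```

If the first line reports a single thread, Julia was most likely started without -t and without JULIA_NUM_THREADS set.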

Julia on GPUs

The CUDA.jl module permits you to launch compute kernels on Nvidia GPUs natively from within Julia. JuliaGPU provides further reading and introductory material about GPU ecosystems within Julia.

Julia MPI

The following steps permit you to install MPI.jl on your machine and test it:

  1. If Julia MPI is a dependency of a Julia project, MPI.jl should have been added upon executing the instantiate command from within the package manager (see here). If not, MPI.jl can be added from within the package manager (typing add MPI in package mode).

  2. Install mpiexecjl:

julia> using MPI

julia> MPI.install_mpiexecjl()
[ Info: Installing `mpiexecjl` to `HOME/.julia/bin`...
[ Info: Done!

  3. Then, add $HOME/.julia/bin to PATH in order to launch the Julia MPI wrapper mpiexecjl.

  4. Run a Julia MPI code <my_script.jl> on np MPI processes:

$ mpiexecjl -n np julia --project <my_script.jl>

  5. To test the Julia MPI installation, launch the l8_hello_mpi.jl using the Julia MPI wrapper mpiexecjl (located in ~/.julia/bin) on, e.g., 4 processes:

$ mpiexecjl -n 4 julia --project ./l8_hello_mpi.jl
Hello world, I am 0 of 3
Hello world, I am 1 of 3
Hello world, I am 2 of 3
Hello world, I am 3 of 3

💡 Note

On macOS, you may encounter this issue. To fix it, define the following ENV variable:

$ export MPICH_INTERFACE_HOSTNAME=localhost

and add -host localhost to the execution script:

$ mpiexecjl -n 4 -host localhost julia --project ./hello_mpi.jl

For running Julia at scale on Piz Daint, refer to the Julia MPI GPU on Piz Daint section.

GPU computing on Piz Daint

This section covers GPU computing on Piz Daint at CSCS. The supercomputer Piz Daint is composed of about 5700 compute nodes, each hosting a single Nvidia P100 16GB PCIe graphics card. We have a 6000 node-hour allocation for our course on the system.

⚠️ Warning!
Since the course allocation is exceptional, make sure not to open any help tickets directly at CSCS help, but report questions and issues to our helpdesk room on Element. Also, better ask about good practice before launching anything you are unsure about, in order to avoid any disturbance on the machine.

The login procedure is as follows. First, a login to the front-end (or login) machine Ela (hereafter referred to as "ela") is needed before one can log into Piz Daint. Login is performed using ssh. We will set up a proxy-jump in order to simplify the procedure and directly access Piz Daint (hereafter referred to as "daint").

Both daint and ela share a home folder. However, the scratch folder is only accessible on daint. We can use VS Code in combination with the proxy-jump to conveniently edit files on daint's scratch directly. We will use the Julia module to have all Julia-related tools ready.

Make sure to have the Remote-SSH extension installed in VS Code (see here for details on how-to).

Please follow the steps listed hereafter to get ready and set up on daint.

Account setup

  1. Fetch your personal username and password credentials from Moodle.

  2. Open a terminal (on Windows, use a tool such as PuTTY or OpenSSH), ssh to ela and enter the password:

ssh <username>@ela.cscs.ch

💡 Note
👉 For Lecture 6, you can jump directly to the JupyterLab setup.

  3. Generate an ed25519 key pair as described on the CSCS user website. On your local machine (not ela), run ssh-keygen, leaving the passphrase empty. Then copy your public key to the remote server (ela) using ssh-copy-id (the second form below names the key file explicitly). Alternatively, you can copy the keys manually as described on the CSCS user website.

ssh-keygen -t ed25519
ssh-copy-id <username>@ela.cscs.ch
ssh-copy-id -i ~/.ssh/id_ed25519.pub <username>@ela.cscs.ch

  4. Edit your ssh config file located in ~/.ssh/config and add the following entries to it, making sure to replace <username> and the key file with the correct names, if needed:

Host ela
  HostName ela.cscs.ch
  User <username>
  IdentityFile ~/.ssh/id_ed25519

Host daint
  HostName daint.cscs.ch
  User <username>
  IdentityFile ~/.ssh/id_ed25519
  ProxyJump ela
  ForwardAgent yes
  RequestTTY yes

Host nid*
  HostName %h
  User <username>
  IdentityFile ~/.ssh/id_ed25519
  ProxyJump daint
  ForwardAgent yes
  RequestTTY yes

  5. Now you should be able to perform a password-less login to daint as follows:

ssh daint

Moreover, you will get the Julia related modules loaded as we add the RemoteCommand

At this stage, you are logged into daint, but still on a login node and not a compute node.

You can reach your home folder by typing cd $HOME, and your scratch space by typing cd $SCRATCH. Always make sure to run and save files from the scratch folder.

💡 Note

To make things easier, you can create a soft link from your $HOME pointing to $SCRATCH as this will also be useful in a JupyterLab setting

ln -s $SCRATCH scratch

⚠️ Warning!

There is no interactive visualisation on daint. Make sure to produce png or gif files. To prevent plotting from failing, set ENV["GKSwstype"]="nul" in the code. It may also be good practice to define the animation directory explicitly to avoid filling up a tmp folder, such as

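ENV["GKSwstype"] = "nul"               # GR backend without display (headless node)
using Plots                            # provides Animation
isdir("viz_out") || mkdir("viz_out")   # create the output folder if it is missing
loadpath = "./viz_out/"; anim = Animation(loadpath, String[])
println("Animation directory: $(anim.dir)")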

Running Julia interactively on Piz Daint

So now, how do we actually run some GPU Julia code on Piz Daint?

  1. Open a terminal (other than from within VS Code) and log in to daint:

ssh daint

  2. The next step is to secure an allocation using salloc, a functionality provided by the SLURM scheduler. Use the salloc command to allocate one node (N1) and one process (n1) on the GPU partition -C'gpu' on the project class04 for 1 hour:

salloc -C'gpu' -Aclass04 -N1 -n1 --time=01:00:00

💡 Note
You can check the status of the allocation by typing squeue -u <username>.

👉 Running a remote job instead? Jump right there

  3. Make sure to remember the node number returned upon successful allocation, e.g., salloc: Nodes nid02145 are ready for job

  4. Once you have your allocation (salloc) and the node (here nid02145), you can access the compute node by using the following srun command, followed by loading the required modules:

srun -n1 --pty /bin/bash -l
module load daint-gpu Julia/1.9.3-CrayGNU-21.09-cuda

  5. You should then be able to launch Julia

julia

👀 ONLY the first time

  6. Assuming you are on a node and have launched Julia, finalise your install by entering the package manager, querying the status (] st) and adding CUDA@v4.

⚠️ Warning!
Because of some driver discovery compatibility issues, you need to add specifically version 4 of CUDA.jl, by typing add CUDA@v4 in package mode.
(@1.9-daint-gpu) pkg> st
  Installing known registries into `/scratch/snx3000/class230/../julia/class230/daint-gpu`
      Status `/scratch/snx3000/julia/class230/daint-gpu/environments/1.9-daint-gpu/Project.toml` (empty project)

(@1.9-daint-gpu) pkg> add CUDA@v4

  7. Then load it and query version info

julia> using CUDA

julia> CUDA.versioninfo()
CUDA runtime 11.0, local installation
CUDA driver 12.1
NVIDIA driver 470.57.2, originally for CUDA 11.4

  8. Try out your first calculation on the P100 GPU

julia> a = CUDA.ones(3,4);

julia> b = CUDA.rand(3,4);

julia> c = CUDA.zeros(3,4);

julia> c .= a .+ b

If you made it to here, you're all set 🚀

Monitoring GPU usage

You can use the nvidia-smi command to monitor GPU usage on a compute node on daint. Just type it in the terminal or in Julia's REPL (in shell mode):

shell> nvidia-smi
Tue Oct 24 18:42:45 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:02:00.0 Off |                    0 |
| N/A   21C    P0    25W / 250W |      2MiB / 16280MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

💡 Note
You can also use VS Code's integrated terminal to launch Julia on daint. However, you can use neither the Julia extension nor the direct node login, and would have to use srun -n1 --pty /bin/bash -l and load the needed modules, namely module load daint-gpu Julia/1.9.3-CrayGNU-21.09-cuda.

Running a remote job on Piz Daint

If you do not want to use an interactive session, you can use the sbatch command to launch a job remotely on the machine. Below is an example submit.sh, which you can launch (without needing an allocation) with sbatch submit.sh:

#!/bin/bash -l
#SBATCH --job-name="my_gpu_run"
#SBATCH --output=my_gpu_run.%j.o
#SBATCH --error=my_gpu_run.%j.e
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account class04

module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda

srun julia -O3 <my_julia_gpu_script.jl>

JupyterLab access on Piz Daint

Some tasks and homework are prepared as Jupyter notebooks and can easily be executed within a JupyterLab environment. CSCS offers convenient JupyterLab access.

  1. If possible, create a soft link from your $HOME pointing to $SCRATCH (do this on daint):

ln -s $SCRATCH scratch

  2. Head to https://jupyter.cscs.ch/.

  3. Log in with the username and password from the Account setup step.

  4. Select Node Type: GPU, Node: 1 and the duration you want, then Launch JupyterLab.

  5. From within JupyterLab, upload the notebook to work on and get started!

Transferring files on Piz Daint

Given that daint's scratch is not mounted on ela, it is unfortunately impossible to transfer files from/to daint using common sftp tools, as they do not support the proxy-jump. Various solutions exist to work around this, including manually handling transfers over the terminal, using a tool which supports proxy-jump, or VS Code.

To use VS Code as a development tool, make sure to have installed the Remote-SSH extension as described in the VS Code Remote - SSH setup section. Then, in the VS Code Remote-SSH settings, make sure that Remote Server Listen On Socket is set to true.

The next step should work out of the box. You should be able to select daint from within the Remote Explorer side-pane. You should get logged into daint. You now can browse your files, change directory to, e.g., your scratch at /scratch/snx3000/<username>/. Just drag and drop files in there to transfer them.

Another way is to use sshfs, which lets you mount the file system of servers with ssh access (works on Linux; there are macOS and Windows ports too). After installing sshfs on your laptop and creating an empty directory as mount point (mkdir -p ~/mnt/daint), you should be able to mount via

sshfs <your username on daint>@daint.cscs.ch:/ /home/$USER/mnt/daint  -o compression=yes -o reconnect -o idmap=user -o gid=100 -o workaround=rename -o follow_symlinks -o ProxyJump=ela

and unmount via

fusermount -u -z /home/$USER/mnt/daint

For convenience, it is suggested to also symlink to the home directory: ln -s ~/mnt/daint/users/<your username on daint> ~/mnt/daint_home. (Note that we mount the root directory / with sshfs such that access to /scratch is possible.)

Julia MPI GPU on Piz Daint

The following steps should allow you to run distributed-memory parallel applications on multiple GPU nodes on Piz Daint.

  1. Make sure to have the Julia GPU environment loaded

module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda

  2. Then, you would need to allocate more than one node, let's say 4 nodes for 2 hours, using salloc

salloc -C'gpu' -Aclass04 -N4 -n4 --time=02:00:00

  3. To launch a Julia (GPU) MPI script on 4 nodes (GPUs) using MPI, you can simply use srun

srun -n4 julia -O3 <my_mpi_script.jl>

CUDA-aware MPI on Piz Daint

⚠️ Warning!
There is currently an issue on the Daint software stack with CUDA-aware MPI. For now, make sure not to run with CUDA-aware MPI, i.e., keep both MPICH_RDMA_ENABLED_CUDA and IGG_CUDAAWARE_MPI set to 0.

If you want to leverage CUDA-aware MPI, i.e., passing GPU pointers directly through the MPI-based update halo functions, make sure to export the appropriate ENV variables

export MPICH_RDMA_ENABLED_CUDA=1
export IGG_CUDAAWARE_MPI=1
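
Before launching a run, it can help to double check from within Julia which setting the process actually sees. The helper below is a hypothetical sketch (the function name cuda_aware_mpi_requested is not part of any package), and it assumes that an unset variable counts as 0. It only inspects the environment and does not verify that the MPI library itself is CUDA-aware.

```julia
# Return true only if both variables are set to "1" in `env`
# (defaults to the process environment ENV). Unset variables count as "0",
# which is an assumption of this sketch.
function cuda_aware_mpi_requested(env=ENV)
    vars = ("MPICH_RDMA_ENABLED_CUDA", "IGG_CUDAAWARE_MPI")
    return all(get(env, v, "0") == "1" for v in vars)
end

println("CUDA-aware MPI requested: ", cuda_aware_mpi_requested())
```

Given the warning above, this should currently report false on daint.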

In the CUDA-aware MPI case, a more robust launch procedure may be to launch a shell script via srun. You can create, e.g., a runme_mpi_daint.sh script containing:

#!/bin/bash -l

module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda

export MPICH_RDMA_ENABLED_CUDA=1
export IGG_CUDAAWARE_MPI=1

julia -O3 <my_script.jl>

You can then launch it using srun after making it executable (chmod +x runme_mpi_daint.sh):

srun -n4 ./runme_mpi_daint.sh

If you do not want to use an interactive session, you can use the sbatch command to launch a job remotely on daint. Below is an example sbatch_mpi_daint.sh, which you can launch (without needing an allocation) with sbatch sbatch_mpi_daint.sh:

#!/bin/bash -l
#SBATCH --job-name="diff2D"
#SBATCH --output=diff2D.%j.o
#SBATCH --error=diff2D.%j.e
#SBATCH --time=00:05:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account class04

module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda

export MPICH_RDMA_ENABLED_CUDA=1
export IGG_CUDAAWARE_MPI=1

srun -n4 bash -c 'julia -O3 <my_julia_mpi_gpu_script.jl>'

💡 Note
The 2 scripts above can be found in the scripts folder.