Upon login to the server, you should see the following launcher environment, including a notebook (file) browser and the ability to create a notebook, launch a Julia console (REPL), or open a regular terminal.
Follow the instructions from the Julia Download page to install Julia v1.10 (which uses the Juliaup Julia installer under the hood).
Note that the REPL's shell mode (entered by typing ;) natively works on Unix-based systems. On Windows, you can access the Windows shell by typing Powershell within the shell mode, and exit it by typing exit, as described here.
Ensure you have a text editor with syntax highlighting support for Julia. We recommend using VS Code, see below. However, other editors are available too, such as Sublime, Emacs, Vim, Helix, etc.
From within the terminal, type julia to make sure that the Julia REPL (i.e. the Julia command prompt) starts. Then you should be able to compute 1+1 and verify you get the expected result. Exit with Ctrl-d.
If you'd prefer a more IDE-like environment, check out VS Code. Follow the installation directions for the Julia VS Code extension.
VS Code's Remote-SSH extension allows you to connect and open a remote folder on any remote machine with a running SSH server. Once connected to a server, you can interact with files and folders anywhere on the remote filesystem (more).
To get started, follow the install steps.
Then, you can connect to a remote host, using ssh user@hostname
and your password (selecting Remote-SSH: Connect to Host...
from the Command Palette).
Advanced options permit you to access a remote compute node from within VS Code.
Within VS Code, you can start a Julia REPL by selecting Julia: Start REPL from the Command Palette. Displaying a plot from a Julia instance launched from the remote terminal (which allows, e.g., including custom options such as ENV variables or loading modules) will fail. To work around this limitation, select Julia: Connect external REPL from the Command Palette and follow the prompted instructions.
Now that you have a running Julia install, launch Julia (e.g. by typing julia in the shell, since it should be on the path)
julia
Welcome to the Julia REPL (command window). There, you have 3 "modes": the standard one,
[user@comp ~]$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.5 (2024-08-27)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia>
the shell mode by hitting ;, where you can enter Unix commands,
shell>
and the Pkg mode (package manager) by hitting ], which will be used to add and manage packages and environments,
(@v1.10) pkg>
You can interactively execute commands in the REPL, like adding two numbers
julia> 2+2
4
julia>
Within this class, we will mainly work with Julia scripts. You can run them using the include()
function in the REPL
julia> include("my_script.jl")
Alternatively, you can also execute a Julia script from the shell
julia -O3 my_script.jl
here passing the -O3
optimisation flag.
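For illustration, a minimal my_script.jl could contain something like the following (hypothetical content):
# my_script.jl -- a minimal placeholder script
x = 2 + 2
println("2 + 2 = $x")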
The Pkg mode permits you to install and manage Julia packages, and control the project's environment.
Environments or Projects are an efficient way to enable portability and reproducibility. Upon activating a local environment, you generate a local Project.toml file that stores the packages and versions you are using within a specific project (codes), and a Manifest.toml file that keeps track locally of the exact state of the environment.
To activate a project-specific environment, navigate to the targeted project folder and launch Julia
mkdir my_cool_project
cd my_cool_project
julia
and activate it
julia> ]
(@v1.10) pkg>
(@v1.10) pkg> activate .
Activating new environment at `~/my_cool_project/Project.toml`
(my_cool_project) pkg>
Then, let's install the Plots.jl
package
(my_cool_project) pkg> add Plots
and check the status
(my_cool_project) pkg> st
Status `~/my_cool_project/Project.toml`
[91a5bcdd] Plots v1.22.3
as well as the .toml
files
julia> ;
shell> ls
Manifest.toml Project.toml
We can now load Plots.jl
and plot some random noise
julia> using Plots
julia> heatmap(rand(10,10))
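If you want to keep the figure for later (e.g. when working on a machine without a display), you can save it to a file; one possibility, assuming the heatmap above is the current figure (the file name is just an example):
julia> savefig("my_heatmap.png")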
Let's assume you hand your my_cool_project over to someone to reproduce your cool random plot. To do so, one can open Julia from the my_cool_project folder with the --project option
cd my_cool_project
julia --project
Alternatively, you can activate it afterwards
cd my_cool_project
julia
and then,
julia> ]
(@v1.10) pkg> activate .
Activating environment at `~/my_cool_project/Project.toml`
(my_cool_project) pkg>
(my_cool_project) pkg> st
Status `~/my_cool_project/Project.toml`
[91a5bcdd] Plots v1.22.3
Here we go, you can now share that folder with colleagues or with yourself on another machine and have a reproducible environment 🙂
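On the receiving end, the recorded environment can be re-created from the Project.toml and Manifest.toml files. A minimal sketch of what one would type from within the shared my_cool_project folder:
julia> ]
(@v1.10) pkg> activate .
(my_cool_project) pkg> instantiate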
On the CPU, multi-threading is made accessible via Base.Threads. To make use of threads, Julia needs to be launched with
julia --project -t auto
which will launch Julia with as many threads as there are cores on your machine (including hyper-threaded cores). Alternatively, set the environment variable JULIA_NUM_THREADS
, e.g. export JULIA_NUM_THREADS=2
to enable 2 threads.
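As a quick sanity check, the following sketch (assuming Julia was started with more than one thread) reports the thread count and fills a small array in parallel:
using Base.Threads

println("Julia is running with $(nthreads()) threads")

# each entry records the id of the thread that wrote it
a = zeros(Int, 8)
@threads for i in eachindex(a)
    a[i] = threadid()
end
println(a)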
The CUDA.jl module permits launching compute kernels on Nvidia GPUs natively from within Julia. JuliaGPU provides further reading and introductory material about the GPU ecosystem in Julia.
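As a minimal illustration of launching a kernel (a sketch assuming CUDA.jl is installed and an Nvidia GPU is available; the kernel name is made up for this example):
using CUDA

# element-wise addition kernel: one GPU thread per array element
function add_kernel!(c, a, b)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(c)
        @inbounds c[i] = a[i] + b[i]
    end
    return nothing
end

n = 1024
a = CUDA.rand(n); b = CUDA.rand(n); c = CUDA.zeros(n)
@cuda threads=256 blocks=cld(n, 256) add_kernel!(c, a, b)
@assert Array(c) ≈ Array(a) .+ Array(b)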
The following steps permit you to install MPI.jl on your machine and test it:
If MPI.jl is a dependency of a Julia project, it should have been added upon executing the instantiate command from within the package manager (see here). If not, MPI.jl can be added from within the package manager (typing add MPI in package mode).
Install mpiexecjl:
julia> using MPI
julia> MPI.install_mpiexecjl()
[ Info: Installing `mpiexecjl` to `HOME/.julia/bin`...
[ Info: Done!
Then, one should add ~/.julia/bin to the PATH in order to be able to launch the Julia MPI wrapper mpiexecjl.
Running a Julia MPI code <my_script.jl>
on np
MPI processes:
$ mpiexecjl -n np julia --project <my_script.jl>
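For reference, a minimal MPI "hello world" script, in the spirit of the l8_hello_mpi.jl used below (shown here only as a sketch; the actual course script may differ slightly):
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
# each MPI process (rank) prints its id and the communicator size
println("Hello world, I am $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm))")
MPI.Finalize()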
To test the Julia MPI installation, launch the l8_hello_mpi.jl
using the Julia MPI wrapper mpiexecjl
(located in ~/.julia/bin
) on, e.g., 4 processes:
$ mpiexecjl -n 4 julia --project ./l8_hello_mpi.jl
$ Hello world, I am 0 of 3
$ Hello world, I am 1 of 3
$ Hello world, I am 2 of 3
$ Hello world, I am 3 of 3
On macOS, you may encounter this issue. To fix it, define the following ENV variable:
$ export MPICH_INTERFACE_HOSTNAME=localhost
and add -host localhost
to the execution script:
$ mpiexecjl -n 4 -host localhost julia --project ./hello_mpi.jl
For running Julia at scale on Piz Daint, refer to the Julia MPI GPU on Piz Daint section.
GPU computing will be done on Piz Daint at CSCS. The supercomputer Piz Daint is composed of about 5700 compute nodes, each hosting a single Nvidia P100 16GB PCIe graphics card. We have a 6000 node-hour allocation for our course on the system.
The login procedure is as follows. First, a login to the front-end (or login) machine Ela (hereafter referred to as "ela") is needed before one can log into Piz Daint. Login is performed using ssh. We will set up a proxy-jump in order to simplify the procedure and directly access Piz Daint (hereafter referred to as "daint").
Both daint and ela share a home
folder. However, the scratch
folder is only accessible on daint. We can use VS code in combination with the proxy-jump to conveniently edit files on daint's scratch directly. We will use the Julia module to have all Julia-related tools ready.
Make sure to have the Remote-SSH extension installed in VS code (see here for details on how-to).
Please follow the steps listed hereafter to get ready and set-up on daint.
Fetch your personal username and password credentials from Moodle.
Open a terminal (on Windows, use a tool such as PuTTY or OpenSSH) and ssh
to ela and enter the password:
ssh <username>@ela.cscs.ch
Generate an ed25519 keypair as described in the CSCS user website. On your local machine (not ela), run ssh-keygen
leaving the passphrase empty. Then copy your public key to the remote server (ela) using ssh-copy-id
. Alternatively, you can copy the keys manually as described in the CSCS user website.
ssh-keygen -t ed25519
ssh-copy-id -i ~/.ssh/id_ed25519.pub <username>@ela.cscs.ch
Once your key is added to ela, manually connect to daint to authorize your key for the first time, making sure you are logged in on ela. Execute:
[classXXX@ela2 ~]$ ssh daint
This step shall prompt you to accept the daint server’s SSH key and enter the password you got from Moodle again.
Edit your ssh config file located in ~/.ssh/config
and add the following entries to it, making sure to replace <username> and the key file with the correct names, if needed:
Host daint-xc
HostName daint.cscs.ch
User <username>
IdentityFile ~/.ssh/id_ed25519
ProxyJump <username>@ela.cscs.ch
AddKeysToAgent yes
ForwardAgent yes
Now you should be able to perform a password-less login to daint as follows
ssh daint-xc
Moreover, you will get the Julia-related modules loaded if a RemoteCommand entry is added to the config.
At this stage, you are logged into daint, but still on a login node and not a compute node.
You can reach your home folder upon typing cd $HOME
, and your scratch space upon typing cd $SCRATCH
. Always make sure to run and save files from the scratch folder.
To make things easier, you can create a soft link from your $HOME
pointing to $SCRATCH
as this will also be useful in a JupyterLab setting
ln -s $SCRATCH scratch
Make sure to remove any folders you may find in your scratch, as those are empty leftovers from last year's course.
The Julia setup on Piz Daint is handled by JUHPC. Everything should be ready for use; the only step required is to activate the environment each time before launching Julia. Also, the first time only, juliaup needs to be installed (these steps are explained hereafter).
To access a GPU on Piz Daint, proceed as follows.
Open a terminal (other than from within VS code) and login to daint:
ssh daint-xc
The next step is to secure an allocation using salloc
, a functionality provided by the SLURM scheduler. Use the salloc command to allocate one node (-N1) and one process (-n1) on the GPU partition (-C'gpu') on the project class04 for 1 hour:
salloc -C'gpu' -Aclass04 -N1 -n1 --time=01:00:00
You can check the status of your allocation with squeue -u <username>.
👉 Running remote job instead? Jump right there
Once you have your allocation (salloc
) and the node, you can access the compute node by using the following srun
command:
srun -n1 --pty /bin/bash -l
Then, to "activate" the Julia configuration previously prepared, enter the following (do not miss the first dot .
):
. $SCRATCH/../julia/daint-gpu-nocudaaware/activate
This will activate the artifact-based config for CUDA.jl, which works more smoothly on the rather old Nvidia P100 GPUs. The caveat is that it does not allow for CUDA-aware MPI. There also exists a CUDA-aware daint-gpu configuration one could try out at a later stage, but it may not be totally stable.
Then, only the first time, you need to install Julia using the juliaup
command:
juliaup
This will install the latest Julia release; under the hood, JUHPC calls into juliaup.
Next, go to your scratch space and create a temporary test directory
cd $SCRATCH
mkdir tmp-test
cd tmp-test
touch Project.toml
You should then be able to launch Julia in the tmp-test
project environment
julia --project=.
Within Julia, enter the package mode, check the status, and add any package you'd like to be part of tmp-test. Let's add CUDA and MPI here, as these two packages will be the most used in the course.
julia> ]
(tmp-test) pkg> st
Installing known registries into `/scratch/snx3000/class230/../julia/class230/daint-gpu-nocudaaware/juliaup/depot`
Added `General` registry to /scratch/snx3000/class230/../julia/class230/daint-gpu-nocudaaware/juliaup/depot/registries
Status `/scratch/snx3000/class230/tmp-test/Project.toml` (empty project)
(tmp-test) pkg> add CUDA, MPI
Then load it and query version info
julia> using CUDA
julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.6
NVIDIA driver 470.57.2
#[skipped lines]
Preferences:
- CUDA_Runtime_jll.version: 11.8
- CUDA_Runtime_jll.local: false
1 device:
0: Tesla P100-PCIE-16GB (sm_60, 15.897 GiB / 15.899 GiB available)
Try out your first calculation on the P100 GPU
julia> a = CUDA.ones(3,4);
julia> b = CUDA.rand(3,4);
julia> c = CUDA.zeros(3,4);
julia> c .= a .+ b
If you made it to here, you're all set 🚀
There is no interactive visualisation on daint, so make sure to produce png files or gifs. Also, to avoid plotting failures, make sure to set ENV["GKSwstype"]="nul" in the code. In addition, it may be good practice to define the animation directory explicitly to avoid filling a tmp folder, such as:
ENV["GKSwstype"] = "nul"                      # headless GR backend (no interactive display)
using Plots                                   # provides Animation
if isdir("viz_out") == false mkdir("viz_out") end
loadpath = "./viz_out/"; anim = Animation(loadpath, String[])
println("Animation directory: $(anim.dir)")
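To actually build and save an animation, one would then add frames inside the time loop and write a gif at the end; a sketch, assuming some array C you want to visualise and the anim object from above:
heatmap(C); frame(anim)                  # inside the time loop: capture the current frame
gif(anim, "viz_out/anim.gif", fps=15)    # after the loop: write the animation to disk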
You can use the nvidia-smi
command to monitor GPU usage on a compute node on daint. Just type it in the terminal or in Julia's REPL (in shell mode):
shell> nvidia-smi
Fri Oct 25 22:32:26 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... On | 00000000:02:00.0 Off | 0 |
| N/A 24C P0 25W / 250W | 0MiB / 16280MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
To work interactively on a compute node, access it using srun -n1 --pty /bin/bash -l and activate the environment.
If you do not want to use an interactive session, you can use the sbatch command to launch a job remotely on the machine. Below is an example of a submit.sh you can launch (without needing an allocation) as sbatch submit.sh:
#!/bin/bash -l
#SBATCH --job-name="my_gpu_run"
#SBATCH --output=my_gpu_run.%j.o
#SBATCH --error=my_gpu_run.%j.e
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account class04
# make sure to activate julia env before executing this script.
# DO NOT add this `. $SCRATCH/../julia/daint-gpu-nocudaaware/activate` in here
srun julia --project <my_julia_gpu_script.jl>
Make sure to activate the Julia environment (. $SCRATCH/../julia/daint-gpu-nocudaaware/activate) before executing the sbatch command.
Some tasks and homework are prepared as Jupyter notebooks and can easily be executed within a JupyterLab environment. CSCS offers a convenient JupyterLab access.
If possible, create a soft link from your $HOME
pointing to $SCRATCH
(do this on daint):
ln -s $SCRATCH scratch
Head to https://jupyter.cscs.ch/.
Log in with your username and the password you've set in the Account setup step.
Select Node Type: GPU, Node: 1, and the duration you want, then Launch JupyterLab.
From within JupyterLab, upload the notebook to work on and get started!
Given that daint's scratch
is not mounted on ela, it is unfortunately impossible to transfer files from/to daint using common sftp tools, as they do not support the proxy-jump. Various solutions exist to work around this, including manually handling transfers over the terminal, using a tool which supports proxy-jump, or VS code.
To use VS code as a development tool, make sure to have installed the Remote-SSH
extension as described in the VS Code Remote - SSH setup section. Then, in VS code Remote-SSH settings, make sure the Remote Server Listen On Socket
is set to true
.
The next step should work out of the box: you should be able to select daint from within the Remote Explorer side-pane and get logged into daint. You can now browse your files and change directory to, e.g., your scratch at /scratch/snx3000/<username>/. Just drag and drop files in there to transfer them.
Another way is to use sshfs
which lets you mount the file system on servers with ssh-access (works on Linux, there are MacOS and Windows ports too). After installing sshfs
on your laptop, create an empty directory to serve as mount point (mkdir -p ~/mnt/daint); you should then be able to mount via
sshfs <your username on daint>@daint.cscs.ch:/ ~/mnt/daint -o compression=yes -o reconnect -o idmap=user -o gid=100 -o workaround=rename -o follow_symlinks -o ProxyJump=ela
and unmount via
fusermount -u -z ~/mnt/daint
For convenience it is suggested to also symlink to the home-directory ln -s ~/mnt/daint/users/<your username on daint> ~/mnt/daint_home
. (Note that we mount the root directory /
with sshfs
such that access to /scratch
is possible.)
The following steps should allow you to run distributed-memory parallel applications on multiple GPU nodes on Piz Daint.
Make sure to have the Julia GPU environment loaded
. $SCRATCH/../julia/daint-gpu-nocudaaware/activate
Then, you would need to allocate more than one node, let's say 4 nodes for 2 hours, using salloc
salloc -C'gpu' -Aclass04 -N4 -n4 --time=02:00:00
To launch a Julia (GPU) MPI script on 4 nodes (GPUs) using MPI, you can simply use srun
srun -n4 julia --project <my_mpi_script.jl>
If you do not want to use an interactive session you can use the sbatch
command to launch an MPI job remotely on daint. Example of a sbatch_mpi_daint.sh
you can launch (without need of an allocation) as sbatch sbatch_mpi_daint.sh
:
#!/bin/bash -l
#SBATCH --job-name="diff2D"
#SBATCH --output=diff2D.%j.o
#SBATCH --error=diff2D.%j.e
#SBATCH --time=00:05:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account class04
srun -n4 bash -c 'julia --project <my_julia_mpi_gpu_script.jl>'
Note that with the non-CUDA-aware configuration, the environment variables MPICH_RDMA_ENABLED_CUDA and IGG_CUDAAWARE_MPI should be set to 0.