Upon logging in to the server, you should see the following launcher environment, which includes a notebook (file) browser and the ability to create a notebook, launch a Julia console (REPL), or open a regular terminal.
There are two recommended ways to install Julia v1.9:
Using the Juliaup Julia installer (preferred approach).
Downloading the binaries for your platform from the Julia website (following the install directions provided under [help]).
The Julia shell mode, entered by typing ; in the REPL, works natively on Unix-based systems. On Windows, you can access the Windows shell by typing powershell within the shell mode, and exit it by typing exit, as described here.
Ensure you have a text editor with syntax highlighting support for Julia. Sublime Text and Atom can be recommended.
From within the terminal, type julia to make sure that the Julia REPL (the interactive Julia prompt) starts. Then you should be able to add 1+1 and verify you get the expected result. Exit the REPL with Ctrl-d.
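For example, typing 1+1 at the prompt should return 2:
julia> 1+1
2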
If you prefer a more IDE-like environment, check out VS Code. Follow the installation directions for the Julia VS Code extension.
VS Code's Remote-SSH extension allows you to connect and open a remote folder on any remote machine with a running SSH server. Once connected to a server, you can interact with files and folders anywhere on the remote filesystem (more).
To get started, follow the install steps.
Then, you can connect to a remote host using ssh user@hostname and your password (selecting Remote-SSH: Connect to Host... from the Command Palette).
Advanced options permit you to access a remote compute node from within VS Code.
To launch a Julia REPL from within VS Code, select Julia: Start REPL from the Command Palette. Displaying a plot from a Julia instance launched from the remote terminal (which allows, e.g., including custom options such as ENV variables or loading modules) will fail. To work around this limitation, select Julia: Connect external REPL from the Command Palette and follow the prompted instructions.
Now that you have a running Julia install, launch Julia (e.g. by typing julia in the shell, since it should be on the path)
julia
Welcome to the Julia REPL (command window). There, you have three "modes": the standard julia prompt,
[user@comp ~]$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.3 (2023-08-24)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia>
the shell mode by hitting ;
, where you can enter Unix commands,
shell>
and the Pkg mode (package manager) by hitting ]
, which you will use to add and manage packages and environments,
(@v1.9) pkg>
You can interactively execute commands in the REPL, like adding two numbers
julia> 2+2
4
julia>
Within this class, we will mainly work with Julia scripts. You can run them using the include()
function in the REPL
julia> include("my_script.jl")
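For reference, a minimal my_script.jl (a hypothetical placeholder, not part of the course material) could look like:
# my_script.jl - a minimal, hypothetical example script
function main()
    x = collect(1:10)
    println("sum of 1:10 = ", sum(x))
    return
end

main()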
Alternatively, you can also execute a Julia script from the shell
julia -O3 my_script.jl
here passing the -O3
optimisation flag.
The Pkg mode permits you to install and manage Julia packages, and control the project's environment.
Environments (or Projects) are an efficient way to enable portability and reproducibility. Upon activating a local environment, you generate a local Project.toml file that stores the packages and versions you are using within a specific project (codes), and a Manifest.toml file that keeps track locally of the exact state of the environment.
To activate a project-specific environment, navigate to your targeted project folder, launch Julia
mkdir my_cool_project
cd my_cool_project
julia
and activate it
julia> ]
(@v1.9) pkg>
(@v1.9) pkg> activate .
Activating new environment at `~/my_cool_project/Project.toml`
(my_cool_project) pkg>
Then, let's install the Plots.jl
package
(my_cool_project) pkg> add Plots
and check the status
(my_cool_project) pkg> st
Status `~/my_cool_project/Project.toml`
[91a5bcdd] Plots v1.22.3
as well as the .toml
files
julia> ;
shell> ls
Manifest.toml Project.toml
We can now load Plots.jl
and plot some random noise
julia> using Plots
julia> heatmap(rand(10,10))
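When no display is available (e.g. on a remote machine), you can save the figure to disk instead; a minimal sketch using the Plots.jl png function (the file name noise.png is arbitrary):
julia> png(heatmap(rand(10,10)), "noise")  # writes noise.png to the current directory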
Let's assume you hand your my_cool_project over to someone so they can reproduce your cool random plot. To do so, they can open Julia from the my_cool_project folder with the --project option
cd my_cool_project
julia --project
Or they can activate it afterwards
cd my_cool_project
julia
and then,
julia> ]
(@v1.9) pkg> activate .
Activating environment at `~/my_cool_project/Project.toml`
(my_cool_project) pkg>
(my_cool_project) pkg> st
Status `~/my_cool_project/Project.toml`
[91a5bcdd] Plots v1.22.3
Here we go, you can now share that folder with colleagues or with yourself on another machine and have a reproducible environment!
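On the receiving end, one would typically instantiate the environment so that the exact package versions recorded in the Manifest.toml get installed; a minimal sketch:
julia> ]
(@v1.9) pkg> activate .
(my_cool_project) pkg> instantiate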
On the CPU, multi-threading is made accessible via Base.Threads. To make use of threads, Julia needs to be launched with
julia --project -t auto
which will launch Julia with as many threads as there are cores on your machine (including hyper-threaded cores). Alternatively, set the environment variable JULIA_NUM_THREADS, e.g. export JULIA_NUM_THREADS=2, to enable 2 threads.
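As a quick sanity check (a minimal sketch, assuming Julia was launched with 2 threads), you can query the thread count and run a simple multi-threaded loop:
julia> using Base.Threads

julia> nthreads()
2

julia> @threads for i in 1:4
           println("iteration $i runs on thread $(threadid())")
       end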
The CUDA.jl package permits launching compute kernels on Nvidia GPUs natively from within Julia. JuliaGPU provides further reading and introductory material about the GPU ecosystem within Julia.
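As an illustration only (a minimal sketch, not part of the course material), a simple custom kernel can be written and launched with CUDA.jl's @cuda macro:
using CUDA

# axpy kernel: y .= a .* x .+ y, one thread per element
function axpy_kernel!(y, a, x)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] = a * x[i] + y[i]
    end
    return nothing
end

x = CUDA.rand(Float32, 1024)
y = CUDA.ones(Float32, 1024)
@cuda threads=256 blocks=cld(length(y), 256) axpy_kernel!(y, 2f0, x)
synchronize()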
The following steps permit you to install MPI.jl on your machine and test it:
If MPI.jl is a dependency of a Julia project, it should have been added upon executing the instantiate command from within the package manager (see here). If not, MPI.jl can be added from within the package manager (typing add MPI in package mode).
Install mpiexecjl
:
julia> using MPI
julia> MPI.install_mpiexecjl()
[ Info: Installing `mpiexecjl` to `HOME/.julia/bin`...
[ Info: Done!
Then, one should add HOME/.julia/bin
to PATH in order to launch the Julia MPI wrapper mpiexecjl
.
Running a Julia MPI code <my_script.jl>
on np
MPI processes:
$ mpiexecjl -n np julia --project <my_script.jl>
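For reference, a minimal MPI script (a sketch only; the actual l8_hello_mpi.jl used below may differ) could look like:
using MPI
MPI.Init()
comm = MPI.COMM_WORLD
print("Hello world, I am $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm))\n")
MPI.Finalize()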
To test the Julia MPI installation, launch the l8_hello_mpi.jl
using the Julia MPI wrapper mpiexecjl
(located in ~/.julia/bin
) on, e.g., 4 processes:
$ mpiexecjl -n 4 julia --project ./l8_hello_mpi.jl
Hello world, I am 0 of 3
Hello world, I am 1 of 3
Hello world, I am 2 of 3
Hello world, I am 3 of 3
On macOS, you may encounter this issue. To fix it, define the following ENV
variable:
$ export MPICH_INTERFACE_HOSTNAME=localhost
and add -host localhost
to the execution script:
$ mpiexecjl -n 4 -host localhost julia --project ./hello_mpi.jl
For running Julia at scale on Piz Daint, refer to the Julia MPI GPU on Piz Daint section.
GPU computing will be performed on Piz Daint at CSCS. The supercomputer Piz Daint is composed of about 5700 compute nodes, each hosting a single Nvidia P100 16GB PCIe graphics card. We have a 6000 node-hour allocation for our course on the system.
The login procedure is as follows. First, a login to the front-end (or login) machine Ela (hereafter referred to as "ela") is needed before one can log into Piz Daint. Login is performed using ssh. We will set up a proxy-jump in order to simplify the procedure and directly access Piz Daint (hereafter referred to as "daint").
Both daint and ela share a home
folder. However, the scratch
folder is only accessible on daint. We can use VS Code in combination with the proxy-jump to conveniently edit files on daint's scratch directly. We will use the Julia module to have all Julia-related tools ready.
Make sure to have the Remote-SSH extension installed in VS Code (see here for details).
Please follow the steps listed hereafter to get ready and set-up on daint.
Fetch your personal username and password credentials from Moodle.
Open a terminal (on Windows, use a tool such as PuTTY or OpenSSH) and ssh
to ela and enter the password:
ssh <username>@ela.cscs.ch
Generate an ed25519
keypair as described in the CSCS user website. On your local machine (not ela), do ssh-keygen
leaving the passphrase empty. Then copy your public key to the remote server (ela) using ssh-copy-id
. Alternatively, you can copy the keys manually as described in the CSCS user website.
ssh-keygen -t ed25519
ssh-copy-id <username>@ela.cscs.ch
ssh-copy-id -i ~/.ssh/id_ed25519.pub <username>@ela.cscs.ch
Edit your ssh config file located in ~/.ssh/config
and add the following entries to it, making sure to replace <username>
and the key file with the correct names, if needed:
Host ela
HostName ela.cscs.ch
User <username>
IdentityFile ~/.ssh/id_ed25519
Host daint
HostName daint.cscs.ch
User <username>
IdentityFile ~/.ssh/id_ed25519
ProxyJump ela
ForwardAgent yes
RequestTTY yes
Host nid*
HostName %h
User <username>
IdentityFile ~/.ssh/id_ed25519
ProxyJump daint
ForwardAgent yes
RequestTTY yes
Now you should be able to perform a password-less login to daint as follows
ssh daint
Moreover, you will get the Julia-related modules loaded automatically if a RemoteCommand entry is added to the ssh config.
At this stage, you are logged into daint, but still on a login node and not a compute node.
You can reach your home folder upon typing cd $HOME
, and your scratch space upon typing cd $SCRATCH
. Always make sure to run and save files from the scratch folder.
To make things easier, you can create a soft link from your $HOME
pointing to $SCRATCH
as this will also be useful in a JupyterLab setting
ln -s $SCRATCH scratch
There is no interactive visualisation on daint, so make sure to produce png files or gifs. Also, to avoid plotting failures, make sure to set ENV["GKSwstype"]="nul" in the code. In addition, it may be good practice to define the animation directory to avoid filling a tmp folder, such as
ENV["GKSwstype"]="nul"
if isdir("viz_out")==false mkdir("viz_out") end
loadpath = "./viz_out/"; anim = Animation(loadpath,String[])
println("Animation directory: $(anim.dir)")
So now, how do we actually run some GPU Julia code on Piz Daint?
Open a terminal (other than from within VS Code) and log in to daint:
ssh daint
The next step is to secure an allocation using salloc, a functionality provided by the SLURM scheduler. Use the salloc command to allocate one node (-N1) and one process (-n1) on the GPU partition (-C'gpu') on the project class04 for 1 hour:
salloc -C'gpu' -Aclass04 -N1 -n1 --time=01:00:00
You can check the status of your allocation by typing squeue -u <username>.
Running a remote job instead? Jump right there.
Make sure to remember the node number returned upon successful allocation, e.g., salloc: Nodes nid02145 are ready for job
Once you have your allocation (salloc
) and the node (here nid02145
), you can access the compute node by using the following srun
command followed by loading the required modules:
srun -n1 --pty /bin/bash -l
module load daint-gpu Julia/1.9.3-CrayGNU-21.09-cuda
You should then be able to launch Julia
julia
Assuming you are on a compute node and have launched Julia, finalise your install: enter the package manager with ], query the status with st, and add CUDA@v4 by typing add CUDA@v4 in package mode.
(@1.9-daint-gpu) pkg> st
Installing known registries into `/scratch/snx3000/class230/../julia/class230/daint-gpu`
Status `/scratch/snx3000/julia/class230/daint-gpu/environments/1.9-daint-gpu/Project.toml` (empty project)
(@1.9-daint-gpu) pkg> add CUDA@v4
Then load it and query version info
julia> using CUDA
julia> CUDA.versioninfo()
CUDA runtime 11.0, local installation
CUDA driver 12.1
NVIDIA driver 470.57.2, originally for CUDA 11.4
Try out your first calculation on the P100 GPU
julia> a = CUDA.ones(3,4);
julia> b = CUDA.rand(3,4);
julia> c = CUDA.zeros(3,4);
julia> c .= a .+ b
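To inspect the result, you can copy it back to host memory (values will differ since b is random):
julia> Array(c)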
If you made it to here, you're all set!
You can use the nvidia-smi
command to monitor GPU usage on a compute node on daint. Just type it in the terminal or within Julia's REPL (in shell mode):
shell> nvidia-smi
Tue Oct 24 18:42:45 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... On | 00000000:02:00.0 Off | 0 |
| N/A 21C P0 25W / 250W | 2MiB / 16280MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
To get back onto the compute node within the same allocation, use again
srun -n1 --pty /bin/bash -l
and load the needed modules, namely module load daint-gpu Julia/1.9.3-CrayGNU-21.09-cuda.
If you do not want to use an interactive session, you can use the sbatch command to launch a job remotely on the machine. Below is an example of a submit.sh script, which you can launch (without the need of an allocation) as sbatch submit.sh:
#!/bin/bash -l
#SBATCH --job-name="my_gpu_run"
#SBATCH --output=my_gpu_run.%j.o
#SBATCH --error=my_gpu_run.%j.e
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account class04
module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda
srun julia -O3 <my_julia_gpu_script.jl>
Some tasks and homework are prepared as Jupyter notebooks and can easily be executed within a JupyterLab environment. CSCS offers convenient JupyterLab access.
If possible, create a soft link from your $HOME
pointing to $SCRATCH
(do this on daint):
ln -s $SCRATCH scratch
Head to https://jupyter.cscs.ch/.
Log in with the username and password you've set in the Account setup step.
Select Node Type: GPU
, Node: 1
and the duration you want, then Launch JupyterLab.
From within JupyterLab, upload the notebook to work on and get started!
Given that daint's scratch
is not mounted on ela, it is unfortunately not possible to transfer files from/to daint using common sftp tools, as they do not support the proxy-jump. Various solutions exist to work around this, including manually handling transfers over the terminal, using a tool that supports proxy-jump, or using VS Code.
To use VS Code as a development tool, make sure to have installed the Remote-SSH
extension as described in the VS Code Remote - SSH setup section. Then, in the VS Code Remote-SSH settings, make sure the Remote Server Listen On Socket
is set to true
.
The next step should work out of the box. You should be able to select daint
from within the Remote Explorer side-pane. You should get logged into daint. You can now browse your files and change directory to, e.g., your scratch at /scratch/snx3000/<username>/
. Just drag and drop files in there to transfer them.
Another way is to use sshfs
which lets you mount the file system on servers with ssh-access (works on Linux, there are MacOS and Windows ports too). After installing sshfs
on your laptop, create an empty directory to serve as the mount point (mkdir -p ~/mnt/daint); you should then be able to mount via
sshfs <your username on daint>@daint.cscs.ch:/ /home/$USER/mnt/daint -o compression=yes -o reconnect -o idmap=user -o gid=100 -o workaround=rename -o follow_symlinks -o ProxyJump=ela
and unmount via
fusermount -u -z /home/$USER/mnt/daint
For convenience it is suggested to also symlink to the home-directory ln -s ~/mnt/daint/users/<your username on daint> ~/mnt/daint_home
. (Note that we mount the root directory /
with sshfs
such that access to /scratch
is possible.)
The following steps should allow you to run distributed-memory parallel applications on multiple GPU nodes on Piz Daint.
Make sure to have the Julia GPU environment loaded
module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda
Then, you would need to allocate more than one node, let's say 4 nodes for 2 hours, using salloc
salloc -C'gpu' -Aclass04 -N4 -n4 --time=02:00:00
To launch a Julia (GPU) MPI script on 4 nodes (GPUs) using MPI, you can simply use srun
srun -n4 julia -O3 <my_mpi_script.jl>
If you do not plan on using CUDA-aware MPI, make sure to have both MPICH_RDMA_ENABLED_CUDA and IGG_CUDAAWARE_MPI set to 0. If you want to leverage CUDA-aware MPI, i.e., passing GPU pointers directly through the MPI-based update-halo functions, then make sure to export the appropriate ENV
variables
export MPICH_RDMA_ENABLED_CUDA=1
export IGG_CUDAAWARE_MPI=1
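With these variables set, GPU arrays can be passed directly to MPI calls; a minimal sketch (illustration only, not the course's update-halo code):
using MPI, CUDA
MPI.Init()
comm = MPI.COMM_WORLD
A = CUDA.rand(Float64, 1024)   # device array
MPI.Allreduce!(A, +, comm)     # with CUDA-aware MPI, the GPU pointer is passed directly
MPI.Finalize()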
In the CUDA-aware MPI case, a more robust launch procedure may be to launch a shell script via srun
. You can create, e.g., a runme_mpi_daint.sh
script containing:
#!/bin/bash -l
module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda
export MPICH_RDMA_ENABLED_CUDA=1
export IGG_CUDAAWARE_MPI=1
julia -O3 <my_script.jl>
You then launch it using srun, upon having made it executable (chmod +x runme_mpi_daint.sh):
srun -n4 ./runme_mpi_daint.sh
If you do not want to use an interactive session, you can use the sbatch command to launch a job remotely on daint. Below is an example of a sbatch_mpi_daint.sh script, which you can launch (without the need of an allocation) as sbatch sbatch_mpi_daint.sh:
#!/bin/bash -l
#SBATCH --job-name="diff2D"
#SBATCH --output=diff2D.%j.o
#SBATCH --error=diff2D.%j.e
#SBATCH --time=00:05:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account class04
module load daint-gpu
module load Julia/1.9.3-CrayGNU-21.09-cuda
export MPICH_RDMA_ENABLED_CUDA=1
export IGG_CUDAAWARE_MPI=1
srun -n4 bash -c 'julia -O3 <my_julia_mpi_gpu_script.jl>'