conda environment for running machine learning jobs on perlmutter

You may find general info on:
- conda settings (e.g. environment directory and package directory, to avoid the home directory out-of-quota problem)
- conda environment
- perlmutter

perlmutter
(without typing the $, same below)

conda is not on PATH. Get it on PATH by loading the python module: $ module load python

The environment used here is named matml, short for materials machine learning. Normally you would run $ conda create --name matml to create it, and then $ conda install python <other_package_name> to install packages. But if you intend to use PyTorch or TensorFlow, there are pre-installed versions on perlmutter that are built from source. These could be optimized for the hardware, so it would be better to use them. NERSC folks have already put them in a conda environment, and we just need to clone it.

(The steps here use pytorch as an example, but it is similar for tensorflow.)

Find the available pytorch module version (e.g. pytorch/1.10.0; jot it down, you will use it later) and load it.

$ which python finds the path to the pytorch conda environment, and you will see something like /global/common/software/nersc/shasta2105/pytorch/1.10.0/bin/python. This means the conda environment is at /global/common/software/nersc/shasta2105/pytorch/1.10.0 (NOTE: the trailing /bin/python is excluded).

Clone it to create the matml environment on perlmutter.
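
As a rough sketch of the last two steps: the environment prefix can be derived from the output of which python by stripping the trailing /bin/python, and then handed to conda create with the --clone option. The path below is the example path from the text, and the clone command is only printed rather than executed, so the snippet can be tried anywhere. The variable names are made up for illustration.

```shell
# Example output of `which python` after loading the pytorch module
# (the example path from the text; yours may differ).
python_path="/global/common/software/nersc/shasta2105/pytorch/1.10.0/bin/python"

# Strip the trailing /bin/python to get the conda environment prefix.
env_prefix="$(dirname "$(dirname "$python_path")")"
echo "$env_prefix"

# Build the clone command. It is only printed here; run it yourself on perlmutter.
clone_cmd="conda create --name matml --clone $env_prefix"
echo "$clone_cmd"
```

On perlmutter itself, python_path would presumably be set from $(which python) right after loading the pytorch module, instead of being hard-coded.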
The loading order matters for PATH. Make sure to load python first, then the torch environment (e.g. pytorch/1.10.0), and finally activate your own conda environment. You can check it by $ which python, and make sure it is from the matml environment (e.g. /global/common/software/matgen/<username>/conda/envs/matml/bin/python).

The job consists of two files, my_first_torch_job.py and submit.sh.
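
One way to apply the load order inside submit.sh is a small guard that aborts the job when python does not come from the expected environment. The helper below is hypothetical (it is not part of the original scripts) and only automates the which python check; the module and conda commands are shown as comments because they only make sense on perlmutter.

```shell
# Hypothetical helper (not from the original scripts): succeeds only if the
# python found on PATH lives under the given environment prefix.
python_is_from_env() {
  expected_prefix="$1"
  found_python="$(command -v python)" || return 1
  case "$found_python" in
    "$expected_prefix"/bin/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use in submit.sh, after loading things in the order described above:
#   module load python
#   module load pytorch/1.10.0
#   conda activate matml
#   python_is_from_env "/global/common/software/matgen/<username>/conda/envs/matml" || exit 1
```

Failing early like this is cheaper than discovering halfway through a job that the wrong python, and therefore the wrong torch, was picked up.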
Submit the job (NOTE: if you are already in a conda environment, module load python in submit.sh won't work. Use $ conda deactivate to deactivate if you are in an environment. Alternatively, you can log out and log in again before running the command below):