`conda` environment for running machine learning jobs on `perlmutter`. But you may find general info on:

- `conda` settings (e.g. environment directory and package directory to avoid the home directory out-of-quota problem)
`perlmutter` (without typing the `$`, same below)
`conda` is not on `PATH`. Get it on `PATH`:
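On `perlmutter`, loading the Python module is the standard way to make `conda` available (a sketch, assuming NERSC's usual module setup):

```shell
$ module load python
$ which conda    # conda should now be on PATH
```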
`/global/homes/m/<username>/.conda/envs`). Each of us has a quota of 40G for `$HOME`, and sometimes conda environments can get quite big, which can cause an out-of-quota problem. So, let's change the default environment directory to avoid this.
`/global/common/software/jcesr`, depending on the `account` you have access to). Create a directory under your username (to store all your software), e.g.
`envs_dirs` what we've created:
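Concretely, the steps might look like the following sketch (the `conda/envs` subdirectory layout is my assumption; adjust `matgen` and `<username>` as noted below):

```shell
# create a directory for your conda environments under your username
$ mkdir -p /global/common/software/matgen/<username>/conda/envs

# tell conda to put new environments there
$ conda config --add envs_dirs /global/common/software/matgen/<username>/conda/envs
```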
`~/.condarc` to see all the changes you've made. You can even directly edit it to remove the changes or add new ones.
`/global/homes/m/<username>/.conda/pkgs`). You can change the default package storage directory as well:
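A sketch mirroring the `envs_dirs` change above (the `conda/pkgs` subpath is an assumption):

```shell
# store downloaded packages outside $HOME as well
$ conda config --add pkgs_dirs /global/common/software/matgen/<username>/conda/pkgs
```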
`matgen` to the account you have access to, and, of course, change `<username>` to your username.
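After these changes, `~/.condarc` should contain something like the following (the paths assume the directory layout sketched above):

```yaml
envs_dirs:
  - /global/common/software/matgen/<username>/conda/envs
pkgs_dirs:
  - /global/common/software/matgen/<username>/conda/pkgs
```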
`matml`, short for materials machine learning.
`$ conda create --name matml` to create it, and then `$ conda install python <other_package_name>` to install packages. But if you intend to use `TensorFlow`, there are pre-installed versions on `perlmutter` that are built from source. These could be optimized for the hardware, so it would be better to use them.
`NERSC` folks have already put them in a conda environment, and we just need to clone it.
`pytorch` as an example, but it is similar for
`pytorch/1.10.0`; jot it down, you will use it later) and load it
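Assuming NERSC's standard module system, finding and loading the module might look like:

```shell
$ module avail pytorch        # list available versions, e.g. pytorch/1.10.0
$ module load pytorch/1.10.0
```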
`which python` is to find out the path to the pytorch conda environment, and you will see something like `/global/common/software/nersc/shasta2105/pytorch/1.10.0/bin/python`. This means the conda environment is at
`PATH`. Make sure to load `python` first, then the torch environment (e.g. `pytorch/1.10.0`), and finally activate your own conda environment. You can check it by `$ which python`, and make sure it is from the
`module load python` in `submit.sh` won't work. Use `$ conda deactivate` to deactivate if you are in an environment. Alternatively, you can log out and log in again before running the below command):
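A sketch of what the clone command could look like, using the environment path revealed by `which python` above and the `matml` name used in this guide:

```shell
# clone NERSC's pre-built pytorch environment into your own environment
$ conda create --name matml --clone /global/common/software/nersc/shasta2105/pytorch/1.10.0
```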