Lab Session 7: Local Installation
Statistics 159/259, Spring 2022
Prof. F. Pérez and GSI F. Sapienza, Department of Statistics, UC Berkeley.
04/04/2022
The goal for today is to set up your local machine with the same JupyterLab configuration we are using in the Stat 159/259 Hub. Even if you already have conda installed on your local machine, it's not a bad idea to reset the conda installation. The idea is to have a minimal base environment from which you can launch JupyterLab and connect it to the kernels associated with the other environments you create.
Part 1: Installation and configuration of miniconda¶
Instead of installing conda together with the full distribution of Python libraries, we are going to install Miniconda, a minimal version that includes the conda package management tool and a small number of useful packages. If you already have conda installed on your machine, you may not need to do this. An easy way to check whether conda is installed on your computer is to enter conda --help in the terminal. You are also welcome to install conda from scratch by removing all existing conda directories and then installing Miniconda.
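The check described above can also be scripted. The following is a minimal sketch that only reports whether a conda command is visible to the current shell; the variable name conda_status is chosen here for illustration.

```shell
# Report whether a conda command is visible to the current shell.
if command -v conda >/dev/null 2>&1; then
    conda_status="found at $(command -v conda)"
else
    conda_status="not found"
fi
echo "conda: $conda_status"
```

If conda is reported as not found, continue with the installation steps in this part.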
You can install the latest version of Miniconda for your operating system from the Miniconda website. On Unix machines, a simpler way is to download the installer with the wget command and then run it in silent installation mode. For example, on a macOS machine with an Intel processor (without an M1 chip) you can use
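# download the installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O ~/miniconda.sh
# create only the parent directory: the installer stops if the target directory already exists
mkdir -p "$HOME/local"
# -b: silent (batch) mode, -p: installation directory
bash ~/miniconda.sh -b -p "$HOME/local/conda"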
and for a Linux machine
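# download the installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
# create only the parent directory: the installer stops if the target directory already exists
mkdir -p "$HOME/local"
# -b: silent (batch) mode, -p: installation directory
bash ~/miniconda.sh -b -p "$HOME/local/conda"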
If you don't have wget installed, you can install it using brew with brew install wget, or download the .sh file directly from the Miniconda installation website and replace ~/miniconda.sh with the name of the downloaded file. Don't rename the file, and make sure the path to it is correct, either by giving the full path or by moving the .sh file to your home directory.
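As another option when wget is missing, curl is often already present on macOS and Linux. The sketch below is not part of the original instructions: fetch_installer is a hypothetical helper name, and it simply uses wget when available and falls back to curl otherwise.

```shell
# Download URL ($1) to file ($2) with wget if available, otherwise with curl.
fetch_installer() {
    if command -v wget >/dev/null 2>&1; then
        wget "$1" -O "$2"
    elif command -v curl >/dev/null 2>&1; then
        # -f: fail on HTTP errors, -L: follow redirects, -o: output file
        curl -fL "$1" -o "$2"
    else
        echo "neither wget nor curl is installed" >&2
        return 1
    fi
}

# Usage, with the Intel macOS installer link from above:
# fetch_installer https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh ~/miniconda.sh
```

The remaining installation commands stay the same, since the installer is still saved as ~/miniconda.sh.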
Note for Windows users
If you are working from a Windows machine, you can install Miniconda directly using the link provided on the website. Most cloud-hosted computing is based on a Unix foundation, which both macOS and Linux provide out of the box, so some of our instructions fit that workflow a little more easily. For Windows, Microsoft now officially supports something very similar: the Windows Subsystem for Linux (WSL). It provides top-notch Unix functionality next to your regular Windows tools and workflow, and is probably the ideal setup for modern scientific and research computing on Windows. Discussing it in detail is beyond the scope of this course, but we highly encourage you to explore this tool on your own (and let us know how it goes!).
To tell the shell where conda is installed, add the following lines to your .bashrc file:
# add path to conda
export PATH="$HOME/local/conda/bin:$PATH"
Then apply the change by running source ~/.bashrc. After doing this, conda will be recognized by the shell. You can make this change permanent by running conda init, so you don't need to worry about PATH again. You can see where conda has been installed by typing which conda in your terminal.
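If you rerun these steps, it is easy to end up with the same export line several times in .bashrc. The following sketch shows one way to append the line only when it is missing. The helper name add_conda_path is hypothetical, and the demonstration writes to a scratch file rather than to your real .bashrc.

```shell
# Append the conda PATH line to a startup file only if it is not already there.
add_conda_path() {
    rc_file="$1"
    path_line='export PATH="$HOME/local/conda/bin:$PATH"'
    touch "$rc_file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$path_line" "$rc_file"; then
        printf '\n# add path to conda\n%s\n' "$path_line" >> "$rc_file"
    fi
}

# Try it on a scratch file first; pass "$HOME/.bashrc" to apply it for real.
scratch_rc="$(mktemp)"
add_conda_path "$scratch_rc"
add_conda_path "$scratch_rc"   # the second call changes nothing
cat "$scratch_rc"
```

Calling the function twice leaves a single copy of the line, so it is safe to run again after a reinstall.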
You can change the configuration of conda either by editing .condarc manually or directly through the shell with the following commands. Here, we specify the channel that conda/mamba uses to download libraries.
conda config --add channels conda-forge
conda config --set channel_priority strict
Once you have installed conda, you can use it to install mamba:
conda install --yes mamba
If this doesn't work, you can also try conda install mamba -n base -c conda-forge. If this still doesn't work, you can always replace mamba with conda in the following section, but it will be much slower.
Part 2: Setting up the Hub environment locally¶
Once you have installed Miniconda, you can create virtual environments from a yml file just as we did in the Hub. As a first step, we are going to create a virtual environment with the same configuration we have in the Hub. The configuration file environment.yml for the Hub is available in the site repository for the course. Before using environment.yml to create a new environment, take a look at the file and try to identify the parts that are not needed to create the Hub environment. You will have to remove the last lines of the file, which contain the pip instructions for the requirements of the JupyterBook we use for the website; these are not necessary for the creation of the virtual environment.
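The trimming step described above might look like the following sketch. The file contents are invented for illustration (the real environment.yml lists different packages, and example-package is a hypothetical name), and the sed command assumes that the pip section is the last entry in the file.

```shell
# Work in a scratch directory so that no real file is touched.
workdir="$(mktemp -d)"

# A made-up environment file with a trailing pip section.
cat > "$workdir/environment.yml" <<'EOF'
name: example-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - numpy
  - pip
  - pip:
    - example-package
EOF

# Delete everything from the "- pip:" line to the end of the file.
sed '/^ *- pip:/,$d' "$workdir/environment.yml" > "$workdir/environment-local.yml"
cat "$workdir/environment-local.yml"
```

Writing the result to a second file (environment-local.yml here) keeps the original intact, so you can compare the two before creating the environment.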
Then, you can create the virtual environment with
mamba env create -f environment.yml
You can always use conda env create -f <yml file>, but mamba is much faster than conda. It will take a few minutes until all the dependencies are installed on your machine. A specific library may cause a conflict on certain machines. If conda prints an error message associated with a specific library included in the environment file, try commenting out that line in environment.yml and run the command again.
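Commenting out a dependency can be done in any text editor, or with sed as in the sketch below. The file and the package name problem-package are made up for the example, so substitute the library named in your own error message.

```shell
# Build a small made-up environment file in a scratch directory.
scratch_dir="$(mktemp -d)"
printf '%s\n' 'dependencies:' '  - numpy' '  - problem-package' \
    > "$scratch_dir/environment.yml"

# Put "# " in front of the dependency, keeping its indentation.
# -i.bak edits the file in place and saves the original as environment.yml.bak.
sed -i.bak 's/^\( *\)- problem-package/\1# - problem-package/' \
    "$scratch_dir/environment.yml"
cat "$scratch_dir/environment.yml"
```

The .bak copy makes it easy to restore the line once the conflict is resolved.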
Now you can activate the environment from the terminal using conda activate <environment name>. From there, you can launch the classic notebook by typing jupyter notebook, but you also have JupyterLab installed, which you can launch with jupyter lab.
In order to reproduce the same environment we have in the Hub, we are going to install JupyterLab in our base environment and then create a Python kernel with the environment of the Hub. In your base environment (conda activate base), make sure you have both JupyterLab and ipykernel installed. If not, you can install them using mamba:
mamba install -c conda-forge jupyterlab
mamba install ipykernel
From here, the story is pretty much the same as what we did in the Hub to create a kernel for a new virtual environment (see Using an environment in your notebooks). Moving back to the new environment we just created (conda activate <environment name>), we run
python -m ipykernel install --user --name <environment name> --display-name "IPython - STAT 159"
Now, if we go back to base and launch JupyterLab with jupyter lab, we should see the option of creating a new notebook with the IPython - STAT 159 kernel.
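One way to confirm that the kernel was registered is to list the kernel specs that Jupyter knows about. The sketch below is guarded so that it only calls jupyter when the command is on PATH; the variable name kernel_report is chosen here for illustration.

```shell
# List the kernels registered with Jupyter, if jupyter is available.
if command -v jupyter >/dev/null 2>&1; then
    kernel_report="$(jupyter kernelspec list 2>&1 || true)"
else
    kernel_report="jupyter not found on PATH; activate the environment where JupyterLab is installed"
fi
echo "$kernel_report"
```

The name passed to --name in the install command should appear in the list. A kernel that is no longer needed can be removed with jupyter kernelspec uninstall followed by that name.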
Part 3: Reproducibility check¶
For Homework No. 4 you created a virtual environment with the installation needed to reproduce one of the papers. Clone the homework repository on your local machine and use the yml file in the GitHub repository to create a virtual environment from which you can run the same code you ran in the Hub.
Note
If you don’t have git installed on your machine, you can get it via conda/mamba too!
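The steps of this part can be collected in a small helper, sketched below under a few assumptions: setup_homework_env is a hypothetical function name, the repository URL is a placeholder for your own homework repository, and the yml file is assumed to be called environment.yml unless a second argument says otherwise.

```shell
# Clone a repository and create the environment described by its yml file.
# $1: repository URL (placeholder), $2: yml file name (default: environment.yml)
setup_homework_env() {
    repo_url="$1"
    env_file="${2:-environment.yml}"
    # Directory name that git clone will use, e.g. "hw04" for ".../hw04.git".
    repo_dir="$(basename "$repo_url" .git)"

    git clone "$repo_url" "$repo_dir" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$repo_dir" && mamba env create -f "$env_file")
}

# Usage (hypothetical URL):
# setup_homework_env https://github.com/<user>/<homework-repo>.git environment.yml
```

If git is missing, it can be installed first with mamba install -c conda-forge git. After the environment has been created, activate it with conda activate <environment name> and rerun your analysis.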
Additional Comments¶
If you’ve made it to this point, you can continue the configuration of your local machine so it also includes some of the other tools we were using in the Hub.
Dotfiles: You can clone the bare repository you used to configure bash, git and other tools in the Hub. Remember to clone the repository you forked into your personal GitHub account.
Midnight Commander: On a Mac, you can install Midnight Commander on your local computer using brew, a package manager for macOS (brew install midnight-commander).