Lab Session 7: Local Installation

  • Statistics 159/259, Spring 2022

  • Prof. F. Pérez and GSI F. Sapienza, Department of Statistics, UC Berkeley.

  • 04/04/2022

The goal for today is to set up your local machine with the same JupyterLab configuration we are using in the Stat 159/259 Hub. Even if you already have conda installed on your local machine, it’s not a bad idea to reset the conda installation. The idea is to have a very minimal base environment from which you can launch JupyterLab and connect it to the kernels associated with the other environments you create.

Part 1: Installation and configuration of miniconda

Instead of a full conda installation that ships with many Python libraries, we are going to install Miniconda, a minimalist version of conda that includes the package management tool and a small number of useful packages. If you already have conda installed on your machine, you may not need to do this. An easy way to check whether conda is installed on your computer is to enter conda --help in the terminal. You are also welcome to install conda from scratch by removing all conda directories and installing Miniconda.
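The check described above can be sketched as a small shell helper. This is only an illustration: have_cmd is a hypothetical name, not something conda provides. It relies on the POSIX command -v builtin, which succeeds when the shell can resolve the given command.

```shell
# Hypothetical helper: succeed if the given command can be resolved by the shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd conda; then
    echo "conda found: $(command -v conda)"
else
    echo "conda not found; continue with the Miniconda installation"
fi
```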

You can install the latest version of Miniconda for your operating system from the Miniconda website. On Unix machines, a simpler way is to download the installer with the wget command and then run it in silent (batch) mode with the -b flag. For example, on a macOS machine (without an M1 chip) you can use

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O ~/miniconda.sh
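# create only the parent directory; the installer creates the target itself
# and stops if the target already exists (unless run with -u)
mkdir -p "$HOME/local"
bash ~/miniconda.sh -b -p "$HOME/local/conda"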

and for a Linux machine

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
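# create only the parent directory; the installer creates the target itself
# and stops if the target already exists (unless run with -u)
mkdir -p "$HOME/local"
bash ~/miniconda.sh -b -p "$HOME/local/conda"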
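If you are unsure which of the two installers applies to your machine, the choice can be derived from uname. The following is a minimal sketch, not an official tool: miniconda_url is a hypothetical helper, and it only covers the two installers mentioned in this lab (any other system, such as an M1 Mac, is reported as not listed).

```shell
# Hypothetical helper: print the installer URL for an OS name and CPU
# architecture, as reported by `uname -s` and `uname -m`.
# Only the two installers mentioned in this lab are covered.
miniconda_url() {
    base="https://repo.anaconda.com/miniconda"
    case "$1-$2" in
        Darwin-x86_64) echo "$base/Miniconda3-latest-MacOSX-x86_64.sh" ;;
        Linux-x86_64)  echo "$base/Miniconda3-latest-Linux-x86_64.sh" ;;
        *) echo "no installer listed here for $1 $2" >&2; return 1 ;;
    esac
}

# Show the URL for the current machine, if it is one of the two cases.
miniconda_url "$(uname -s)" "$(uname -m)" || true
```

The printed URL can then be passed to wget in place of the hard-coded link.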

If you don’t have wget installed, you can install it using brew with brew install wget. Alternatively, download the .sh file directly from the Miniconda installation website and replace ~/miniconda.sh with the path to the downloaded file (don’t rename the file, and make sure the path is correct, either by giving the full path to the file or by moving the .sh file to your home directory).

Note for Windows users

If you are working on a Windows machine, you can install Miniconda directly using the link provided on the website. Most cloud-hosted computing is based on a Unix foundation, which both macOS and Linux provide out of the box, so some of our instructions fit that workflow a little more easily. For Windows, Microsoft now officially supports something very similar: the Windows Subsystem for Linux (WSL). It provides top-notch Unix functionality next to your regular Windows tools and workflow, and is probably the ideal setup for modern scientific and research computing on Windows. Discussing it in detail is beyond the scope of this course, but we highly encourage you to explore this tool on your own (and let us know how it goes!).

In order to specify the path where conda is installed, you need to add the following line to your .bashrc file

# add path to conda
export PATH="$HOME/local/conda/bin:$PATH"

and load it by running source ~/.bashrc. After doing this, conda will be recognized in your terminal. You can make this change permanent by running conda init, so you don’t need to worry about PATH again. You can see where conda has been installed by typing which conda in your terminal.
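If you repeat the setup, the same export line can end up in .bashrc more than once. One possible way to avoid that is to append it only when it is missing. This is a sketch under the assumption that conda lives in $HOME/local/conda as above; add_conda_path is a hypothetical helper name.

```shell
# Hypothetical helper: append the conda PATH line to an rc file only once.
add_conda_path() {
    rcfile="$1"
    # Single quotes keep $HOME and $PATH unexpanded, so the rc file
    # receives the literal line shown in this lab.
    line='export PATH="$HOME/local/conda/bin:$PATH"'
    touch "$rcfile"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$line" "$rcfile"; then
        printf '# add path to conda\n%s\n' "$line" >> "$rcfile"
    fi
}

# Usage: add_conda_path "$HOME/.bashrc"
```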

You can change the configuration of conda by editing the .condarc file manually or directly through the shell with the following commands. Here we specify the channel that conda/mamba uses to download libraries.

conda config --add channels conda-forge
conda config --set channel_priority strict

Once you have installed conda, you can use it to install mamba:

conda install --yes mamba

If this doesn’t work, you can also try conda install mamba -n base -c conda-forge. If this still doesn’t work, you can always replace mamba with conda in the following section, but it will be much slower and less efficient.

Part 2: Setting the Hub environment in local

Once you have installed Miniconda, you can create virtual environments from a yml file just as we did in the Hub. As a first step, we are going to create a virtual environment with the same configuration we have in the Hub. The configuration file environment.yml for the Hub is available at the site repository for the course. Before using environment.yml to create a new environment, take a look at the text file and try to identify the parts that are not needed to create the Hub environment. In particular, you will have to remove the last lines of the file, which contain the pip instructions for the requirements of the JupyterBook we use for the website and are not necessary for creating the virtual environment.
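Editing the file by hand is the simplest route, but the removal can also be scripted. The sketch below assumes the pip instructions appear as a "- pip:" entry followed by more deeply indented lines, which is the usual layout of a conda environment file; check the actual environment.yml before relying on it. strip_pip_section is a hypothetical helper name.

```shell
# Hypothetical helper: print a yml file without its "- pip:" entry and
# without the more deeply indented lines nested under that entry.
strip_pip_section() {
    awk '
        /^[ \t]*-[ \t]*pip:[ \t]*$/ {
            skip = 1
            match($0, /^[ \t]*/)   # measure the indentation of "- pip:"
            indent = RLENGTH
            next
        }
        skip {
            match($0, /^[ \t]*/)
            if (RLENGTH > indent) next   # still inside the pip block
            skip = 0                     # back at the outer level
        }
        { print }
    ' "$1"
}

# Usage: strip_pip_section environment.yml > environment-local.yml
```

The output name environment-local.yml is only an example; if you write to a different file, pass that file to the -f option when creating the environment.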

Then, you can create the virtual environment with

mamba env create -f environment.yml

You can always use conda env create -f <yml file> instead, but mamba is much faster than conda. It will take a few minutes until all the dependencies are installed on your machine. A specific library may produce conflicts on certain machines. If conda prints an error message associated with a specific library in the environment file, try commenting out that line in environment.yml and running the command again.

Now you can activate the environment from the terminal using conda activate <environment name>. From there, you can launch the classic notebook by typing jupyter notebook, but JupyterLab is also installed, and you can launch it with jupyter lab.
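If you are not sure what to use as <environment name>, it is the value of the name: field in the yml file (unless you overrode it when creating the environment); conda env list also shows every environment conda knows about. A minimal sketch for reading the name from the file, where env_name_from_yml is a hypothetical helper:

```shell
# Hypothetical helper: print the value of the top-level "name:" field
# of a conda environment file.
env_name_from_yml() {
    awk '$1 == "name:" { print $2; exit }' "$1"
}

# Usage: conda activate "$(env_name_from_yml environment.yml)"
```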

In order to reproduce the same environment we have in the Hub, we are going to install JupyterLab in our base environment and then create a Python kernel with the environment of the Hub. In your base environment (conda activate base), make sure that both JupyterLab and ipykernel are installed. If not, you can install them using mamba:

mamba install -c conda-forge jupyterlab
mamba install ipykernel

From here, the story is pretty much the same as what we did in the Hub to create a kernel for a new virtual environment (see Using an environment in your notebooks). Moving back to the new environment we just created (conda activate <environment name>), we run

python -m ipykernel install --user --name <environment name> --display-name "IPython - STAT 159"
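To confirm that the kernel was registered, jupyter kernelspec list prints the installed kernels together with their locations. The sketch below scans such a listing for a given kernel name; kernel_in_listing is a hypothetical helper, and it assumes the kernel name is the first column of each line of the listing.

```shell
# Hypothetical helper: succeed if a kernel named $1 appears as the first
# column of a `jupyter kernelspec list` style listing read from stdin.
kernel_in_listing() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage (requires Jupyter):
#   jupyter kernelspec list | kernel_in_listing "<environment name>" && echo "kernel registered"
```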

Now, if we go back to base and launch JupyterLab with jupyter lab, we should see the option of creating a new notebook with the IPython - STAT 159 kernel.

Part 3: Reproducibility check

For Homework No. 4 you created a virtual environment with the installation needed to reproduce one of the papers. Clone the homework repository on your local machine and use the yml file in the GitHub repository to create a virtual environment from which you can run the same code you ran in the Hub.

Note

If you don’t have git installed on your machine, you can get it via conda/mamba too!

Additional Comments

If you’ve made it to this point, you can continue the configuration of your local machine so it also includes some of the other tools we were using in the Hub.

  1. Dotfiles: You can clone the bare repository you used to configure bash, git and other tools in the Hub. Remember to clone the repository you forked to your personal GitHub account.

  2. Midnight Commander: You can install Midnight Commander on your local computer using brew, a package manager for Mac computers.