How to do it...

Take a look at the following steps to get started:

  1. Start by downloading the Anaconda distribution from https://www.anaconda.com/download. Choose the Python 3 version; the choice is not critical, because Anaconda will let you use Python 2 if you need it. You can accept all the installation defaults, but make sure that the conda binaries are on your PATH (open a new terminal window after installing so that the updated PATH takes effect). If you have another Python distribution, be careful with your PYTHONPATH and existing Python libraries; it is probably better to unset PYTHONPATH. As far as possible, uninstall all other Python versions and installed Python libraries.
  2. Let's go ahead with the libraries. We will now create a new conda environment called bioinformatics with biopython=1.70, as shown in the following command:
conda create -n bioinformatics biopython=1.70
  3. Let's activate the environment, as follows:
source activate bioinformatics
  4. Let's add the bioconda and conda-forge channels to our source list:
conda config --add channels bioconda
conda config --add channels conda-forge

Also, install the core packages:

conda install scipy matplotlib jupyter-notebook pip pandas cython numba scikit-learn seaborn pysam pyvcf simuPOP dendropy rpy2

Some of them will probably be installed with the core distribution anyway.

  5. We can even install R from conda:
conda install r-essentials r-gridextra

r-essentials installs a lot of R packages, including ggplot2, which we will use later. We also install r-gridextra, since we will be using it in the Notebook.
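
Once the steps above have completed, a quick sanity check of the environment can save time later. The following is a minimal sketch, not part of the recipe itself: the function names missing_modules and environment_report are hypothetical, and the list of import names is an assumption, since several differ from the conda package names (biopython is imported as Bio, scikit-learn as sklearn, pyvcf as vcf). It only checks that each module can be located, without importing it, and reports whether conda is on the PATH, whether PYTHONPATH is set, and which conda environment is active.

```python
import importlib.util
import os
import shutil

# Assumed import names for the packages installed above; several differ
# from the conda package names (e.g. biopython -> Bio, pyvcf -> vcf).
EXPECTED_MODULES = [
    "Bio", "scipy", "matplotlib", "pandas", "Cython", "numba",
    "sklearn", "seaborn", "pysam", "vcf", "simuPOP", "dendropy", "rpy2",
]


def missing_modules(names):
    """Return the names that cannot be located for import (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


def environment_report(names=EXPECTED_MODULES):
    """Collect a few facts about the current setup (hypothetical helper)."""
    return {
        # True if a conda executable can be found on the PATH
        "conda_on_path": shutil.which("conda") is not None,
        # The recipe suggests leaving PYTHONPATH unset
        "pythonpath_set": bool(os.environ.get("PYTHONPATH")),
        # Set by conda when an environment is activated; None otherwise
        "active_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "missing": missing_modules(names),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Run the script with the Python interpreter of the activated bioinformatics environment. An empty missing list, together with active_env showing bioinformatics and pythonpath_set showing False, suggests that the setup matches the steps above.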