Digital Education Resources - Vanderbilt Libraries Digital Lab

Previous lesson: Programming environments

If you just want to get started as quickly as possible, skip this page and go to the Quickstart guide for running Python in a Colab notebook or Quickstart guide for running R in RStudio Cloud.

Installing a programming environment

The purpose of this lesson is to help you decide which programming environment is best for your situation. The first two sections discuss general issues related to programming environment options.

If you are learning R, your main option is RStudio.

If you are learning Python, there are many options. The sections from “Choices for running Jupyter notebooks” onward show what each Python option requires, so that you can decide which platform to use. After you decide, the videos in those sections can also guide you when you need to load a lesson notebook.

If you are planning extensive work in data science, you should watch the second video, which explains the Anaconda distribution.

Once you have decided what system you want to use, you can follow the provided links to other pages with installation details.

Learning objectives At the end of this lesson, the learner will:

Total video time: 27m 06s

Example Jupyter notebook at GitHub

Example Colab notebook

Lesson slides


Programming environment choices (3m31s)

This section discusses local computer and cloud-based options for programming.

For instructions on installing R and RStudio, see this page. However, if you are planning to install the Anaconda distribution, you do not need to go through that installation separately. See the following section for more information about Anaconda.

There are many options for programming in Python, but for these lessons we recommend using Jupyter notebooks. See the third section for Jupyter notebook options.

What is Anaconda? (6m30s)

Anaconda is an umbrella system that bundles many of the most important tools used in data science. It is also free. It includes both the Python and R programming languages, most of the common Python libraries used in science and engineering (including NumPy, SciPy, Matplotlib, and pandas), and many commonly used R packages. Anaconda also includes the popular Jupyter notebook system, RStudio, and the Spyder Python development environment, and it has its own custom package management system. The Anaconda Navigator provides access to the system through a desktop GUI (shown in the screenshot above).
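If you are not sure whether the Python you are running already has the libraries named above, a short check like the following may help. It is only a sketch: the list holds the import names for the four libraries mentioned in this lesson, and the function name `find_missing` is made up for this example.

```python
import importlib.util

# Import names of the Python libraries the lesson lists as part of Anaconda.
LIBRARIES = ["numpy", "scipy", "matplotlib", "pandas"]


def find_missing(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = find_missing(LIBRARIES)
if missing:
    print("Not installed here:", ", ".join(missing))
else:
    print("All of the listed libraries are available.")
```

Run it with the same Python that you plan to use for the lessons, since each Python installation keeps its own set of packages.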

Technically, there are two different pieces in play. Anaconda itself is a software distribution: it includes a number of pre-configured programs and packages. Conda is a package manager that is installed automatically when Anaconda is installed. Conda is used to install, remove, and update the packages that are part of the Anaconda distribution. Conda can also be used as a package management system independently of Anaconda. This blog post contains more details.

Is Anaconda for me?

Using Anaconda is appealing because it allows you to have access to many data science tools with a single install. However, there are several things to consider before installing Anaconda.

  1. It’s big. Because Anaconda automatically installs a lot of packages, it can take up a lot of drive space (up to 3 GB). You might also have problems if your computer is using an outdated operating system. So Anaconda could be a problem on old computers with limited drive space.

  2. Potential conflicts. There may be conflicts between the Conda package management system and other systems you are using. In particular, if you are using a Mac with Homebrew, you should do some further research before installing Anaconda. Here’s a place to start.

  3. Virtual environments. The Conda package manager has its own built-in environment manager. (That’s why a Mac Terminal prompt starts including (base) after Anaconda is installed.) There may be some complications if you need to use Conda-managed packages within your virtualenv or venv virtual environment. See this page for more details.
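To see which kind of environment your Python is currently running in, you could try a check along these lines. Treat it as a rough heuristic rather than an official test: it assumes that a Conda environment folder contains a `conda-meta` directory and that Conda sets the `CONDA_DEFAULT_ENV` variable when an environment is active, and the function names are invented for this sketch. The `sys.base_prefix` comparison works for venv and recent versions of virtualenv.

```python
import os
import sys


def in_conda_env(prefix=sys.prefix):
    """Guess whether `prefix` is a Conda environment.

    Assumption: Conda environments contain a `conda-meta` directory.
    """
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def in_venv():
    """True when running inside a venv (or recent virtualenv) environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter:", sys.executable)
print("Looks like a Conda environment:", in_conda_env())
print("Conda environment name:", os.environ.get("CONDA_DEFAULT_ENV", "(not set)"))
print("venv/virtualenv active:", in_venv())
```

If both checks report True, you are in the mixed situation described in item 3, and the page linked there is worth reading before you install more packages.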

If you are a new user with a reasonably new computer, you are probably not going to know or care about any of these things and can probably safely just install Anaconda and get on with your life. If you are an advanced user and completely understand all of these things, then you will understand the implications and be able to deal with any problems that may arise. Things might be complicated if you are an intermediate user who is using the tools mentioned above (e.g. Homebrew or virtual environments) but isn’t expert enough to figure out how to fix things if they go wrong.


Alternatives to Anaconda for installing R and RStudio

If you are learning R and decide not to install Anaconda, see this page for instructions on doing a stand-alone installation.

Another quick alternative is to use the cloud-based RStudio Cloud. You can use RStudio there without any installation, although it requires creating an account.

If you want to use the Anaconda option, go on to the next section.

Installing Anaconda

If you decide that Anaconda is for you, go to the Anaconda Installation page and follow the links for your operating system.
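Before downloading, you may want to confirm that you have room for it. The sketch below uses the rough 3 GB figure mentioned earlier in this lesson as its threshold; that number, the default path `"."`, and the function names are assumptions for illustration, so pass the drive or folder where you actually plan to install.

```python
import shutil

# Rough figure taken from this lesson ("up to 3 GB"); not an official requirement.
REQUIRED_GB = 3


def free_gb(path="."):
    """Free space, in gigabytes, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_room(path=".", required_gb=REQUIRED_GB):
    """True when the drive holding `path` has at least `required_gb` free."""
    return free_gb(path) >= required_gb


print(f"Free space: {free_gb():.1f} GB")
print("Enough room for Anaconda:", has_room())
```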

To start using Anaconda, go to the Anaconda Navigator. It will show up in your Windows Start menu or Launchpad on Mac. Not every application included in Anaconda will be pre-installed, so the first time you want to use one of the applications in the Anaconda Navigator, you may need to click the Install button. After the first time, the button will change to Launch. The Getting started page has more details.

If you are learning R and are going to use the RStudio IDE, you do not need to watch any of the rest of the videos on this page! You can go straight to the next R lesson: R programming basics

If you are interested in running R using Jupyter notebooks, you may want to continue watching. However, all future R lessons will use RStudio in the examples and not Jupyter notebooks.

Choices for running Jupyter notebooks (1m32s)

You should watch the following videos that explain how the various systems work, then try the cloud-based systems to see if they will be adequate for you. If a cloud-based system won’t work for you, then you probably should install Anaconda locally. If you are running Jupyter notebooks locally, you can use the Anaconda Navigator to start them up, as shown in the following video, or use the console as described in the following paragraph.

If all else fails and you can use neither one of the cloud-based systems nor Anaconda, you can do a stand-alone installation of Jupyter notebooks. To do that, you first need to do a separate installation of Python, then use pip to install Jupyter notebooks. To run Jupyter notebooks “manually” (without Anaconda Navigator), go to your console, then enter

jupyter notebook

That will start the notebook server and open the navigation page.
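If the command is not recognized, Jupyter may not be installed for the Python you are using. The following sketch checks for the `jupyter` command and, if it is missing, prints a pip command you could run. It assumes the package name `notebook` (the name used on the Python Package Index for the classic notebook), and the function names are made up for this example.

```python
import shutil
import sys


def install_command(package="notebook"):
    """Build the pip command for the interpreter that is running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def jupyter_on_path():
    """True when a `jupyter` executable can be found on the PATH."""
    return shutil.which("jupyter") is not None


if jupyter_on_path():
    print("jupyter found; start it with: jupyter notebook")
else:
    print("jupyter not found; to install it, you could run:")
    print(" ".join(install_command()))
```

Building the command from `sys.executable` is a deliberate choice: it makes pip install into the same Python that ran the check, which avoids a common mix-up on computers with more than one Python installed.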

Running Jupyter notebooks locally

Note: JupyterLab is another system for running Jupyter notebooks that comes with Anaconda. It is a fancier interface for navigating your filesystem and launching notebooks. However, the way that notebooks run will be the same and the coding demonstrated here will be done using the “regular” Jupyter notebook framework.

Running your own Jupyter notebook locally (3m56s)

Downloading and running a Jupyter notebook from GitHub (1m54s)

Example Jupyter notebook to download from GitHub
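A common problem when downloading from GitHub is saving the web page that displays the notebook instead of the notebook file itself. A notebook file is JSON, so a quick check like this one can tell the two apart. It is a sketch that assumes the current notebook format (version 4), where the file has top-level `cells` and `nbformat` entries; the file name `example.ipynb` is hypothetical.

```python
import json


def looks_like_notebook(path):
    """Return True if the file at `path` parses as JSON with notebook keys."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing file, unreadable file, or not JSON (for example, an HTML page).
        return False
    return isinstance(data, dict) and "cells" in data and "nbformat" in data


# Hypothetical file name; replace it with the file you downloaded.
print(looks_like_notebook("example.ipynb"))
```

If the check prints False for a file you just downloaded, try downloading it again using the raw file link rather than saving the page from your browser.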

Quickstart on using Jupyter notebooks

Setting up VS Code to run Jupyter notebooks - optional to watch (5m08s)

Link to VS Code documentation on using Jupyter notebooks

This is an option that will not interest most people, but I was asked about it, so I made a video to show how to do the setup. The interface is different from the browser interface, but VS Code will perform the same notebook functions as the browser, with the addition of debugging tools that are built into VS Code. I have not explored that functionality, but you may find it useful.

Running Jupyter notebooks in the cloud

Running your own Jupyter notebook in Colab (4m22s)

Google Colaboratory landing page. If presented with an options screen, you can click New Notebook to create a blank notebook or cancel to open an actual Colab notebook that will run. See the next section for information about how to copy this notebook to your own Google Drive.

Once you have stored your own notebooks, you can select a Recent notebook, or open notebooks from your Google Drive.
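Once a notebook is open, a first cell like the one below can confirm where it is running, which is handy if you switch between Colab and a local installation. The Colab check is a widely used heuristic, not an official API: it assumes that the Colab runtime has already loaded the `google.colab` package. The function name is invented for this sketch.

```python
import sys


def running_in_colab():
    """Heuristic: assume the Colab runtime has loaded the `google.colab` package."""
    return "google.colab" in sys.modules


print("Python version:", sys.version.split()[0])
print("Running in Colab:", running_in_colab())
```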

Running someone else’s Jupyter notebook in Colab (3m35s)

Example notebook in Colab. The link will open the notebook in read-only mode. To run the notebook without saving, click Open in playground, and agree to the warning. Click the play button at the left of each cell in order to run the script.

To save any changes you’ve made to the script, you need to copy it to your own Google Drive by clicking the Copy to Drive link at the upper left or by selecting Save a copy in Drive from the File menu.

To open a notebook cloned to your own Google Drive, go to your Drive, click on the Colab Notebooks folder, then a notebook name. Then click on the Open with Google Colaboratory button at the top of the screen to open the notebook at the site.

Make your decision

Deciding on a programming environment (1m46s)

If you have decided that the cloud options will work for you (Colab), there is no additional installation required. Note: Microsoft has ended free access to Azure, so it is no longer covered in these lessons.

If you want to or must use Anaconda, go to the Anaconda Installation page and follow the links for your operating system.

If you want to do a stand-alone local installation of Jupyter notebooks, see the last paragraph of the Choices for running Jupyter notebooks section for more information.

Next Python lesson: Python programming basics

Next R lesson: R programming basics

Revised 2022-01-07

Questions? Contact us

License: CC BY 4.0.
Credit: "Vanderbilt Libraries Digital Lab -"