Digital Scholarship Resources - Vanderbilt Libraries Digital Scholarship and Communications Office

Previous lesson:

If you weren’t sure how you were going to be running your code: Installing a programming environment

If you wanted to start coding R as quickly as possible: Quickstart guide for running R in RStudio Cloud

R programming basics: Vectors, vectorized computing, and packages

This lesson is focused on the fundamental data structure in R, vectors. The lesson will also cover how to make additional functions available to your environment by loading packages.

Learning objectives At the end of this lesson, the learner will:

Total video time: 44 m 41 s

Lesson R script at GitHub

Lesson slides

If you are a Vanderbilt user, you should be able to use your VUNet ID and password for free access to O’Reilly for Higher Education resources. To access them, click this link, then log in. Sometimes it is necessary to close your browser, or clear your cookies to get access, so if you have problems, you can try that. It is also possible to navigate there by going to https://www.library.vanderbilt.edu/, select DATABASES A-Z, click on O, then select O'Reilly for Higher Education.

In this lesson, I’ll reference some sections of the book, R Cookbook, 2nd Edition, which you can find by searching at the O’Reilly site, or try this direct link to the book. The direct links in the text might work, otherwise navigate to the correct section by number.

Vectors

Note: Comments can be added to R scripts to make them more understandable. A comment starts with the # character and R simply ignores everything on the line after it. Here’s an example:

# simple script to demonstrate assignment
x <- c(1,2)  # the "arrow" points to the left to show the direction of the assignment

R objects (9m40s)

Notice that in R, the assignment operator is <-, designed to look like a leftward pointing arrow since the data on the right is passed into the variable on the left. (One can also use the symbol = as the assignment operator, but using <- is more typical.)

To run R statements one at a time, you can use the console pane. After you type each statement and press Enter/Return, the result (if any) shows up in the next line. Changes to the environment also show up in the Global Environment (upper right pane).


Function review (4m07s)

To review functions, see this video


Loading an R script

Watch if you haven’t already viewed it in the RStudio Cloud quickstart lesson.

The editor pane in the upper left of RStudio opens when you either select New File from the File menu, then select R Script or if you open an existing file.

As you work in the editor pane, suggestions will pop up as you type, as shown in the screenshot above.


Vectors (12m43s)

R Cookbook section 2.5

A vector is a one-dimensional data structure consisting of items of the same kind. This would be analogous to a column of data in a spreadsheet. Vectors have a name that is used to refer to that particular instance of a vector. The individual items in the vector can be referenced using their position in the vector, as shown in the diagram above. Note: R is “one based”, meaning that we start counting items at 1. This is in contrast to Python, which is “zero based” (counting starts at 0).

We can construct a vector by explicitly entering its values using the c() (for “combine”) function, like this:

animal <- c("frog", "spider", "worm", "bee")

The screenshot above shows what happens when we create a vector in the console pane, then display the third item in the vector.

You can run code that’s in the editor pane one line at a time, or all at once. To run a single line of code, highlight it (or simply place the cursor somewhere on the line). Then click the Run button at the top of the pane. The statement line will appear in the console, then execute.

To run several lines at once, highlight the ones you want to run, then click the Run button. To run the entire script, highlight all of it, then click Run.

To determine the number of items in a vector, use the length() function:

length(vector)

To determine the type of items that the vector contains, use the mode() function:

mode(vector)

Sequences

A sequence is a vector that contains a sequence of numbers that step by one. To generate a sequence, separate the range of integers by a colon:

numbers <- 3:9

Ranges can decrease and can include negative numbers. For example:

countdown <- 5:-3

Referencing parts of vectors

To reference a single item in a vector, use its index:

animal[3]

To reference a range of items in a vector (a subvector), use a range of indices:

subset <- animal[2:4]

Reminder to Python users: R is one-based and ranges include the final item, so the range 2:4 would include the second through the fourth item.


Vectorized computing (7m38s)


Using packages (10m33s)

To review package managers, see this video

The RStudio package manager is a tab in the lower right pane.

In the search box, start typing the name of the package you want to load. As you type, packages with matching names will be screened. If you see the package you want, click the checkbox to the left of its name. When you check the box, RStudio will run the library function for you in the console pane, and load the package.

If the package does NOT show up in the list, then it isn’t yet installed on your computer. Click the Install button. If prompted to create a personal library, click Yes.

An Install Packages window will pop up. You can leave the Install from: option at its default “Repository (CRAN..”. In the Packages box, type the name of the package in the popup window and press Install.

A bunch of lines will scroll up the console window. When it says “The downloaded binary packages are in…” you’re done. The package should now appear in the list of packages in the Packages pane in the lower right, where you can check its box.


Practice assignment

In the code that you write, for each object use a meaningful name in “snake case”.

  1. Create a vector object containing the names of the days of the week.
  2. Create a sequence of the numbers for the days of the year (1=the first day, 365=the last day). Assign that sequence of numbers to a vector object.
  3. Write the expression for Wednesday using your “day of the week” vector.
  4. Create a vector containing numbers from 0 to 100. Each number represents the radius of a circle. Calculate the areas of those circles and assign the answer to another vector. Note: There are very few “predefined” variables in R, but pi is one of them. You can use pi in your code without defining its value. However, you can also overwrite its value by assigning some other object the name pi. Additional note: / is the operator for division, * is the operator for multiplication, and ^ is the operator for exponentiation (i.e. r^2 is r squared).
  5. The number of dogs in five households is: 2, 0, 1, 0, 1. The number of cats in the same five households is: 1, 0, 0, 2, 3. Assign the number of dogs to one vector and the number of cats to another vector. Calculate the number of pets (dogs plus cats) in the five households and assign the answer to another vector.

Next lesson: Lists and dataframes


Revised 2022-01-06

Questions? Contact us

License: CC BY 4.0.
Credit: "Vanderbilt Libraries Digital Scholarship and Communications - www.library.vanderbilt.edu"