Digital Education Resources - Vanderbilt Libraries Digital Lab
Previous lesson: introduction to ggplot
In this lesson we will learn how to control details of plots, such as colors used, orientation of features, and descriptive elements associated with the plot.
Learning objectives At the end of this lesson, the learner will be able to:
Total video time: n/a
ggplot2 book - colour scales and legends
ggplot2 book - position scales and axes
Specifying color by name
ggplot(data = rel_data, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(color = "red")
Specifying color by hex code
ggplot(data = rel_data, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(color = "#349c35")
Use the color attribute of the aesthetic to vary color based on a factor
ggplot(data = three_level, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(aes(color = as.factor(`School Level`)))
Setting and using a color palette for color vision deficiency
cbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7", "#999999")
ggplot(data = three_level, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(aes(color = as.factor(`School Level`))) + # not required to convert to factor if character
scale_colour_manual(values=cbPalette)
Note: Use scale_fill_manual
instead of scale_colour_manual
when using color as a fill instead of points.
Use the color attribute of the aesthetic to vary color based on a continuous variable
ggplot(data = three_level, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(aes(color = percent_white)) +
scale_colour_gradientn(colours=rainbow(4)) # the number controls the number of colors included
Note: Omitting scale_colour_gradientn
will use the default color ramp.
Code to generate a table showing all of the available point markers and their numeric codes (from here)
d=data.frame(p=c(0:25,32:127))
ggplot() +
scale_y_continuous(name="") +
scale_x_continuous(name="") +
scale_shape_identity() +
geom_point(data=d, mapping=aes(x=p%%16, y=p%/%16, shape=p), size=5, fill="red") +
geom_text(data=d, mapping=aes(x=p%%16, y=p%/%16+0.25, label=p), size=3)
Use shape
, size
, color
, and fill
to control the appearance of the markers
ggplot(data = rel_data, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(shape = 24, size = 5, color = "blue", fill = "yellow")
The size of a point marker can be controlled in an aesthetic by a continuous variable
ggplot(data = rel_data, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(aes(size = percent_white), shape = 2, color = "blue")
The shape of a point marker can only be controlled in an aesthetic by a categorical variable (factor)
ggplot(data = three_level, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(aes(shape = `School Level`), size = 3, color = "blue")
The outline (color) and interior (fill) of a bar can be manually specified by either name or hex code
ggplot(data = schools_data) +
geom_histogram(mapping = aes(x = Female), binwidth = 100, color = "red", fill = "gray")
ggplot(data = schools_data) +
geom_histogram(mapping = aes(x = Female), binwidth = 100, color = "#349c35", fill = "#9C349B")
The color of the category bars can be set in the aesthetic by setting the fill
argument to the same column as is used for the x
variable.
ggplot(data = three_level) +
geom_bar(mapping = aes(x = `School Level`, fill = `School Level`), color = "black")
The default color scheme can be tweaked by modifying the luminance (amount of black) and chromo (saturation; intensity of color) using scale_fill_hue
ggplot(data = three_level) +
geom_bar(mapping = aes(x = `School Level`, fill = `School Level`), color = "black") +
scale_fill_hue(c = 80, l = 20)
**Note: Luminance (l
) values vary from 0 to 100. Chroma (c
) values vary starting at 0 and going up to an indefinite value (depends on the hue; could be around 400).
If one categorical variable is used for x, another variable can be used for sub-bars using fill
argument in the aesthetic.
ggplot(data = three_level) +
geom_bar(mapping = aes(x = as.factor(`Zip Code`), fill = `School Level`))
Notice how this differs from the last example in which x
and fill
used the same column from the data frame.
The position
argument can be used to change the way the sub-bars are stacked. A value of fill
makes the total bars the same size and represents fractions, and a value of dodge
places the sub-bars side-by-side
ggplot(data = three_level) +
geom_bar(mapping = aes(x = as.factor(`Zip Code`), fill = `School Level`), position = "fill")
ggplot(data = three_level) +
geom_bar(mapping = aes(x = as.factor(`Zip Code`), fill = `School Level`), position = "dodge")
In a geom_col
, the aesthetic’s has a y
value that is a continuous variable. That makes the y-axis display that value rather than counts as in a geom_bar
.
ggplot(data = co2) +
geom_col(mapping = aes(x = State, y= metric_tons, fill = sector))
As with geom_bar
, the position
argument controls placement of the sub-bars
ggplot(data = co2) +
geom_col(mapping = aes(x = State, y= metric_tons, fill = sector), position = "dodge")
The order of levels in a factor determines the order in which elements of the plot are displayed. The order can be changed manually using the factor
function.
co2$sector <- factor(co2$sector, c("Transportation", "Electric.Power", "Industrial", "Residential", "Commercial"))
The first argument is the factor vector to be modified and the second argument is a vector of labels indicating the order they should appear in the reordered factor vector.
The order of levels can also be changed by sorting based on some characteristic of a variable as it applies to rows that have that level. This is done with the reorder
function. In the following example, the mean CO2 value for all the rows having a particular state
level is used to order those state levels.
co2$State <- reorder(co2$State, -co2$Total, mean)
In this example, co2$Total
is preceeded by a minus sign to make the ordering descending. The final argument (mean
) determines how the set of values applying to a level are processed to produce a single number that can be used for sorting.
For figures in papers that can’t be in color, a grayscale color ramp can be used by adding a scale_fill_grey
component to the plot.
ggplot(data = co2) +
geom_col(mapping = aes(x = State, y= metric_tons, fill = sector), position = "dodge", color = "black") +
scale_fill_grey()
The position of the legend can be controlled using theme(legend.position)
. Valid values are left
, right
, top
, bottom
, and none
.
ggplot(data = co2) +
geom_col(mapping = aes(x = State, y= metric_tons, fill = sector), position = "dodge", color = "black") +
theme(legend.position = "left")
The legend can also be positioned within the plot using a vector composed of the X and Y position within the plot (ranging between 0 and 1)
ggplot(data = co2) +
geom_col(mapping = aes(x = State, y= metric_tons, fill = sector), position = "dodge", color = "black") +
theme(legend.position = c(.8, .8))
The labs
object is used to set values of the X and Y axes, and title.
ggplot(data = co2) +
geom_col(mapping = aes(x = State, y= metric_tons, fill = sector), position = "dodge", color = "black") +
labs(
x = "states with the largest total",
y = "CO<sub>2</sub> released (in 10^6 metric tons)",
title = "Fig. 3. Annual fossil fuel combustion in 2018 (EPA 430-R-18-003)"
)
The title of the legend is also changed using an argument of the labs
function. The key of that argument is the aspect used in the aesthetic to visualize the variable described in the legend. For example, if a variable is used to vary the color of markers, color
is used as the key for the argument:
ggplot(data = three_level, mapping = aes(x = limited_proficiency, y = economically_disadvantaged)) +
geom_point(aes(color = as.factor(`School Level`))) + # not required to convert to factor if character
labs(
x = "percent limited English proficiency",
y = "percent economically disadvantaged",
color = "School level"
)
If the axis labels are crowded together, they can be rotated using guides
ggplot(data = three_level) +
geom_bar(mapping = aes(x = `Zip Code`, fill = `School Level`)) +
guides(x = guide_axis(angle = 90)) +
labs(
y = "number of schools",
title = "distribution of schools in Davidson County"
)
Like everything else in R, ggplot plots are objects that can be assigned to named objects (i.e. variables). Assigning a name to a plot object avoids an unwieldy amount of code in the export function.
grayscale_figure <- ggplot(data = co2) +
geom_col(mapping = aes(x = State, y= metric_tons, fill = sector), position = "dodge", color = "black") +
scale_fill_grey()
The ggtext
library is used to export plots as image files for use as figures in papers or diagrams in presentations. JPEG and PNG are supported natively. SVG images, which scale infinitely and are preferred by publishers, require installation of the svglite
package.
ggsave("figure.png", plot=grayscale_figure, width = 20, height = 30, units = "cm")
ggsave("figure.svg", plot=grayscale_figure, width = 20, height = 30, units = "cm")
Note: ggsave apparently determines the export type based on the file extension used in the output file name.
There are a number of built-in datasets included with the R installation that can be referenced without loading them from an external file. We will use some of them in the practice assignment.
Next lession: Statistical functions
Revised 2021-09-17
Questions? Contact us
License: CC BY 4.0.
Credit: "Vanderbilt Libraries Digital Lab - www.library.vanderbilt.edu"