Basic charts in Base R

R has several systems for creating graphs, each with its own strengths. Base R provides a straightforward way to create many kinds of plots without additional packages, and mastering its plotting functions gives you a foundation for effective data visualization. In this section, we will discuss how to make some basic charts in base R. Please load the following libraries:

library(haven)
library(dplyr)
library(here)

We will use some World Economic Outlook (WEO) data for the US that we have stored in an rds file, a file format that stores a single R object (here a data frame) on disk. We read it with the readRDS() function:

weo_usa<-readRDS(here("databases/weo_usa.rds"))
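
As a minimal sketch of how the rds format works, the following writes a small data frame to a temporary file and reads it back. The data frame toy and its values are made up for illustration; they are not part of the WEO file.

```r
# Hypothetical toy data frame; the values are made up for illustration
toy <- data.frame(year = 2020:2022, lur = c(8.1, 5.4, 3.6))

# Write it to a temporary .rds file and read it back
path <- tempfile(fileext = ".rds")
saveRDS(toy, path)
toy_back <- readRDS(path)

# Compare the object read back with the original
all.equal(toy, toy_back)
```

Unlike a csv file, an rds file keeps column types and attributes, so no parsing options are needed when reading it back.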

We look at the last rows of the data frame with the tail() function. The file contains real GDP growth (ngdp_rpch), the unemployment rate (lur), and CPI inflation (pcpi_pch), all in percent:

tail(weo_usa)
# A tibble: 6 × 4
   year ngdp_rpch   lur pcpi_pch
  <dbl>     <dbl> <dbl>    <dbl>
1  2023      1.58  3.83     4.52
2  2024      1.06  4.92     2.32
3  2025      1.76  4.80     2.10
4  2026      2.05  4.30     2.00
5  2027      2.12  4.10     2.01
6  2028      2.12  4.06     2.09

Line Charts

We will start with a line chart of the US unemployment rate from 1980 onward.

# Keep only the years from 1980 onward
weo_usa <- subset(weo_usa, year >= 1980)

# Create a line chart (type = "l" draws lines instead of points)
plot(weo_usa$year, weo_usa$lur, type = "l", main = "US: Unemployment",
     xlab = "", ylab = "Percent", col = "red")

Note that the y-axis tick labels are drawn parallel to the axis, so they read sideways. We can make them horizontal by adding las=1:

# Create a line chart with horizontal axis labels
plot(weo_usa$year,weo_usa$lur, type = "l", main = "US: Unemployment", 
     xlab = "", ylab = "Percent", col = "red", las=1)
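
To see what the las argument does, the sketch below draws the same made-up line four times, once for each value las accepts (0 to 3). The data are arbitrary and serve only to show the label orientation.

```r
# Draw a 2 x 2 grid of plots, one for each label style
par(mfrow = c(2, 2))
for (k in 0:3) {
  # las = 0: parallel to the axis (default), 1: always horizontal,
  # las = 2: perpendicular to the axis,      3: always vertical
  plot(1:10, (1:10) * 100, type = "l", las = k,
       main = paste("las =", k), xlab = "", ylab = "")
}
par(mfrow = c(1, 1))  # restore the single-plot layout
```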

To add a second line to an existing plot, we use the lines() function:

# Create a line chart, then add inflation as a second line
plot(weo_usa$year,weo_usa$lur, type = "l", main = "US: Unemployment", xlab = "", 
     ylab = "Percent", col = "red", las=1)
lines(weo_usa$year,weo_usa$pcpi_pch,col="blue")

The inflation line is cut off because plot() sets the y-axis range from the first series (unemployment) only. One solution is to set the limits by hand with the ylim argument:

# Create a line chart with hand-picked y-axis limits
plot(weo_usa$year,weo_usa$lur, type = "l", main = "US: Unemployment", xlab = "", 
     ylab = "Percent", col = "red", las=1, ylim=c(0,13))
lines(weo_usa$year,weo_usa$pcpi_pch,col="blue")

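A better solution is to have R find the appropriate limits with the range() function, which returns the minimum and maximum over all the vectors it is given (add na.rm = TRUE if a series contains missing values):

# Create a line chart with y-axis limits computed from both series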
plot(weo_usa$year,weo_usa$lur, type = "l", main = "US: Unemployment", xlab = "", 
     ylab = "Percent", col = "red", las=1, ylim=range(weo_usa$lur,weo_usa$pcpi_pch))
lines(weo_usa$year,weo_usa$pcpi_pch,col="blue")
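
The following sketch shows how range() behaves when it is given several vectors, and why missing values matter. The vectors u and p are made up for illustration.

```r
# Made-up series for illustration; one value is missing
u <- c(3.8, 8.1, 5.4, NA)
p <- c(1.8, 1.2, 4.7, 8.0)

range(u, p)                # NA NA, because of the missing value
range(u, p, na.rm = TRUE)  # 1.2 8.1

# The result can be passed straight to ylim
plot(1:4, u, type = "l", las = 1, ylim = range(u, p, na.rm = TRUE))
lines(1:4, p, col = "blue")
```

Without na.rm = TRUE, the limits would be NA and plot() would stop with an error about non-finite ylim values.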

Let’s add a legend. Setting bty = "n" draws the legend without a box around it.

# Create a line chart with two series and a legend
plot(weo_usa$year,weo_usa$lur, type = "l", main = "US: Unemployment", xlab = "", 
     ylab = "Percent", col = "red", las=1, ylim=range(weo_usa$lur,weo_usa$pcpi_pch))
lines(weo_usa$year,weo_usa$pcpi_pch,col="blue")
legend(x = "topleft", bty = "n",
       legend=c("Unemployment", "Inflation"),  
       fill = c("red","blue")) 
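
The fill argument draws filled boxes as legend keys. For a line chart, an alternative is to use col together with lty, so that the keys are short line segments matching the plotted lines. This is a sketch with made-up data; the series names are placeholders.

```r
# Made-up data for illustration
year <- 2000:2005
a <- c(4.0, 4.7, 5.8, 6.0, 5.5, 5.1)
b <- c(3.4, 2.8, 1.6, 2.3, 2.7, 3.4)

plot(year, a, type = "l", col = "red", las = 1,
     xlab = "", ylab = "Percent", ylim = range(a, b))
lines(year, b, col = "blue")

# col and lty draw short line segments as legend keys
legend("topleft", bty = "n",
       legend = c("Series A", "Series B"),
       col = c("red", "blue"), lty = 1)
```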

Scatter Plots

Scatter plots are made the same way as line charts, but the x-axis holds another variable instead of a date variable, and we leave out type = "l" so that plot() draws points, its default. For example, let’s make a scatter plot of unemployment versus inflation:

# Create a scatter plot
plot(weo_usa$lur,weo_usa$pcpi_pch,  main = "US: Unemployment rate and inflation", 
     xlab = "Unemployment rate", ylab = "Inflation", pch = 16, col = "navy",las=1)

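The link is not very strong. We can add a regression line by fitting a linear model with lm() and passing the result to abline():

# Create a scatter plot
plot(weo_usa$lur, weo_usa$pcpi_pch, main = "US: Unemployment rate and inflation",
     xlab = "Unemployment rate", ylab = "Inflation", pch = 16, col = "navy", las = 1)

# Regress inflation on the unemployment rate and draw the fitted line
eq <- lm(pcpi_pch ~ lur, data = weo_usa)
abline(eq)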
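
To judge the strength of a relationship with numbers rather than by eye, the fitted model can be inspected directly. The sketch below uses a made-up data frame d, not the WEO data.

```r
# Made-up data for illustration only
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(2.1, 3.9, 6.2, 7.8, 10.1))

fit <- lm(y ~ x, data = d)   # formula interface with a data argument
coef(fit)                    # intercept and slope
summary(fit)$r.squared       # share of the variance in y explained by x

plot(d$x, d$y, pch = 16, las = 1)
abline(fit, col = "red", lty = 2)  # abline() reads the coefficients from fit
```

An R-squared close to zero would confirm that a fitted line explains little of the variation.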

Bar Charts

Bar charts are a bit tricky. You need to specify two arguments: height and names.arg.

From the documentation: height is either a vector or matrix of values describing the bars which make up the plot. If height is a vector, the plot consists of a sequence of rectangular bars with heights given by the values in the vector. If height is a matrix and beside is FALSE then each bar of the plot corresponds to a column of height, with the values in the column giving the heights of stacked sub-bars making up the bar. If height is a matrix and beside is TRUE, then the values in each column are juxtaposed rather than stacked.
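
The three cases described in the documentation can be compared side by side. This is a sketch using a small made-up matrix m.

```r
# Made-up 2 x 3 matrix: rows are components, columns are bars
m <- matrix(c(1, 2,
              3, 4,
              5, 6),
            nrow = 2,
            dimnames = list(c("a", "b"), c("X", "Y", "Z")))

par(mfrow = c(1, 3))                                # three panels side by side
barplot(m["a", ], main = "Vector")                  # one bar per element
barplot(m, main = "Matrix, stacked")                # one stacked bar per column
barplot(m, beside = TRUE, main = "Matrix, beside")  # grouped bars per column
par(mfrow = c(1, 1))                                # restore the default layout
```

Because matrix() fills column by column, each pair of values in the call above becomes one column, and therefore one bar or group of bars.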

Bar chart of one variable

Let’s first make a bar chart of one variable. In this case height should be a vector:

# Using the US unemployment rate from weo_usa

# Create a bar chart
barplot(height = as.vector(weo_usa$lur), names.arg = weo_usa$year, 
        main = "US: Unemployment rate", ylab = "Percent", col = "lightblue", las=1)

box()

box() adds a box around the plot area.

Bar chart of two variables

We now make a bar chart of two variables. We will make a chart of the “misery rate”, the sum of the unemployment rate and inflation.

We select the two variables and convert them into a matrix, using the as.matrix() function. Because barplot() draws one stacked bar per column of height, each year must be a column, so we transpose the matrix with t() in the barplot call.

# Using the US unemployment rate and inflation from weo_usa

# Create a matrix from the selected columns of the data frame
df_matrix <- as.matrix(weo_usa[, c("lur", "pcpi_pch")])

# Create a bar chart
barplot(height = t(df_matrix), names.arg = weo_usa$year, 
        main = "US: Misery rate", ylab = "Percent", col = c("red", "lightblue"), las = 1)

# Add the legend at the top right of the chart
legend("topright", legend = c("Unemployment", "Inflation"), fill = c("red", "lightblue"),bty = "n")
box()
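
The sketch below illustrates why the transpose is needed and how the total height of each stacked bar relates to the row sums. The data frame toy is made up; only its column names follow weo_usa.

```r
# Made-up data frame with the same column names as weo_usa
toy <- data.frame(year = 2001:2003,
                  lur = c(4.7, 5.8, 6.0),
                  pcpi_pch = c(2.8, 1.6, 2.3))

m <- as.matrix(toy[, c("lur", "pcpi_pch")])
dim(m)      # 3 rows (years) by 2 columns (variables)
dim(t(m))   # 2 rows (variables) by 3 columns (years)

# The total height of each stacked bar is the row sum
misery <- rowSums(m)

# barplot() returns the x positions of the bar midpoints
mids <- barplot(t(m), names.arg = toy$year, las = 1,
                col = c("red", "lightblue"), ylim = c(0, 10))
text(mids, misery, labels = round(misery, 1), pos = 3)  # total above each bar
```

Saving the return value of barplot() is useful whenever text or points have to be aligned with the bars, since the bars are not centered on 1, 2, 3.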