library(readxl)
library(openxlsx)
library(here)
Reading Excel Data in R
Introduction
Reading Excel files into R is a common task for data analysis. In this tutorial, we will explore how to read Excel data into R using various libraries. We will use selected Penn World Table data between 1994 and 2019 as an example.
Prerequisites
Before we begin, make sure you have the necessary packages installed and loaded. The main packages used for reading Excel files in R are readxl
, openxlsx
, and XLConnect
. Each has its strengths and specific use cases.
Loading the Packages
Load the following packages:
If you have not have done so, please install them first.
The Data
We have an Excel file with selected Penn World Table data between 1994 and 2019:
The data are in the file “databases/penn_excel_sample.xlsx”, in sheet “Sheet1”.
Reading Excel Data with Different Libraries
Using readxl
The readxl
package is simple and easy to use for reading Excel files. It supports both .xls
and .xlsx
formats.
# Read the data from the specified sheet using readxl
<- read_excel(here( "databases/penn_excel_sample.xlsx"),
pwt_data_readxl sheet = "Sheet 1")
# Display the data
head(pwt_data_readxl)
# A tibble: 6 × 10
year rgdpna_BRA rgdpna_MEX rgdpna_USA emp_BRA emp_MEX emp_USA pop_BRA pop_MEX
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1994 1742187. 1387994 11150584 62.5 31.7 126. 159. 90.2
2 1995 1815770. 1300672 11449898 62.9 32.0 127. 162. 91.7
3 1996 1854818. 1388770 11881846 61.8 33.1 129. 165. 93.1
4 1997 1917423 1483857 12410257 62.8 33.9 132. 167. 94.6
5 1998 1918101. 1560482. 12966412 62.6 35.2 135. 170. 96.1
6 1999 1922974. 1603451 13582736 65.9 36.6 137. 172. 97.5
# ℹ 1 more variable: pop_USA <dbl>
Using openxlsx
The openxlsx
package provides more advanced options for reading and writing Excel files. It is particularly useful if you need to style your Excel output.
# Read the data from the specified sheet using openxlsx
<- read.xlsx(here( "databases/penn_excel_sample.xlsx"),
pwt_data_openxlsx sheet = "Sheet 1")
# Display the data
head(pwt_data_openxlsx)
year rgdpna_BRA rgdpna_MEX rgdpna_USA emp_BRA emp_MEX emp_USA pop_BRA
1 1994 1742187 1387994 11150584 62.45842 31.70289 125.6900 159.4327
2 1995 1815770 1300672 11449898 62.93206 32.00495 127.4627 162.0199
3 1996 1854818 1388770 11881846 61.78389 33.09135 129.3577 164.6147
4 1997 1917423 1483857 12410257 62.79587 33.92621 132.3626 167.2090
5 1998 1918101 1560482 12966412 62.61137 35.24316 134.5009 169.7852
6 1999 1922974 1603451 13582736 65.88960 36.63657 136.7565 172.3187
pop_MEX pop_USA
1 90.15640 262.2412
2 91.66329 265.1637
3 93.14704 268.3350
4 94.61100 271.7136
5 96.05632 275.1753
6 97.48483 278.5482
Comparison of Libraries
readxl
: This package is best for straightforward reading of Excel files. It is lightweight and fast but does not support writing to Excel files.openxlsx
: This package is more versatile, supporting both reading and writing Excel files. It also allows for advanced styling options.
Conclusion
In this tutorial, we have covered how to read Excel data into R using the readxl
, openxlsx
, and XLConnect
packages. Each package has its strengths and can be chosen based on your specific needs for reading and manipulating Excel files in R.