library(xts)
library(here)
euro <- readRDS(file = here("databases/euro.rds"))
Differences and Lags in Time Series
Time series analysis often involves calculating differences and lags to understand the dynamics of the data. In R, we can use the diff.xts and lag.xts functions from the xts package for these purposes. However, it’s important to understand how these functions depend on the frequency of the data.
Differences and Lags with diff.xts and lag.xts
Calculating Differences
The diff.xts function calculates the difference between each element of a time series and an earlier element, by default the immediately preceding one. For example, the 2-period difference is calculated as follows:
euro$d2xus <- diff.xts(euro$XUS, 2)
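To see what the second argument does, here is a minimal sketch on a small made-up series. The object `toy` and its values are hypothetical and are not taken from the euro data.

```r
library(xts)

# Hypothetical five-day series, used only for illustration
toy <- xts(c(10, 12, 15, 19, 24),
           order.by = as.Date("2024-01-01") + 0:4)

# lag = 2: each value minus the value two observations earlier;
# the first two rows have no earlier value and are padded with NA
d2 <- diff.xts(toy, lag = 2)
d2
```

The third row is 15 - 10, the fourth 19 - 12, and the fifth 24 - 15.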
Calculating Lags
The lag.xts function shifts the time series by a specified number of periods. For example, the 2-period lag is calculated as follows:
euro$l2xus <- lag.xts(euro$XUS, 2)
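The lag and the difference are closely related: a 2-period difference is the series minus its 2-period lag. The sketch below checks this on a made-up series. `toy` and its values are hypothetical.

```r
library(xts)

# Hypothetical five-day series, used only for illustration
toy <- xts(c(10, 12, 15, 19, 24),
           order.by = as.Date("2024-01-01") + 0:4)

# k = 2: each row receives the value from two observations earlier;
# the first two rows become NA
l2 <- lag.xts(toy, k = 2)

# The 2-period difference equals the series minus its 2-period lag
d2_manual <- toy - l2
all.equal(as.numeric(d2_manual), as.numeric(diff.xts(toy, lag = 2)))
```

Because both functions count rows, anything that affects how lag.xts picks its source row affects diff.xts in the same way.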
Problems with Daily Data
With daily data, a common issue arises from the way lag.xts computes lags. The function shifts the series by a number of observations, not calendar days. When days are missing from the data (e.g., weekends and holidays), a k-period lag can therefore reach back more than k calendar days.
Let’s illustrate this with an example:
# View data around the end of the year
euro["2022-12-27/", c("XUS", "l2xus")]
              XUS  l2xus
2022-12-27 1.0638 1.0614
2022-12-28 1.0608 1.0635
2022-12-29 1.0661 1.0638
2022-12-30 1.0702 1.0608
2023-01-02 1.0662 1.0661
2023-01-03 1.0546 1.0702
2023-01-04 1.0599 1.0662
2023-01-05 1.0520 1.0546
2023-01-06 1.0644 1.0599
In this case, the lagged value for January 2 (1.0661) is taken from December 29, two observations earlier, and not from December 31, two calendar days earlier, which is not in the series.
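One way to spot this situation before computing any lags is to look at the calendar distance between consecutive index values. The following sketch uses a made-up business-day series. `toy`, its dates, and its values are hypothetical, and a Date index is assumed.

```r
library(xts)

# Hypothetical series: Wed to Fri, then Mon to Wed (the weekend is absent)
dates <- as.Date(c("2024-01-03", "2024-01-04", "2024-01-05",
                   "2024-01-08", "2024-01-09", "2024-01-10"))
toy <- xts(c(1.09, 1.10, 1.11, 1.12, 1.13, 1.14), order.by = dates)

# Calendar days between consecutive observations;
# any value above 1 marks missing calendar days
gaps <- as.numeric(diff(index(toy)))
gaps

# Date that supplies each 2-period lag, and how far back it really is
src <- c(rep(as.Date(NA), 2), head(index(toy), -2))
days_back <- as.numeric(index(toy) - src)
data.frame(date = index(toy), lag_source = src, days_back = days_back)
```

For the Monday and Tuesday rows, `days_back` is 4 instead of 2, which is the same effect seen in the euro data around the new year.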
Fixing the Problem
To fix this problem, we need to ensure that the time series accounts for all calendar days, even if some days are missing from the original data. Here’s how we can do that:
Step 1: Create a Sequence of All Dates
First, we create a sequence of all dates from the beginning to the end of the dataset. We also re-read the euro dataset:
euro <- readRDS(file = here("databases/euro.rds"))
new_dates <- seq(from = start(euro),
                 to = end(euro),
                 by = "day")
Step 2: Merge the Sequence with the Original Data
Next, we merge this sequence with our euro dataset to include all calendar days:
euro <- merge(euro, new_dates)
euro["2022-12-27/", c("XUS")]
              XUS
2022-12-27 1.0638
2022-12-28 1.0608
2022-12-29 1.0661
2022-12-30 1.0702
2022-12-31     NA
2023-01-01     NA
2023-01-02 1.0662
2023-01-03 1.0546
2023-01-04 1.0599
2023-01-05 1.0520
2023-01-06 1.0644
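The padding step can be reproduced on a small made-up series. This sketch merges with a zero-width xts object (an index with no data columns) instead of the bare date vector used above. That is a substitution made here for illustration, not the tutorial's own call. `toy`, `all_days`, and `padded` are hypothetical names.

```r
library(xts)

# Hypothetical series: Wed to Fri, then Mon to Wed (the weekend is absent)
dates <- as.Date(c("2024-01-03", "2024-01-04", "2024-01-05",
                   "2024-01-08", "2024-01-09", "2024-01-10"))
toy <- xts(c(1.09, 1.10, 1.11, 1.12, 1.13, 1.14), order.by = dates)
colnames(toy) <- "XUS"

# Every calendar day between the first and last observation
all_days <- seq(from = start(toy), to = end(toy), by = "day")

# A zero-width xts carries only an index; the default outer join
# adds a row of NA for each date that toy does not contain
padded <- merge(toy, xts(, all_days))
padded
```

The result has one row per calendar day, with NA on the Saturday and Sunday, and no extra column.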
Step 3: Recalculate the Lag
Now, we recalculate the lag with the complete sequence of dates:
euro$l2xus <- lag.xts(euro$XUS, 2)
Step 4: Verify the Fix
Finally, we verify that the lag is now calculated correctly:
euro["2022-12-27/", c("XUS", "l2xus")]
              XUS  l2xus
2022-12-27 1.0638     NA
2022-12-28 1.0608 1.0635
2022-12-29 1.0661 1.0638
2022-12-30 1.0702 1.0608
2022-12-31     NA 1.0661
2023-01-01     NA 1.0702
2023-01-02 1.0662     NA
2023-01-03 1.0546     NA
2023-01-04 1.0599 1.0662
2023-01-05 1.0520 1.0546
2023-01-06 1.0644 1.0599
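By including all calendar days in the series, the lag is now calculated based on calendar days rather than observations. Where the date two calendar days earlier has no observation, as for January 2 and January 3, the lag is NA.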
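If the padded rows are not wanted in later steps, one option is to compute the lag on the padded series and then keep only the original dates. The sketch below does this on made-up data. `toy`, `padded`, and `trading` are hypothetical names, and the zero-width merge is an assumption of this example.

```r
library(xts)

# Hypothetical series: Wed to Fri, then Mon to Wed (the weekend is absent)
dates <- as.Date(c("2024-01-03", "2024-01-04", "2024-01-05",
                   "2024-01-08", "2024-01-09", "2024-01-10"))
toy <- xts(c(1.09, 1.10, 1.11, 1.12, 1.13, 1.14), order.by = dates)
colnames(toy) <- "XUS"

# Pad to all calendar days, then lag by two rows (now two calendar days)
padded <- merge(toy, xts(, seq(start(toy), end(toy), by = "day")))
padded$l2xus <- lag.xts(padded$XUS, k = 2)

# Keep only the dates that were present in the original series
trading <- padded[index(toy)]
trading
```

In `trading`, Monday and Tuesday have an NA lag because the days two calendar days earlier (Saturday and Sunday) have no observation, while Wednesday picks up Monday's value.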
This tutorial has shown how to handle the calculation of differences and lags in time series data, especially when dealing with daily data. By creating a complete sequence of dates and merging it with the original data, we ensure that functions like lag.xts calculate lags correctly based on calendar days.