python \\ecnswn12p\ems_shared\pub\datatools\installer.py dev
IMF Datatools
In this tutorial, we will explore how to install IMF Datatools and access data in R.
IMF Datatools is a Python library that allows us to access various internal and external data sources, and retrieve data directly in R.
Pre-requisites
Since IMF Datatools is Python based, we first need to download Python 3.11 or above from the IMF Software Center and then install our Python library.
After you have downloaded Python 3.11 or the latest version from the IMF Software Center, open the Command Prompt. You can open it by typing “cmd” in the Windows search box.
In the Command Prompt, type the following command to install Datatools.
python \\ecnswn12p\ems_shared\pub\datatools\installer.py dev
You should see a confirmation message when the installation is complete.
Loading the Python Interpreter
After Python 3.11 is downloaded and Datatools is installed, you need to tell RStudio where your Python interpreter is located.
In the RStudio menu, go to Tools, then Global Options, choose Python, and click Select to pick the interpreter on your computer. The newly installed interpreter should be listed automatically under the System tab. Click Apply and then OK.
Your R session will likely need to be restarted for the changes to take effect. This process only needs to be done once.
Loading additional packages
To bridge the communication between R and Python, you will also need the reticulate package. Install it once, then load it at the start of each R session.
#install.packages("reticulate")
library(reticulate)
Handling Dates in Reticulate
When using the reticulate package to work with IMF Datatools, note that reticulate interprets dates in UTC by default. This can lead to incorrect date stamps when downloading data if your system is set to a different time zone.
To prevent this issue, set the time zone of your R session to UTC before retrieving data. You can do this in R by running:
Sys.setenv(TZ = "UTC")
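To see why the time zone matters, here is a minimal base R sketch that does not call IMF Datatools. It assumes that dates arrive as midnight UTC timestamps; the New York time zone is only an illustrative choice.

```r
# A timestamp at midnight UTC (assumed to resemble a period start date).
stamp <- as.POSIXct("2025-01-01 00:00:00", tz = "UTC")

# Read in UTC, the calendar date is preserved.
date_utc <- format(stamp, "%Y-%m-%d", tz = "UTC")

# Read in a time zone behind UTC, the same instant falls on the previous day.
date_ny <- format(stamp, "%Y-%m-%d", tz = "America/New_York")

print(c(utc = date_utc, new_york = date_ny))
```

With the session time zone set to UTC, this kind of one-day shift should not occur.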
Lastly, you will need to import the sys module and the IMF Datatools library so that they can be called from R.
sys <- import("sys")
imf_datatools <- import("imf_datatools")
You should now be set up to use IMF Datatools in RStudio.
Further documentation on IMF Datatools can also be found at this link.
Datasets available in IMF Datatools
Understanding the available data
IMF Datatools has direct access to a plethora of internal and external databases like EcOS, Haver, EDI, and more. For exploratory work, you can view all the databases or import a specific database.
Let’s look at all the available databases.
#Get all databases
dbnames <- imf_datatools$ecos_sdmx_utilities$get_databases()
#Quickly view the top part of all the databases
head(dbnames)
dbpath
ECDATA_ADCP_WB_QEDS Databases/External Databases/International Organizations/World Bank (WB)/Quarterly External Debt Statistics Unified (ADCP_WB_QEDS)
ECDATA_ADCP_WB_WDI_ECDATA Databases/External Databases/International Organizations/World Bank (WB)/World Development Indicators (WDI)
ECDATA_ALBANIA Databases/Country Data/Albania
ECDATA_ANALYTIC_DATABASE_01182024 Databases/External Databases/International Organizations/OECD/Archive/Analytic Database_01182024
ECDATA_ANGOLA Databases/Country Data/Angola
ECDATA_ANNUAL_CENTRAL_GOVERNMENT_BOND_YIELDS_ACGBY Databases/Fiscal Affairs Department (FAD)/Eurostat/Interest Rates/Interest Rates - Historical Data/Central Government Bond Yields/Annual Central Government Bond Yields (ACGBY)
This is a very long list. Let’s try to narrow it down to something more relevant.
#Import a specific library under IMF Datatools, e.g., EDI
edi_utilities <- imf_datatools$edi_utilities
#Look at the available databases in EDI
databases <- edi_utilities$get_databases()
head(databases)
shortname
bloomberg Bloomberg
bop-published-latest BOP
cofer-published-latest COFER
consensus-forecast-timeseries CONSENSUS
csd-forecasts-ccx CSDCCX
csd-forecasts-desk CSDDesk
description
bloomberg Bloomberg Data License
bop-published-latest Balance of Payments and International Investment Position
cofer-published-latest Currency Composition of Official Foreign Exchange Reserves
consensus-forecast-timeseries Consensus Forecasts (ECDATA)
csd-forecasts-ccx CSD - Cross Country Exercises (CCX)
csd-forecasts-desk CSD - Desk Data
environment
bloomberg production
bop-published-latest production
cofer-published-latest production
consensus-forecast-timeseries production
csd-forecasts-ccx production
csd-forecasts-desk production
primary_keys
bloomberg database/ticker/field/freq
bop-published-latest database/country/indicator/freq
cofer-published-latest database/country/indicator/freq
consensus-forecast-timeseries database/country/indicator/freq
csd-forecasts-ccx database/exercise/country/indicator/vintage/freq
csd-forecasts-desk database/country/indicator/exercise/vintage/vintagetimestamp/freq
This looks better. We can also use the glimpse function from the dplyr package to get a more compact and structured overview of our data. It shows the structure of the data frame and previews its contents. Let’s try this for reference.
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
glimpse(databases)
Rows: 18
Columns: 4
$ shortname <chr> "Bloomberg", "BOP", "COFER", "CONSENSUS", "CSDCCX", "CSDD…
$ description <chr> "Bloomberg Data License", "Balance of Payments and Intern…
$ environment <chr> "production", "production", "production", "production", "…
$ primary_keys <chr> "database/ticker/field/freq", "database/country/indicator…
Let’s also take a look at the WEO country codes and groups.
#Get WEO country codes and groups
countrydata <- imf_datatools$ecos_sdmx_utilities$get_weo_country_codes()
#Take a glimpse at data
glimpse(countrydata)
Rows: 196
Columns: 38
$ `ISO-3 code` <chr> …
$ Code <chr> …
$ `ISO-2 code` <chr> …
$ `WB region` <chr> …
$ `WB capitalCity` <chr> …
$ `WB incomeLevel` <chr> …
$ `WB latitude` <dbl> …
$ `WB longitude` <dbl> …
$ `Advanced Economies` <dbl> …
$ G7 <dbl> …
$ `Euro area (member states)` <dbl> …
$ `Other Advanced Economies (Advanced Economies excluding G7 and Euro Area countries)` <dbl> …
$ `Other Advanced Economies (Advanced Economies excluding U.S., Euro Area countries, and Japan)` <dbl> …
$ `Emerging and developing economies` <dbl> …
$ `Caucasus and Central Asia (CCA)` <dbl> …
$ `Emerging and Developing Asia` <dbl> …
$ `Emerging and Developing Asia excl. China and India` <dbl> …
$ `ASEAN-5` <dbl> …
$ `Emerging and Developing Europe` <dbl> …
$ `Latin America and the Caribbean` <dbl> …
$ `Middle East and Central Asia` <dbl> …
$ `Middle East and North Africa` <dbl> …
$ `Middle East, North Africa, Afghanistan, and Pakistan` <dbl> …
$ `Sub-Sahara Africa` <dbl> …
$ `Emerging Market and Developing Economies by Source of Export Earnings: Fuel` <dbl> …
$ `Emerging Market and Developing Economies by Source of Export Earnings: Nonfuel` <dbl> …
$ `Emerging Market and Developing Economies by Source of Export Earnings: Manufactures` <dbl> …
$ `Emerging Market and Developing Economies by Source of Export Earnings: Primary Products (excluding Fuel)` <dbl> …
$ `Emerging Market and Developing Economies by Source of Export Earnings: Services (including Income, Transfers)` <dbl> …
$ `Emerging Market and Developing Economies by Source of Export Earnings: Diversified` <dbl> …
$ `Emerging Market and Developing Economies by External Financing: Net Creditors` <dbl> …
$ `Emerging Market and Developing Economies by External Financing: Net Debtors` <dbl> …
$ `Emerging Market and Developing Economies by External Financing: Net Debtors with Arrears and/or Rescheduling During 2019-23` <dbl> …
$ `Emerging Market and Developing Economies by External Financing: Net Debtors without Arrears and/or Rescheduling During 2019-23` <dbl> …
$ `Emerging Market and Middle-Income Economies` <dbl> …
$ `Emerging Market and Developing Economies: Low Income Developing Countries` <dbl> …
$ `Heavily Indebted Poor Countries (HIPC)` <dbl> …
$ `European Union` <dbl> …
Using the datasets available in IMF Datatools
Let’s now download some series directly from these databases.
Using EcOS and EDI data
To download the appropriate series, make sure to specify the country code and indicator, for example, 111 for the USA, and NGDP for gross domestic product. Let’s try to download some WEO data.
First let’s start with WEO published data.
#Download WEO Published data for NGDP for country code 111, USA.
df <- imf_datatools$get_ecos_sdmx_data('WEO_WEO_PUBLISHED', '111', 'NGDP')
#You can quickly glance at the tail end of the data
tail(df)
111NGDP.A
2025-01-01 3.050722e+13
2026-01-01 3.171764e+13
2027-01-01 3.294171e+13
2028-01-01 3.434213e+13
2029-01-01 3.571282e+13
2030-01-01 3.715309e+13
You will notice that the numbers are a little difficult to interpret, as the units are dollars and they are shown in scientific notation. Let’s change the scale to the one used in the WEO (billions).
#Download the same WEO Published series, using the WEO scale (billions)
df <- imf_datatools$get_ecos_sdmx_data('WEO_WEO_PUBLISHED', '111', 'NGDP', scale='ecos')
#Quickly glance at the tail end of the data
tail(df)
111NGDP.A
2025-01-01 30507.22
2026-01-01 31717.64
2027-01-01 32941.71
2028-01-01 34342.13
2029-01-01 35712.82
2030-01-01 37153.09
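Comparing the two outputs above, the rescaled values are the dollar values divided by one billion. The sketch below reproduces that arithmetic in base R for two of the values. That scale='ecos' amounts to this division for every series is an assumption drawn from these two outputs only.

```r
# Values copied from the unscaled output above (US dollars).
ngdp_usd <- c("2025-01-01" = 3.050722e13, "2026-01-01" = 3.171764e13)

# Divide by one billion to express the series in billions of dollars.
ngdp_bn <- ngdp_usd / 1e9

# Print without scientific notation.
print(format(ngdp_bn, nsmall = 2, scientific = FALSE))
```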
Now take a closer look at a specific frequency.
#Download this same data, but at a quarterly frequency
df <- imf_datatools$get_ecos_sdmx_data('WEO_WEO_PUBLISHED', '111', 'NGDP', scale='ecos', freq='Q')
#Quickly glance at the tail end of the data
tail(df)
111NGDP.Q
2025-07-01 30680.70
2025-10-01 31006.97
2026-01-01 31277.33
2026-04-01 31572.19
2026-07-01 31865.61
2026-10-01 32155.45
Finally, download multiple countries and series at once.
#Download multiple countries and series in one command from ECOS
df <- imf_datatools$get_ecos_sdmx_data('WEO_WEO_PUBLISHED', c('111', '193'), c('NGDP', 'GGR'), scale='ecos')
#Quickly glance at the tail end of the data
tail(df)
111GGR.A 111NGDP.A 193GGR.A 193NGDP.A
2025-01-01 9576.032 30507.22 1020.056 2818.239
2026-01-01 10312.742 31717.64 1071.168 2945.565
2027-01-01 10789.839 32941.71 1118.061 3085.038
2028-01-01 11129.959 34342.13 1166.897 3229.983
2029-01-01 11507.495 35712.82 1221.181 3380.686
2030-01-01 11981.010 37153.09 1278.539 3538.743
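The column names in the output above appear to follow the pattern country code, indicator, a period, then frequency. As a rough sketch, the expected names for a request can be generated in base R. The pattern is inferred from this one output and is not a documented rule.

```r
countries  <- c("111", "193")
indicators <- c("NGDP", "GGR")
freq       <- "A"

# All country/indicator combinations, labelled <country><indicator>.<freq>.
combos <- expand.grid(indicator = indicators, country = countries,
                      stringsAsFactors = FALSE)
col_names <- sort(paste0(combos$country, combos$indicator, ".", freq))

print(col_names)
```

Names built this way can help when selecting a single series from a wide result.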
Using Haver data
For Haver data, make sure to specify the series and the database, such as GDP@USECON.
#Download Haver series for one country
df <- imf_datatools$get_haver_data('GDP@USECON')
#Quickly glance at the tail end of the data
tail(df)
GDP@USECON
2023-10-01 28297.0
2024-01-01 28624.1
2024-04-01 29016.7
2024-07-01 29374.9
2024-10-01 29723.9
2025-01-01 29962.0
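A Haver code joins the series and the database with an @ sign. The following base R sketch splits a code into its two parts and builds further codes for the same database. It is an illustration only and does not query Haver.

```r
# Split a Haver code of the form series@database into its two parts.
code  <- "GDP@USECON"
parts <- strsplit(code, "@", fixed = TRUE)[[1]]

series   <- parts[1]
database <- parts[2]

# Build several codes for the same database in one step.
codes <- paste0(c("PJ4", "PCUP"), "@", database)
print(codes)
```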
You can also download multiple series in one command, but make sure they have compatible scales.
#Download multiple series in one command
df <- imf_datatools$get_haver_data(c('PJ4@USECON', 'PCUP@USECON'))
#Quickly glance at the tail end of the data
tail(df)
PJ4@USECON PCUP@USECON
2024-12-01 0.5 0.4
2025-01-01 0.7 0.5
2025-02-01 0.1 0.2
2025-03-01 -0.1 -0.1
2025-04-01 -0.2 0.2
2025-05-01 0.1 0.1
If you need more info on the scales, you can easily browse the metadata for an indicator.
#Obtain details on the metadata for a particular series
metadata <- imf_datatools$haver_utilities$get_haver_metadata('GDP@USECON')
#Quickly glance at the tail end of the metadata
tail(metadata)
database startdate enddate frequency
gdp usecon 1947-03-31 2025-03-31 Q
descriptor numobs datetimemod magnitude
gdp Gross Domestic Product (SAAR, Bil.$) 313 2025-06-26 12:32:00 9
decprecision diftype aggtype datatype group geography1 geography2
gdp 1 0 AVG US$ N01 111
shortsource longsource
gdp BEA Bureau of Economic Analysis
Using World Bank data
You can also access World Bank data, but make sure to use the three-letter country abbreviation (for example, USA) instead of the IMF country code.
# Download series for population in USA from World Bank data
df <- imf_datatools$get_worldbank_data('SP.POP.TOTL', 'USA')
#Quickly glance at the tail end of the data
tail(df)
USA.SP.POP.TOTL
2019-01-01 330226227
2020-01-01 331577720
2021-01-01 332099760
2022-01-01 334017321
2023-01-01 336806231
2024-01-01 340110988
Using Bloomberg data
Bloomberg data that is typically accessible via ECData can also be retrieved. Make sure to use the correct ticker and field names.
#Download daily VIX Index Bloomberg data
df <- imf_datatools$get_ecos_bloomberg_data('VIX Index', 'PX_LAST')
#Quickly glance at the tail end of the data
tail(df)
VIX IndexPX_LAST.D
2025-07-02 16.64
2025-07-03 16.38
2025-07-04 17.48
2025-07-05 NaN
2025-07-06 NaN
2025-07-07 17.79
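The daily output above contains NaN rows, and both fall on a weekend. As a minimal sketch, such rows can be dropped in base R. The data frame below is a hand-made copy of the output, and the column name value is made up for the illustration.

```r
# Mock of the output above: a one-column data frame indexed by date.
vix <- data.frame(
  value = c(16.64, 16.38, 17.48, NaN, NaN, 17.79),
  row.names = c("2025-07-02", "2025-07-03", "2025-07-04",
                "2025-07-05", "2025-07-06", "2025-07-07")
)

# Keep only the rows that carry an observation.
vix_obs <- vix[!is.nan(vix$value), , drop = FALSE]

# Weekday of each dropped date ("6" = Saturday, "0" = Sunday in %w).
dropped <- setdiff(rownames(vix), rownames(vix_obs))
print(format(as.Date(dropped), "%w"))
```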
You have now successfully downloaded and installed IMF Datatools, understood the available datasets, and downloaded relevant series.
Using iData
The new Fund-wide iData (2025) will replace EcOS and EDI, and can be accessed via the reticulate package. It differs slightly from the other sources in that it includes both publicly available and private datasets. Let’s explore those below.
To access publicly available datasets, specify that no token is needed by setting PRIVATE to FALSE. Let’s see how to list the public datasets in iData.
#alias the imported Python module for brevity of calls
dt <- imf_datatools
#Public datasets (no token needed)
dt$idata_utilities$PRIVATE <- FALSE
head(dt$idata_utilities$get_databases())
name
IMF.STA:PPI Producer Price Index (PPI)
IMF.STA:WPCPER Crypto-based Parallel Exchange Rates (Working Paper dataset WP-CPER)
IMF.FAD:PSBS Public Sector Balance Sheet (PSBS)
IMF.STA:GFS_SFCP GFS Stocks and Flows by Counterparty
IMF.RES:WEO World Economic Outlook (WEO)
IMF.RES:EQ Export Quality (EQ)
Agency ID Resource ID Latest Version Unique ID
IMF.STA:PPI IMF.STA PPI 3.0.0 IMF.STA:PPI(3.0.0)
IMF.STA:WPCPER IMF.STA WPCPER 5.0.1 IMF.STA:WPCPER(5.0.1)
IMF.FAD:PSBS IMF.FAD PSBS 2.0.0 IMF.FAD:PSBS(2.0.0)
IMF.STA:GFS_SFCP IMF.STA GFS_SFCP 10.0.0 IMF.STA:GFS_SFCP(10.0.0)
IMF.RES:WEO IMF.RES WEO 6.0.0 IMF.RES:WEO(6.0.0)
IMF.RES:EQ IMF.RES EQ 2.0.0 IMF.RES:EQ(2.0.0)
You can also list the private datasets that you have access to.
#Private datasets based on your access level
dt$idata_utilities$PRIVATE <- TRUE
head(dt$idata_utilities$get_databases(refresh = TRUE))
name
IMF.AFR:AFRREO Sub-Saharan Africa Regional Economic Outlook (AFRREO)
IMF.STA:MFS_OFC Monetary and Financial Statistics (MFS_OFC): Other Financial Corporations
IMF.RES:FSI Financial Stress Index (FSI)
IMF.STA:PI Production Indexes (PI)
IMF.FAD:HPD Historical Public Debt (HPD)
IMF.MCD:MCDREO Middle East and Central Asia Regional Economic Outlook (MCDREO)
Agency ID Resource ID Latest Version Unique ID
IMF.AFR:AFRREO IMF.AFR AFRREO 6.0.1 IMF.AFR:AFRREO(6.0.1)
IMF.STA:MFS_OFC IMF.STA MFS_OFC 6.0.0 IMF.STA:MFS_OFC(6.0.0)
IMF.RES:FSI IMF.RES FSI 3.0.1 IMF.RES:FSI(3.0.1)
IMF.STA:PI IMF.STA PI 2.0.0 IMF.STA:PI(2.0.0)
IMF.FAD:HPD IMF.FAD HPD 1.0.0 IMF.FAD:HPD(1.0.0)
IMF.MCD:MCDREO IMF.MCD MCDREO 7.0.1 IMF.MCD:MCDREO(7.0.1)
Once you’ve picked a database (e.g. "IMF.STA:CPI"), you can also check its structure.
get_dimensions() lists all the ways you can slice your data, e.g., by country or frequency.
dims <- dt$idata_utilities$get_dimensions("IMF.STA:CPI")
print(dims)
                                 Description Dimension Order
COUNTRY                              Country               0
INDEX_TYPE                        Index type               1
COICOP_1999             Expenditure Category               2
TYPE_OF_TRANSFORMATION Type of Transformation              3
FREQUENCY                          Frequency               4
get_dimension_values() provides the full list of entries within a specific dimension, e.g., Country.
vals <- dt$idata_utilities$get_dimension_values("IMF.STA:CPI", "COUNTRY")
Retrieving a series
Once you have identified your dataset and explored its dimensions, you can build a query key by joining the dimension codes with periods, in dimension order. Leave a dimension’s position empty to include all of its values, as shown below.
# Example: CPI level for USA & JPN monthly (_T = level, M = monthly)
key <- "USA+JPN.CPI._T..M"
df <- dt$idata_utilities$get_idata_data("IMF.STA:CPI", key)
tail(df)
JPN.CPI._T.IX.M JPN.CPI._T.POP_PCH_PA_PT.M JPN.CPI._T.SRP_2010_IX.M
2024-12-01 110.7 0.6363636 116.7516
2025-01-01 111.2 0.4516712 117.2790
2025-02-01 110.8 -0.3597122 116.8571
2025-03-01 111.1 0.2707581 117.1735
2025-04-01 111.5 0.3600360 117.5954
2025-05-01 111.8 0.2690583 117.9118
JPN.CPI._T.YOY_PCH_PA_PT.M USA.CPI._T.IX.M
2024-12-01 3.651685 144.7361
2025-01-01 4.022451 145.6836
2025-02-01 3.648269 146.3306
2025-03-01 3.638060 146.6595
2025-04-01 3.528319 147.1162
2025-05-01 3.422757 147.4235
USA.CPI._T.POP_PCH_PA_PT.M USA.CPI._T.SRP_2010_IX.M
2024-12-01 0.0355000 144.7361
2025-01-01 0.6546157 145.6836
2025-02-01 0.4441702 146.3306
2025-03-01 0.2247071 146.6595
2025-04-01 0.3114456 147.1162
2025-05-01 0.2088561 147.4235
USA.CPI._T.YOY_PCH_PA_PT.M
2024-12-01 2.888057
2025-01-01 3.000483
2025-02-01 2.821549
2025-03-01 2.390725
2025-04-01 2.311289
2025-05-01 2.354897
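The key used above can also be assembled programmatically. The sketch below uses a hypothetical helper, build_key, which is not part of IMF Datatools. It joins the codes within a dimension with "+" and the dimensions with ".", following the pattern of the example key.

```r
# Hypothetical helper: build an iData key from one vector of codes per dimension.
# Codes within a dimension are joined with "+", dimensions with ".".
# An empty vector leaves the position blank, which selects all values.
build_key <- function(...) {
  dims <- list(...)
  parts <- vapply(dims, function(codes) paste(codes, collapse = "+"),
                  character(1))
  paste(parts, collapse = ".")
}

key <- build_key(
  COUNTRY = c("USA", "JPN"),
  INDEX_TYPE = "CPI",
  COICOP_1999 = "_T",
  TYPE_OF_TRANSFORMATION = character(0),
  FREQUENCY = "M"
)
print(key)
```

The arguments must be supplied in the dimension order reported for the dataset. The argument names are only labels for readability.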
To obtain data in long format, add longformat = TRUE.
# Long
lf <- dt$idata_utilities$get_idata_data("IMF.STA:CPI", key, longformat = TRUE)
tail(lf)
     COUNTRY INDEX_TYPE COICOP_1999 TYPE_OF_TRANSFORMATION FREQUENCY      dates
6729     USA        CPI          _T          YOY_PCH_PA_PT         M 2024-12-01
6730     USA        CPI          _T          YOY_PCH_PA_PT         M 2025-01-01
6731     USA        CPI          _T          YOY_PCH_PA_PT         M 2025-02-01
6732     USA        CPI          _T          YOY_PCH_PA_PT         M 2025-03-01
6733     USA        CPI          _T          YOY_PCH_PA_PT         M 2025-04-01
6734     USA        CPI          _T          YOY_PCH_PA_PT         M 2025-05-01
     OBS_VALUE
6729  2.888057
6730  3.000483
6731  2.821549
6732  2.390725
6733  2.311289
6734  2.354897
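The dimension columns of the long format correspond to the period-separated pieces of the wide-format column names. The base R sketch below splits two column names from the earlier wide output into those pieces. It only illustrates the naming and does not reproduce how the library builds the long format.

```r
# Dimension names in key order, as listed by get_dimensions() earlier.
dim_names <- c("COUNTRY", "INDEX_TYPE", "COICOP_1999",
               "TYPE_OF_TRANSFORMATION", "FREQUENCY")

# Two wide-format column names taken from the earlier output.
wide_cols <- c("USA.CPI._T.YOY_PCH_PA_PT.M", "JPN.CPI._T.IX.M")

# Split each name on "." and bind the pieces into one row per series.
pieces <- strsplit(wide_cols, ".", fixed = TRUE)
series_dims <- as.data.frame(do.call(rbind, pieces), stringsAsFactors = FALSE)
names(series_dims) <- dim_names

print(series_dims)
```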
To pivot by a dimension (e.g. COUNTRY), use panel = "COUNTRY".
# Panel by country
pf <- dt$idata_utilities$get_idata_data("IMF.STA:CPI", key, panel = "COUNTRY")
tail(pf)
     COUNTRY      dates CPI._T.IX.M CPI._T.POP_PCH_PA_PT.M CPI._T.SRP_2010_IX.M
1685     USA 2024-12-01    144.7361              0.0355000             144.7361
1686     USA 2025-01-01    145.6836              0.6546157             145.6836
1687     USA 2025-02-01    146.3306              0.4441702             146.3306
1688     USA 2025-03-01    146.6595              0.2247071             146.6595
1689     USA 2025-04-01    147.1162              0.3114456             147.1162
1690     USA 2025-05-01    147.4235              0.2088561             147.4235
     CPI._T.YOY_PCH_PA_PT.M
1685               2.888057
1686               3.000483
1687               2.821549
1688               2.390725
1689               2.311289
1690               2.354897
IMF Datatools allows you to easily browse, inspect, and retrieve IMF datasets in R.