Accessing IMF PortWatch Data Using an API

This tutorial will teach you how to query large datasets from the IMF’s PortWatch platform using R and the ArcGIS REST API. It was designed with Mario Saraiva and Alessandra Sozzi, members of the IMF’s PortWatch team.

Setup

Start by loading the required packages:

library(httr)
library(jsonlite)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(glue)
library(data.table)

Attaching package: 'data.table'
The following objects are masked from 'package:dplyr':

    between, first, last

Define Available Datasets

PortWatch datasets are hosted on ArcGIS Online. Each dataset has a unique service name. It is important to explore and specify the datasets to query. In our case, we will use PortWatch’s Daily_Trade_Data (ports) and Daily_Chokepoints_Data (key maritime passages).

# Datasets
chokepoints.url <- "Daily_Chokepoints_Data"
ports.url <- "Daily_Trade_Data"

Build Dynamic API URLs

Each dataset has a unique name that is inserted into a standard REST API URL structure. Instead of hardcoding URLs, we define a flexible function that takes a dataset name and returns the corresponding API endpoint.

# Function to compose dataset URL
get_api_url <- function(dataset) {
  base <- glue("https://services9.arcgis.com/weJ1QsnbMYJlCHdG/arcgis/rest/services/{dataset}/FeatureServer/0/query")
  return(base)
}

For example, to retrieve the URL for the Daily Trade Data:

get_api_url("Daily_Trade_Data")
https://services9.arcgis.com/weJ1QsnbMYJlCHdG/arcgis/rest/services/Daily_Trade_Data/FeatureServer/0/query
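
To see what a complete request looks like before sending it, the endpoint can be combined with URL-encoded query parameters. The sketch below uses only base R and does not contact the server; `build_query_url()` is a hypothetical helper name introduced for illustration, and the parameter values are ones used later in this tutorial.

```r
# Endpoint for the chokepoints dataset, written out in full
endpoint <- paste0(
  "https://services9.arcgis.com/weJ1QsnbMYJlCHdG/arcgis/rest/services/",
  "Daily_Chokepoints_Data/FeatureServer/0/query"
)

# Hypothetical helper: append URL-encoded query parameters to an endpoint
build_query_url <- function(url, params) {
  encoded <- vapply(
    params,
    function(value) utils::URLencode(as.character(value), reserved = TRUE),
    character(1)
  )
  paste0(url, "?", paste0(names(params), "=", encoded, collapse = "&"))
}

preview <- build_query_url(
  endpoint,
  list(where = "portid='chokepoint1'", returnCountOnly = "true", f = "json")
)
print(preview)
```

Pasting the printed URL into a browser is a quick way to confirm that a filter is valid before running a long extraction.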

Define API Query Helpers

To retrieve data from the ArcGIS API, we make repeated GET requests with specific query parameters (like filters and output fields). Instead of repeating the same request logic multiple times, we define a wrapper function around httr::GET() that sends the request and parses the response into a JSON object.

This function simplifies the rest of our workflow by abstracting away the raw HTTP logic:

# Function to send a GET request and parse the JSON response
get_api_data <- function(url, params) {
  response <- GET(url, query = params)
  # Convert the raw response body to text, then parse it as JSON
  parsed <- fromJSON(rawToChar(response$content))
  return(parsed)
}

This helper allows us to pass any combination of where, outFields, and other query parameters to the API and returns a result ready for processing.
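
The helper returns whatever the server sends back, including error payloads. ArcGIS REST services commonly report problems (for example, an invalid where clause) as a JSON body containing an `error` object with a `code` and a `message`, so inspecting the parsed result can save debugging time. A minimal sketch, assuming the parsed response is a list such as the one fromJSON() returns; `check_arcgis_result()` is a hypothetical name and the two example lists are hand-written stand-ins, not real API output.

```r
# Hypothetical helper: stop with a readable message if the parsed response
# carries an ArcGIS-style error object, otherwise return it unchanged
check_arcgis_result <- function(parsed) {
  if (!is.null(parsed$error)) {
    stop(
      sprintf("ArcGIS error %s: %s", parsed$error$code, parsed$error$message),
      call. = FALSE
    )
  }
  parsed
}

# Hand-written lists standing in for parsed API responses
ok_result  <- list(count = 2351)
bad_result <- list(error = list(code = 400, message = "Invalid query parameters."))

print(check_arcgis_result(ok_result)$count)

# Capture the error message instead of halting the script
outcome <- tryCatch(
  check_arcgis_result(bad_result),
  error = function(e) conditionMessage(e)
)
print(outcome)
```

Wrapping the output of get_api_data() in a check like this would be one way to fail early rather than silently collecting empty batches.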

Creating query_portwatch()

Because an ArcGIS Feature Service can hold millions of records, we build a core function, query_portwatch(), to handle large-scale data extraction from PortWatch. It works in three main steps:

  1. It first queries the API to get the total number of available records.

  2. Then, it loops through the data in batches (up to 5,000 records at a time), appending each batch to a unified result.

  3. Lastly, it returns a cleaned data.table with parsed date values.
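
The batching in step 2 comes down to simple offset arithmetic, sketched below in base R. `batch_offsets()` is a hypothetical helper used only to illustrate the idea; the record counts 2351 and 258390 are the ones reported by the query examples later in this tutorial.

```r
# Offsets at which each batch starts, given the record count and batch size
batch_offsets <- function(total_records, batch_size = 5000) {
  if (total_records <= 0) return(numeric(0))
  seq(0, total_records - 1, by = batch_size)
}

# A small result fits in a single batch starting at offset 0
print(batch_offsets(2351))

# A large result needs one request per 5,000 records
us_offsets <- batch_offsets(258390)
print(length(us_offsets))
print(format(max(us_offsets), scientific = FALSE))
```

Stopping the sequence at `total_records - 1` avoids an extra, empty request when the record count is an exact multiple of the batch size.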

# Function to query PortWatch data
query_portwatch <- function(url, where = "1=1", maxRecordCountFactor = 5, outFields = "*") {
  
  batch_size <- maxRecordCountFactor * 1000
  
  # Step 1: Get total record count
  params_initial <- list(where = where, returnCountOnly = TRUE, f = "json")
  total_records <- GET(url, query = params_initial) %>%
    content("parsed") %>%
    .$count
  
  print(paste0("Begin extraction of ", total_records, " records..."))
  
  # Prepare to store results
  all_results <- list()
  params_query <- list(where = where, outFields = outFields, f = "json", maxRecordCountFactor = maxRecordCountFactor)
  
  # Step 2: Batch fetch
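  # Stop at the last offset that still contains records
  for (offset in seq(0, max(total_records - 1, 0), by = batch_size)) {
    print(paste0("Extracting batch from record ", offset, "..."))
    # Pass the offset as a plain digit string so that large values such as
    # 100000 are not written in scientific notation ("1e+05") in the request
    params_query$resultOffset <- format(offset, scientific = FALSE)
    
    result <- get_api_data(url, params_query)
    # result$features is a data frame with a single 'attributes' column,
    # so this prints 1 for every non-empty batch
    print('Length of result$features:')
    print(length(result$features))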
    
    if (length(result$features) > 0) {
      df_batch <- as.data.frame(result$features$attributes)
      all_results[[length(all_results) + 1]] <- df_batch
    } else {
      break
    }
    Sys.sleep(1)
  }
  
  final_df <- rbindlist(all_results, fill = TRUE)
  
  if ("date" %in% colnames(final_df)) {
    final_df$date <- as.POSIXct(final_df$date / 1000, origin = "1970-01-01", tz = "UTC")
    final_df <- final_df %>% arrange(date)
  }
  
  return(final_df)
}

You can now use this function to extract data from any PortWatch-compatible ArcGIS service with just a few lines of code.
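
The date handling at the end of the function relies on the API returning the date field as milliseconds since 1970-01-01 UTC, whereas as.POSIXct() expects seconds. The short example below shows that conversion in isolation; the two millisecond values are illustrative inputs chosen to correspond to the first two dates in the dataset, not values copied from an API response.

```r
# Date fields arrive as epoch milliseconds; as.POSIXct() expects seconds
epoch_ms <- c(1546300800000, 1546387200000)

dates <- as.POSIXct(epoch_ms / 1000, origin = "1970-01-01", tz = "UTC")
print(format(dates, "%Y-%m-%d"))
# → "2019-01-01" "2019-01-02"
```

Setting `tz = "UTC"` keeps the printed dates independent of the time zone of the machine running the code.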

Query Examples

We will now show you how to use query_portwatch() with real-world filters to retrieve data from the PortWatch API. Each example corresponds to a typical use case: chokepoint activity, individual port traffic, or country-level trade data.

Query a Chokepoint (e.g., Suez Canal)

The Daily_Chokepoints_Data dataset includes maritime chokepoints such as the Suez Canal. You can query by portid, using values like 'chokepoint1', 'chokepoint2', etc. The query below uses 'chokepoint1', the Suez Canal, and returns daily vessel activity and trade volume passing through it.

# 1. Query Suez Canal chokepoint
ck1 <- query_portwatch(
  url = get_api_url(chokepoints.url),
  where = "portid='chokepoint1'"
)
[1] "Begin extraction of 2351 records..."
[1] "Extracting batch from record 0..."
[1] "Length of result$features:"
[1] 1
head(ck1)
         date  year month   day      portid   portname n_container n_dry_bulk
       <POSc> <int> <int> <int>      <char>     <char>       <int>      <int>
1: 2019-01-01  2019     1     1 chokepoint1 Suez Canal          21         21
2: 2019-01-02  2019     1     2 chokepoint1 Suez Canal          24          4
3: 2019-01-03  2019     1     3 chokepoint1 Suez Canal          13         14
4: 2019-01-04  2019     1     4 chokepoint1 Suez Canal          17         11
5: 2019-01-05  2019     1     5 chokepoint1 Suez Canal          20          9
6: 2019-01-06  2019     1     6 chokepoint1 Suez Canal          13         12
   n_general_cargo n_roro n_tanker n_cargo n_total capacity_container
             <int>  <int>    <int>   <int>   <int>              <num>
1:              15      6       23      63      86            1155533
2:               5      6       10      39      49            1534925
3:               9      2       23      38      61             555097
4:               2      2       14      32      46            1044185
5:               1      1       13      31      44            1218925
6:               6      0       18      31      49             832089
   capacity_dry_bulk capacity_general_cargo capacity_roro capacity_tanker
               <num>                  <num>         <num>           <num>
1:          754935.5             158669.745     21900.778       1355622.5
2:          227315.8               8288.384     37934.295        518306.2
3:          977689.8              39398.171     10829.719        804257.6
4:          430139.7               3673.345      3430.301        759384.5
5:          777548.1               4587.562      5567.024        628481.1
6:          664305.1              27559.968         0.000        314955.6
   capacity_cargo capacity ObjectId
            <num>    <num>    <int>
1:        2091039  3446662        1
2:        1808463  2326769        2
3:        1583015  2387272        3
4:        1481428  2240813        4
5:        2006628  2635109        5
6:        1523954  1838910        6
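
A quick sanity check on a fresh extract is to confirm that the aggregate columns agree with their components. The sketch below copies the vessel-count columns from the six rows printed above and tests two relationships that hold in those rows; whether they hold for every record in the dataset is an assumption to verify on the full table.

```r
# Vessel-count columns from the six rows of head(ck1) shown above
ck1_head <- data.frame(
  n_container     = c(21, 24, 13, 17, 20, 13),
  n_dry_bulk      = c(21, 4, 14, 11, 9, 12),
  n_general_cargo = c(15, 5, 9, 2, 1, 6),
  n_roro          = c(6, 6, 2, 2, 1, 0),
  n_tanker        = c(23, 10, 23, 14, 13, 18),
  n_cargo         = c(63, 39, 38, 32, 31, 31),
  n_total         = c(86, 49, 61, 46, 44, 49)
)

cargo_parts <- c("n_container", "n_dry_bulk", "n_general_cargo", "n_roro")

# n_cargo should equal the sum of the four cargo vessel types
cargo_ok <- all(rowSums(ck1_head[cargo_parts]) == ck1_head$n_cargo)

# n_total should equal cargo vessels plus tankers
total_ok <- all(ck1_head$n_cargo + ck1_head$n_tanker == ck1_head$n_total)

print(c(cargo_ok = cargo_ok, total_ok = total_ok))
```

The same two comparisons can be run directly on ck1 once the full extract is in memory.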

Query a Specific Port (e.g., Rotterdam)

You can target a specific commercial port using the Daily_Trade_Data dataset. Each port has a unique portid identifier; in this case, 'port1114' for Rotterdam. This will return time series data for vessel calls, imports, and exports for Rotterdam.

# 2. Query Rotterdam port (port1114)
port1114 <- query_portwatch(
  url = get_api_url(ports.url),
  where = "portid='port1114'"
)
[1] "Begin extraction of 2349 records..."
[1] "Extracting batch from record 0..."
[1] "Length of result$features:"
[1] 1
head(port1114)
         date  year month   day   portid  portname         country   ISO3
       <POSc> <int> <int> <int>   <char>    <char>          <char> <char>
1: 2019-01-01  2019     1     1 port1114 Rotterdam The Netherlands    NLD
2: 2019-01-02  2019     1     2 port1114 Rotterdam The Netherlands    NLD
3: 2019-01-03  2019     1     3 port1114 Rotterdam The Netherlands    NLD
4: 2019-01-04  2019     1     4 port1114 Rotterdam The Netherlands    NLD
5: 2019-01-05  2019     1     5 port1114 Rotterdam The Netherlands    NLD
6: 2019-01-06  2019     1     6 port1114 Rotterdam The Netherlands    NLD
   portcalls_container portcalls_dry_bulk portcalls_general_cargo
                 <int>              <int>                   <int>
1:                  47                 11                      40
2:                  12                  0                      22
3:                  27                  1                      13
4:                  20                  2                      14
5:                  17                  4                       7
6:                  18                  3                       8
   portcalls_roro portcalls_tanker portcalls_cargo portcalls import_container
            <int>            <int>           <int>     <int>            <num>
1:             10              176             108       284        276024.38
2:              1               42              35        77        292660.59
3:              8               49              49        98        257497.92
4:              7               46              43        89        159704.55
5:              4               59              32        91        134572.34
6:              4               44              33        77         58048.29
   import_dry_bulk import_general_cargo import_roro import_tanker import_cargo
             <num>                <num>       <num>         <num>        <num>
1:       520863.95            28920.810      0.0000     1160914.4     825809.1
2:            0.00            20610.609    403.7565      361886.0     313675.0
3:        29763.03             3190.438   2212.2034      644234.5     292663.6
4:        84292.14             5485.041   2100.2969      267727.9     251582.0
5:       258645.16             4581.078   4083.5841      984177.3     401882.2
6:        55223.17             2852.193      0.0000      300890.8     116123.7
      import export_container export_dry_bulk export_general_cargo export_roro
       <num>            <num>           <num>                <num>       <num>
1: 1986723.6        348997.63           0.000            24520.746    7052.115
2:  675560.9         44664.53           0.000             9396.785       0.000
3:  936898.1        152092.64           0.000             7888.683    5535.969
4:  519309.9        129285.05           0.000             5991.319    1220.126
5: 1386059.5        171126.13           0.000             1077.742    3660.378
6:  417014.4        149982.52        1262.953             3210.253    2171.112
   export_tanker export_cargo   export ObjectId
           <num>        <num>    <num>    <int>
1:     493409.37    380570.49 873979.9   298024
2:      57301.03     54061.32 111362.3   298025
3:      93818.74    165517.30 259336.0   298026
4:     125372.83    136496.49 261869.3   298027
5:     157996.88    175864.25 333861.1   298028
6:     237929.13    156626.84 394556.0   298029

View the full list of portids here.

Query All U.S. Ports (with selected fields)

You can extract trade data for all ports in a specific country using the ISO3 country code (e.g., "USA" for the United States). To keep results focused, this example limits the output to four key fields. This will return one row per port per day with port calls, imports, and exports for U.S. ports; because portid is not among the selected fields, add it to outFields if you need to tell the ports apart.

# 3. Query all ports in the USA with select fields
us_ports <- query_portwatch(
  url = get_api_url(ports.url),
  outFields = "date,portcalls,import,export",
  where = "ISO3='USA'"
)
[1] "Begin extraction of 258390 records..."
[1] "Extracting batch from record 0..."
[1] "Length of result$features:"
[1] 1
[1] "Extracting batch from record 5000..."
[1] "Length of result$features:"
[1] 1
[... output truncated: the same three lines repeat for each offset from 10000 to 250000, in steps of 5000 ...]
[1] "Extracting batch from record 255000..."
[1] "Length of result$features:"
[1] 1
head(us_ports)
         date portcalls      import   export
       <POSc>     <int>       <num>    <num>
1: 2019-01-01         1    773.8445 10138.03
2: 2019-01-01         1   6651.9296     0.00
3: 2019-01-01        10 140714.2323 71887.93
4: 2019-01-01         1   5010.4337     0.00
5: 2019-01-01         1   7466.1084     0.00
6: 2019-01-01         0      0.0000     0.00
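
The result holds one row per port per day (the six rows above all share the same date), so a country-level series requires summing within each date. A minimal sketch using base R's aggregate() on the six rows printed above; these rows are only the head of the table, so the totals below are not the true national totals for that day.

```r
# The six rows of head(us_ports) shown above, all dated 2019-01-01
us_head <- data.frame(
  date      = as.Date(rep("2019-01-01", 6)),
  portcalls = c(1, 1, 10, 1, 1, 0),
  import    = c(773.8445, 6651.9296, 140714.2323, 5010.4337, 7466.1084, 0),
  export    = c(10138.03, 0, 71887.93, 0, 0, 0)
)

# Sum each measure within each date
daily_totals <- aggregate(
  cbind(portcalls, import, export) ~ date,
  data = us_head,
  FUN = sum
)
print(daily_totals)
```

On the full data.table returned by query_portwatch(), `us_ports[, lapply(.SD, sum), by = date]` performs the same aggregation.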

Creating your own filters with field names

To customize your queries, it’s helpful to inspect the available field names for the dataset. This ensures your where clauses match real column names.

# Inspect available fields to construct your own filters
meta_url <- "https://services9.arcgis.com/weJ1QsnbMYJlCHdG/arcgis/rest/services/Daily_Trade_Data/FeatureServer/0?f=json"
meta_resp <- GET(meta_url)
fields <- fromJSON(content(meta_resp, as = "text", encoding = "UTF-8"))

# List available field names
field_names <- fields$fields$name
print(field_names)
 [1] "date"                    "year"                   
 [3] "month"                   "day"                    
 [5] "portid"                  "portname"               
 [7] "country"                 "ISO3"                   
 [9] "portcalls_container"     "portcalls_dry_bulk"     
[11] "portcalls_general_cargo" "portcalls_roro"         
[13] "portcalls_tanker"        "portcalls_cargo"        
[15] "portcalls"               "import_container"       
[17] "import_dry_bulk"         "import_general_cargo"   
[19] "import_roro"             "import_tanker"          
[21] "import_cargo"            "import"                 
[23] "export_container"        "export_dry_bulk"        
[25] "export_general_cargo"    "export_roro"            
[27] "export_tanker"           "export_cargo"           
[29] "export"                  "ObjectId"               
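
With the field names in hand, an outFields string can be checked before any request is sent. The sketch below is one way to do that in base R; `unknown_fields()` is a hypothetical helper, and `known_fields` is a shortened copy of the list printed above rather than a fresh API result.

```r
# A subset of the field names printed above for Daily_Trade_Data
known_fields <- c("date", "year", "month", "day", "portid", "portname",
                  "country", "ISO3", "portcalls", "import", "export", "ObjectId")

# Hypothetical helper: return requested fields that the layer does not have
unknown_fields <- function(out_fields, known) {
  requested <- trimws(strsplit(out_fields, ",", fixed = TRUE)[[1]])
  setdiff(requested, known)
}

# All four fields exist, so nothing is reported
print(unknown_fields("date,portcalls,import,export", known_fields))

# A misspelled field name is caught before querying
print(unknown_fields("date, portcalls, imports", known_fields))
```

In practice the `known` argument would be the full field_names vector retrieved from the metadata endpoint.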

Here are a few filter examples you can plug into your query_portwatch() call:

where = "country = 'China'"           # Filter by country name
where = "portid = 'port1207'"         # Filter by specific port
where = "ISO3 = 'BRA'"                # Filter by ISO3 country code
where = "portcalls > 10"              # Filter by numeric threshold
where = "year = 2024 AND ISO3 = 'IND'" # Combine conditions
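
When filters are assembled from variables rather than typed by hand, quoting mistakes are easy to make. The sketch below shows one possible approach: string values are wrapped in single quotes with embedded quotes doubled, as in standard SQL, and conditions are joined with AND. Both helper names are hypothetical, and the value containing an apostrophe is purely illustrative rather than a confirmed entry in the dataset.

```r
# Hypothetical helper: wrap a string value in single quotes, doubling any
# single quote it contains
sql_string <- function(x) paste0("'", gsub("'", "''", x, fixed = TRUE), "'")

# Hypothetical helper: join several conditions with AND
and_all <- function(...) paste(c(...), collapse = " AND ")

where <- and_all(
  paste0("ISO3 = ", sql_string("IND")),
  "year = 2024",
  "portcalls > 10"
)
print(where)

# An apostrophe inside a value is escaped rather than ending the string early
escaped <- sql_string("Cote d'Ivoire")
print(escaped)
```

The resulting string can be passed directly as the where argument of query_portwatch().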

You are now equipped to create your own custom queries in PortWatch.