Announcing the first complete R and Python libraries for comprehensive access to the IMF API

252 global economic databases with tens of thousands of indicators, accessible at your convenience

Mar 29, 2023

I’m very excited to announce the release of my first R and Python packages, imfr v2 and imfp v1. The R package is an update of a package initially created by Christopher Gandrud. It is available on GitHub and has been submitted for publication through the CRAN repository. The Python package is available through both GitHub and PyPI. To install imfr, use:

devtools::install_github("christophergandrud/imfr")

To install imfp, use:

pip install imfp

Over two hundred fifty databases at your fingertips

There have been a few half-hearted efforts to make a Python package for downloading data from the International Monetary Fund API, but there has never been a complete, well-documented package that provides comprehensive access to all the API’s databases and key features. imfr and imfp are designed to fill this need.

The International Monetary Fund API offers an incredible 252 databases, including some really important standbys such as International Financial Statistics and the Primary Commodity Price System. The latter, a database of global commodity prices, is also available through FRED. However, the IMF’s data is more up-to-date, with FRED merely repackaging the IMF data and making it available with a lag.

To view the available databases in R, use:

library(imfr)
View(imf_databases())

In Python, you may want to open the data frame in a browser window to allow for easy viewing. There are a couple ways to do that, but I like this implementation of a simple `View` function similar to RStudio’s:

import imfp
import tempfile
import webbrowser

df = imfp.imf_databases()

def View(df):
    html = df.to_html()
    with tempfile.NamedTemporaryFile('w', delete=False, suffix='.html') as f:
        url = 'file://' + f.name
        f.write(html)
    webbrowser.open(url)

View(df)
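One portability note on the helper above: gluing 'file://' onto a path works on macOS and Linux, but may not produce a valid URL on Windows. Here is a minimal sketch of a variant that lets pathlib build the URL. It assumes only the standard library plus any object with a `to_html()` method (such as a pandas data frame). The names `write_html_preview` and `view` are my own illustrations, not part of imfp:

```python
import tempfile
import webbrowser
from pathlib import Path


def write_html_preview(html: str) -> str:
    """Write an HTML string to a temporary file and return a file:// URI for it."""
    with tempfile.NamedTemporaryFile(
        "w", delete=False, suffix=".html", encoding="utf-8"
    ) as f:
        f.write(html)
    # Path.as_uri() builds a correctly escaped file URI on any platform.
    return Path(f.name).resolve().as_uri()


def view(df) -> None:
    """Open anything with a to_html() method in the default browser."""
    webbrowser.open(write_html_preview(df.to_html()))
```

Calling `view(df)` then behaves like the `View` function above. Splitting out `write_html_preview` just makes the file-writing step easy to check without launching a browser.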

Intuitive workflow

As you might imagine, querying the IMF’s 252 different databases can be quite complicated, since there are many different parameters used in different databases, and many different possible combinations of parameters. To solve this problem, the imfr and imfp packages introduce a workflow for exploring these databases and crafting your requests.

First, use the imf_databases function to get a full list of databases. Select a database of interest, and then use imf_parameter_defs to see a short list of parameters that can be used to query it. For a complete list of the values that each parameter can take, use imf_parameters to get a list (in R) or a dict (in Python) of data frames, one for each parameter. Finally, supply your parameters as arguments to imf_dataset to get a data frame of the series you want. For full instructions, I refer you to the READMEs on the GitHub repos linked above.

Example project

To illustrate how to request data from one of the more complex databases, we’ll take a look at the “Balance of Payments” database using imfp in Python:

import imfp  

df = imfp.imf_databases()
df[df['description'].str.contains("Balance of Payments")]

We find that there are in fact a whole bunch of Balance of Payments databases, but for our purposes we’ll just take the main one, with database_id ‘BOP’.

Next, we’ll get the parameters for this database:

params = imfp.imf_parameters('BOP')
params.keys()

Running params.keys() tells us that the parameter names for this database are ‘freq’, ‘ref_area’, and ‘indicator’. For each key, the params dict contains a data frame of valid values. We can view each data frame with the View function we defined above:

View(params['freq'])
View(params['ref_area'])
View(params['indicator'])

In the `indicator` data frame we find over 6,000 rows. Let’s assume we’re only interested in rows that report “Totals”. That lets us narrow things down a bit:

View(params['indicator'][params['indicator']['description'].str.contains('Total')])

You’ll want to do this for each parameter, but let’s skip ahead and assume we’ve looked at the data frames and manually picked out the codes we want. We want to look at annual data for the US and Brazil, and we want the net capital account and net current account in US dollars.

(Sidenote here for the uninitiated. The capital account is the net flow of investment transactions into an economy. The current account is net income from trade, i.e. exports minus imports, plus other cross-border income. Together, these are the two main components of the balance of payments.)

With parameter codes in hand, we craft our queries and make our database requests:

# BCA_BP6_USD is the net current account; BK_BP6_USD is the net capital account
cur_acc = imfp.imf_dataset(database_id='BOP', freq='A', indicator='BCA_BP6_USD', ref_area=['US', 'BR'])
cap_acc = imfp.imf_dataset(database_id='BOP', freq='A', indicator='BK_BP6_USD', ref_area=['US', 'BR'])

We can get a quick glimpse of what the capital account data frame looks like with cap_acc.head():

Sample Balance of Payments data frame from imfp library

For our purposes, we are primarily interested in the obs_value and time_period columns. We’ll also need ref_area to distinguish between US data and Brazil data. Note that the unit_mult column tells us that our obs_values are reported in millions. (The ‘6’ tells us to multiply by 1e6, or add 6 zeroes to our obs_value.)

All entries are received in string format, so for plotting we have to convert to numeric or datetime. We also divide by a thousand to convert from millions to billions.

cur_acc['time_period'] = cur_acc['time_period'].astype(int)
cur_acc['obs_value'] = cur_acc['obs_value'].astype(float)

cur_acc['obs_value'] = cur_acc['obs_value'] / 1e3

cap_acc['time_period'] = cap_acc['time_period'].astype(int)
cap_acc['obs_value'] = cap_acc['obs_value'].astype(float)

cap_acc['obs_value'] = cap_acc['obs_value'] / 1e3
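To make the unit_mult arithmetic concrete, here is a plain-Python sketch for a single observation. It assumes, as described above, that both obs_value and unit_mult arrive as strings and that unit_mult is a power-of-ten exponent. The helper name `to_billions` and the sample value are made up for illustration and are not real IMF data:

```python
def to_billions(obs_value: str, unit_mult: str) -> float:
    """Convert a string observation and its unit_mult exponent to billions."""
    # Scale to single units first: value * 10 ** exponent.
    units = float(obs_value) * 10 ** int(unit_mult)
    return units / 1e9


# Made-up observation: "1500" with unit_mult "6" means 1,500 million,
# which is the same as 1.5 billion.
print(to_billions("1500", "6"))
```

Dividing by 1e3, as in the snippet above, gives the same result whenever unit_mult is 6. Reading the exponent from the data instead would guard against a series reported in different units.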

And finally, we plot the capital account data using the seaborn package, with each ref_area represented by a different hue:

import seaborn as sns
sns.lineplot(data=cap_acc, x='time_period', y='obs_value', hue='ref_area')

This gives us the following lovely plot, showing the US’s rather large capital account deficit (i.e. net loss of investment) and the more modest deficit for Brazil.

Similarly, we plot the current account balance and find the US mostly running deficits, while Brazil recently ran a surplus:

Of course, for a serious analysis, you’d want to normalize to the size of a country’s economy for a side-by-side comparison like this.
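The normalization itself is simple arithmetic: divide the balance by GDP for the same country and year, and multiply by 100 to get a percentage. A minimal sketch, assuming both figures are already in the same currency and units. The function name `percent_of_gdp` and the numbers are hypothetical, chosen only to show the calculation:

```python
def percent_of_gdp(balance: float, gdp: float) -> float:
    """Express a balance as a percentage of GDP (both in the same units)."""
    if gdp == 0:
        raise ValueError("GDP must be non-zero")
    return 100 * balance / gdp


# Hypothetical figures in billions of US dollars, for illustration only:
# a 50 billion deficit against a 2,000 billion economy.
print(percent_of_gdp(-50.0, 2000.0))
```

With real data you would apply the same calculation row by row after matching each balance to the GDP figure for the same ref_area and time_period.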

As your homework assignment, you can download imfp or imfr and go find that statistic in the International Financial Statistics database, database_id ‘IFS’. (Or, to make things even more interesting, import the fredapi library and combine the IMF balance of payments data with the GDP series from FRED.)

Modeling Markets
