Announcing the first complete R and Python libraries for comprehensive access to the IMF API
252 global economic databases with tens of thousands of indicators, accessible at your convenience
I’m very excited to announce the release of my first R and Python packages, `imfr` v2 and `imfp` v1. The R package is an update of a package initially created by Christopher Gandrud. It is available on GitHub and has been submitted for publication through the CRAN repository. The Python package is available through both GitHub and PyPI. To install `imfr`, use:
devtools::install_github("christophergandrud/imfr")
To install `imfp`, use:
pip install imfp
Over two hundred fifty databases at your fingertips
There have been a few half-hearted efforts to make a Python package for downloading data from the International Monetary Fund API, but there has never been a complete, well-documented package that provides comprehensive access to all the API’s databases and key features. `imfr` and `imfp` are designed to fill this need.
The International Monetary Fund API offers an incredible 252 databases, including some really important standbys such as International Financial Statistics and the Primary Commodity Price System. The latter, a database of global commodity prices, is also available through FRED. However, the IMF’s data is more up-to-date, with FRED merely repackaging the IMF data and making it available with a lag.
To view the available databases in R, use:
library(imfr)
View(imf_databases())
In Python, you may want to open the data frame in a browser window for easy viewing. There are a couple of ways to do that, but I like this implementation of a simple `View` function similar to RStudio’s:
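import imfp
import tempfile
import webbrowser

df = imfp.imf_databases()

def View(df):
    """Open a data frame as an HTML table in the default browser."""
    html = df.to_html()
    # delete=False keeps the file on disk after the with block closes it
    with tempfile.NamedTemporaryFile('w', delete=False, suffix='.html') as f:
        url = 'file://' + f.name
        f.write(html)
    webbrowser.open(url)

View(df)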
Intuitive workflow
As you might imagine, querying the IMF’s 252 different databases can be quite complicated, since different databases use different parameters, and there are many possible combinations of parameters. To solve this problem, the `imfr` and `imfp` packages introduce a workflow for exploring these databases and crafting your requests.
First, use the `imf_databases` function to get a full list of databases. Select a database of interest, and then use `imf_parameter_defs` to see a short list of parameters that can be used to query it. For a complete list of the values that each parameter can take, use `imf_parameters` to get a list (in R) or a dict (in Python) of data frames, one for each parameter. Finally, supply your parameters as arguments to `imf_dataset` to get a data frame of the series you want. For full instructions, I refer you to the READMEs on the GitHub repos linked above.
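Before opening every parameter table for a database, it can help to see how large each one is. The following is a minimal sketch, not part of either package: `summarize_parameters` and its `fetch_parameters` argument are hypothetical names introduced here. It assumes only what is described above, namely that `imfp.imf_parameters` returns a dict with one table of valid values per parameter.

```python
def summarize_parameters(database_id, fetch_parameters=None):
    """Return {parameter name: number of valid codes} for one IMF database.

    fetch_parameters is a hypothetical hook (not part of imfp) that lets you
    pass in a stand-in for imfp.imf_parameters, e.g. when working offline.
    """
    if fetch_parameters is None:
        # Lazy import: the default path needs imfp installed and network access
        import imfp
        fetch_parameters = imfp.imf_parameters

    # Dict mapping each parameter name to its table of valid values
    params = fetch_parameters(database_id)
    return {name: len(codes) for name, codes in params.items()}
```

Calling `summarize_parameters('BOP')` with `imfp` installed should return one count per parameter, which gives a rough idea of which tables are short enough to read in full and which will need filtering.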
Example project
To illustrate how to request data from one of the more complex databases, we’ll take a look at the “Balance of Payments” database using `imfp` in Python:
import imfp
df = imfp.imf_databases()
df[df['description'].str.contains("Balance of Payments")]
We find that there are in fact a whole bunch of Balance of Payments databases, but for our purposes we’ll just take the main one, with `input_code` ‘BOP’.
Next, we’ll get the parameters for this database:
params = imfp.imf_parameters('BOP')
params.keys()
Running `params.keys()` tells us that the parameter names for this database are ‘freq’, ‘ref_area’, and ‘indicator’. For each key, the `params` dict contains a data frame of valid values. We can view each data frame with the `View` function we defined above:
View(params['freq'])
View(params['ref_area'])
View(params['indicator'])
In the `indicator` data frame we find over 6,000 rows. Let’s assume we’re only interested in rows that report “Totals”. That lets us narrow things down a bit:
View(params['indicator'][params['indicator']['description'].str.contains('Total')])
You’ll want to do this for each parameter, but let’s skip ahead and assume we’ve looked at the data frames and manually picked out the codes we want. We want to look at annual data for the US and Brazil, and we want the net capital account and net current account in US dollars.
(A side note for the uninitiated: the capital account is the net flow of investment transactions into an economy, and the current account is net income from imports and exports. Together, these are the two main components of the balance of payments.)
With parameter codes in hand, we craft our queries and make our database requests:
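# BK_BP6_USD is the net capital account and BCA_BP6_USD is the net current account, both in US dollars
cap_acc = imfp.imf_dataset(database_id='BOP', freq='A', indicator='BK_BP6_USD', ref_area=['US', 'BR'])
cur_acc = imfp.imf_dataset(database_id='BOP', freq='A', indicator='BCA_BP6_USD', ref_area=['US', 'BR'])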
We can get a quick glimpse of what the capital account data frame looks like with `cap_acc.head()`.
For our purposes, we are primarily interested in the `obs_value` and `time_period` columns. We’ll also need `ref_area` to distinguish between US data and Brazil data. Note that the `unit_mult` column tells us that our `obs_value`s are reported in millions. (The ‘6’ tells us to multiply by 1e6, or add 6 zeroes to our `obs_value`.)
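To make the `unit_mult` arithmetic concrete, here is a small dependency-free sketch. The helper name `to_billions` is hypothetical, and the sample observation is made up rather than taken from the IMF. It assumes both fields arrive as strings, as they do in the data frames returned by the package.

```python
def to_billions(obs_value, unit_mult):
    """Convert one observation to billions, given its unit_mult exponent."""
    # unit_mult is a power of ten: '6' means the value is reported in millions
    raw = float(obs_value) * 10 ** int(unit_mult)
    return raw / 1e9

# Made-up observation for illustration: 1234.5 reported with unit_mult '6'
print(to_billions("1234.5", "6"))
```

With a `unit_mult` of 6, this is the same as dividing the reported value by a thousand, so 1,234.5 million comes out as 1.2345 billion. Reading the exponent from the column rather than hard-coding it keeps the conversion correct if a series is reported in a different unit.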
All entries are received in string format, so for plotting we have to convert to numeric or datetime. We also divide by a thousand to convert from millions to billions.
cur_acc['time_period'] = cur_acc['time_period'].astype(int)
cur_acc['obs_value'] = cur_acc['obs_value'].astype(float)
cur_acc['obs_value'] = cur_acc['obs_value'] / 1e3
cap_acc['time_period'] = cap_acc['time_period'].astype(int)
cap_acc['obs_value'] = cap_acc['obs_value'].astype(float)
cap_acc['obs_value'] = cap_acc['obs_value'] / 1e3
And finally, we plot the capital account data using the `seaborn` package, with each `ref_area` represented by a different hue:
import seaborn as sns
sns.lineplot(data=cap_acc, x='time_period', y='obs_value', hue='ref_area')
This gives us the following lovely plot, showing the US’s rather large capital account deficit (i.e. net loss of investment) and the more modest deficit for Brazil.
Similarly, we plot the current account balance and find the US mostly running deficits, while Brazil recently ran a surplus.
Of course, for a serious analysis, you’d want to normalize to the size of a country’s economy for a side-by-side comparison like this.
As your homework assignment, you can download `imfp` or `imfr` and go find that statistic in the International Financial Statistics database, database_id ‘IFS’. (Or, to make things even more interesting, import the `fredapi` library and combine the IMF balance of payments data with the GDP series from FRED.)
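The normalization itself is simple once both series are in the same units. The sketch below uses made-up figures for two hypothetical countries, "AA" and "BB", not real IMF or FRED data, and a hypothetical helper named `percent_of_gdp`. It matches each balance to the GDP figure for the same country and year and expresses it as a percentage.

```python
# Made-up figures for illustration only, in billions of US dollars,
# keyed by (ref_area, year)
balance_bn = {("AA", 2020): -50.0, ("BB", 2020): -5.0}
gdp_bn = {("AA", 2020): 2000.0, ("BB", 2020): 100.0}

def percent_of_gdp(balance, gdp):
    """Return each balance as a percentage of the matching GDP figure."""
    shares = {}
    for key, value in balance.items():
        # Skip observations with no usable GDP figure for that country and year
        if key in gdp and gdp[key] != 0:
            shares[key] = 100.0 * value / gdp[key]
    return shares

shares = percent_of_gdp(balance_bn, gdp_bn)
for (area, year), pct in sorted(shares.items()):
    print(f"{area} {year}: {pct:.1f}% of GDP")
```

In this toy example, "AA" has the larger deficit in dollar terms but the smaller one relative to its economy (2.5% of GDP against 5.0% for "BB"), which is why the normalized comparison matters. With data frames, the equivalent step would be a merge on `ref_area` and `time_period` followed by a column division.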