Explore all the economic data from different providers (national and international statistical institutes, central banks, etc.), for free, following the link db.nomics.world
(N.B.: in the examples, data were retrieved on December 11th, 2019).
ids
First, let’s assume that we know which series we want to download. A series identifier (ids) is defined by three values, formatted like this: provider_code/dataset_code/series_code.
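As a minimal sketch of this structure, the base R snippet below splits an identifier into its three parts. The helper name split_series_id is ours and is not part of rdbnomics, and we assume that only the first two slashes act as separators.

```r
# Hypothetical helper (not part of rdbnomics): split a series identifier
# into provider_code, dataset_code and series_code.
split_series_id <- function(id) {
  parts <- strsplit(id, "/", fixed = TRUE)[[1]]
  if (length(parts) < 3) {
    stop("Expected 'provider_code/dataset_code/series_code', got: ", id)
  }
  list(
    provider_code = parts[1],
    dataset_code = parts[2],
    # Assumption: anything after the second slash belongs to the series code.
    series_code = paste(parts[-(1:2)], collapse = "/")
  )
}

id_parts <- split_series_id("AMECO/ZUTN/EA19.1.0.0.0.ZUTN")
str(id_parts)
```

The three elements correspond to the provider, the dataset and the series within that dataset.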
library(magrittr)
library(dplyr)
library(ggplot2)
library(rdbnomics)
df <- rdb(ids = "AMECO/ZUTN/EA19.1.0.0.0.ZUTN") %>%
filter(!is.na(value))
In such a data.frame (data.table or tibble), you will always find at least ten columns:
provider_code
dataset_code
dataset_name
series_code
series_name
original_period (character string)
period (date of the first day of original_period)
original_value (character string)
value
@frequency (harmonized frequency generated by DBnomics)
The other columns depend on the provider and on the dataset. They always come in pairs (for the code and the name). In the data.frame df, you have:
unit (code) and Unit (name)
geo (code) and Country (name)
freq (code) and Frequency (name)
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
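If you want to check that a result contains the ten columns listed above, a small base R test may help. This is only a sketch: the helper missing_core_columns is hypothetical, and the toy data.frame holds dummy values so that the snippet runs without an internet connection.

```r
# The ten columns the vignette says are always present.
core_columns <- c(
  "provider_code", "dataset_code", "dataset_name", "series_code",
  "series_name", "original_period", "period", "original_value",
  "value", "@frequency"
)

# Hypothetical helper: names of the core columns absent from a data.frame.
missing_core_columns <- function(x) setdiff(core_columns, names(x))

# Toy data.frame standing in for a result of rdb(); the values are dummies.
toy <- data.frame(
  period = as.Date("2019-01-01"),
  value = 1,
  `@frequency` = "annual",
  check.names = FALSE
)
missing_cols <- missing_core_columns(toy)
print(missing_cols)
```

On a real result of rdb(), missing_core_columns(df) would be expected to return an empty character vector.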
If ids is the only argument you use, you can drop its name and run:
df <- rdb("AMECO/ZUTN/EA19.1.0.0.0.ZUTN")
df <- rdb(ids = c("AMECO/ZUTN/EA19.1.0.0.0.ZUTN", "AMECO/ZUTN/DNK.1.0.0.0.ZUTN")) %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
df <- rdb(ids = c("AMECO/ZUTN/EA19.1.0.0.0.ZUTN", "Eurostat/une_rt_q/Q.SA.TOTAL.PC_ACT.T.EA19")) %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics(legend.text = element_text(size = 7))
mask
The code mask notation is a very concise way to select one or many time series at once.
df <- rdb("IMF", "BOP", mask = "A.FR.BCA_BP6_EUR") %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_step(size = 1.2) +
geom_point(size = 2) +
dbnomics()
In the event that you only use the arguments provider_code, dataset_code and mask, you can drop the name mask and run:
df <- rdb("IMF", "BOP", "A.FR.BCA_BP6_EUR")
You just have to add a + between two different values of a dimension.
df <- rdb("IMF", "BOP", mask = "A.FR+ES.BCA_BP6_EUR") %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_step(size = 1.2) +
geom_point(size = 2) +
dbnomics()
df <- rdb("IMF", "BOP", mask = "A..BCA_BP6_EUR") %>%
filter(!is.na(value)) %>%
arrange(desc(period), REF_AREA) %>%
head(100)
df <- rdb("IMF", "BOP", mask = "A.FR.BCA_BP6_EUR+IA_BP6_EUR") %>%
filter(!is.na(value)) %>%
group_by(INDICATOR) %>%
top_n(n = 50, wt = period)
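The masks used above can also be assembled programmatically. The sketch below joins the values of one dimension with + and the dimensions with a dot. The helper make_mask is hypothetical (it is not an rdbnomics function), and we assume that a blank position, as in "A..BCA_BP6_EUR", selects every value of that dimension.

```r
# Hypothetical helper: build a mask from one character vector per dimension,
# given in the order of the dataset's dimensions.
make_mask <- function(...) {
  dims <- list(...)
  # Values of one dimension are joined with "+", dimensions with ".".
  joined <- vapply(dims, function(v) paste(v, collapse = "+"), character(1))
  paste(joined, collapse = ".")
}

mask_two_countries <- make_mask("A", c("FR", "ES"), "BCA_BP6_EUR")
# An empty vector leaves the position blank, as in "A..BCA_BP6_EUR".
mask_blank_area <- make_mask("A", character(0), "BCA_BP6_EUR")
print(c(mask_two_countries, mask_blank_area))
```

The resulting strings can be passed as the mask argument, for example rdb("IMF", "BOP", mask = mask_two_countries).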
dimensions
Searching by dimensions is a less concise way to select time series than using the code mask, but it works with all the different providers. You have a “Description of series code” at the bottom of each dataset page on the DBnomics website.
df <- rdb("AMECO", "ZUTN", dimensions = list(geo = "ea19")) %>%
filter(!is.na(value))
# or
# df <- rdb("AMECO", "ZUTN", dimensions = '{"geo": ["ea19"]}') %>%
# filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
df <- rdb("AMECO", "ZUTN", dimensions = list(geo = c("ea19", "dnk"))) %>%
filter(!is.na(value))
# or
# df <- rdb("AMECO", "ZUTN", dimensions = '{"geo": ["ea19", "dnk"]}') %>%
# filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
df <- rdb("WB", "DB", dimensions = list(country = c("DZ", "PE"), indicator = c("ENF.CONT.COEN.COST.ZS", "IC.REG.COST.PC.FE.ZS"))) %>%
filter(!is.na(value))
# or
# df <- rdb("WB", "DB", dimensions = '{"country": ["DZ", "PE"], "indicator": ["ENF.CONT.COEN.COST.ZS", "IC.REG.COST.PC.FE.ZS"]}') %>%
# filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
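The commented alternatives above pass dimensions as a JSON string instead of a list. To see how the two forms relate, the sketch below converts a list with jsonlite::toJSON(). It assumes the jsonlite package is installed, and it only illustrates the correspondence; we do not claim that rdb() performs this exact conversion internally.

```r
library(jsonlite)  # assumed to be installed

# The list form used in the examples above.
dims <- list(geo = c("ea19", "dnk"))

# Its JSON counterpart, as a plain character string.
dims_json <- as.character(toJSON(dims))
print(dims_json)
```

Either form can then be given to the dimensions argument.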
query
The query is a Google-like search that will filter/select time series from a provider’s dataset.
df <- rdb("IMF", "WEO", query = "France current account balance percent") %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
df <- rdb("IMF", "WEO", query = "current account balance percent") %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = `WEO Country`)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
ggtitle("Current account balance (% GDP)") +
dbnomics(legend.direction = "horizontal")
When you don’t know the codes of the dimensions, provider, dataset or series, you can:
go to the page of a dataset on the DBnomics website, for example Doing Business,
select some dimensions by using the input widgets of the left column,
click on “Copy API link” in the menu of the “Download” button,
use the rdb(api_link = ...) function as shown below.
df <- rdb(api_link = "https://api.db.nomics.world/v22/series/WB/DB?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D&q=IC.REG.PROC.FE.NO&observations=1&format=json&align_periods=1&offset=0&facets=0") %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_step(size = 1.2) +
geom_point(size = 2) +
dbnomics()
If api_link is the only argument you use, you can drop its name and run:
df <- rdb("https://api.db.nomics.world/v22/series/WB/DB?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D&q=IC.REG.PROC.FE.NO&observations=1&format=json&align_periods=1&offset=0&facets=0")
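An API link is hard to read because its parameters are percent-encoded. To inspect what a copied link actually requests, you can decode its query string with base R, as in the sketch below. The variable names are ours.

```r
api_link <- "https://api.db.nomics.world/v22/series/WB/DB?dimensions=%7B%22country%22%3A%5B%22FR%22%2C%22IT%22%2C%22ES%22%5D%7D&q=IC.REG.PROC.FE.NO&observations=1&format=json&align_periods=1&offset=0&facets=0"

# Keep the part after "?" and split it into "name=value" pairs.
query_string <- sub("^[^?]*\\?", "", api_link)
pairs <- strsplit(
  strsplit(query_string, "&", fixed = TRUE)[[1]],
  "=", fixed = TRUE
)

# Decode each percent-encoded value and name it after its parameter.
params <- vapply(pairs, function(p) utils::URLdecode(p[2]), character(1))
names(params) <- vapply(pairs, function(p) p[1], character(1))
print(params[c("dimensions", "q")])
```

The decoded dimensions value has the same JSON shape as the dimensions argument of rdb().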
To fetch the series in your cart, copy the cart’s API link and pass it to the rdb(api_link = ...) function. Please note that when you update your cart, you have to copy this link again, because the link itself contains the ids of the series in the cart.
df <- rdb(api_link = "https://api.db.nomics.world/v22/series?observations=1&series_ids=BOE/6008/RPMTDDC,BOE/6231/RPMTBVE") %>%
filter(!is.na(value))
ggplot(df, aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
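Since the cart link contains the ids of the series, they can be read back from it. The base R sketch below extracts them from the series_ids parameter. We have not verified that passing the result to rdb(ids = ...) is strictly equivalent to using the link, so treat that as an assumption.

```r
cart_link <- "https://api.db.nomics.world/v22/series?observations=1&series_ids=BOE/6008/RPMTDDC,BOE/6231/RPMTBVE"

# Capture the value of the series_ids parameter, then split it on commas.
ids_value <- sub(".*[?&]series_ids=([^&]*).*", "\\1", cart_link)
cart_ids <- strsplit(ids_value, ",", fixed = TRUE)[[1]]
print(cart_ids)
```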
Could not resolve host
When using the function rdb, you may come across the following error:
Error in open.connection(con, "rb") :
Could not resolve host: api.db.nomics.world
To get round this situation, you have two options:
configure curl to use a specific and authorized proxy.
use the default R internet connection i.e. the Internet Explorer proxy defined in internet2.dll.
To retrieve the data with the default R internet connection, rdbnomics will use the base function readLines.
To activate this feature for a session, you need to enable an option of the package:
options(rdbnomics.use_readLines = TRUE)
And then use the standard function as follows:
df1 <- rdb(ids = "AMECO/ZUTN/EA19.1.0.0.0.ZUTN")
This configuration can be disabled with:
options(rdbnomics.use_readLines = FALSE)
If you just want to do it once, you may use the argument use_readLines of the function rdb:
df1 <- rdb(ids = "AMECO/ZUTN/EA19.1.0.0.0.ZUTN", use_readLines = TRUE)
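If you do not know in advance whether the error will occur, one possible pattern is to try the normal call first and retry with use_readLines = TRUE only on failure. The sketch below is an assumption on our part, not a feature of rdbnomics: fetch_with_fallback is a hypothetical helper, and fake_rdb is a stand-in for rdb() so that the example runs offline.

```r
# Hypothetical helper: call a fetch function as-is, and retry with
# use_readLines = TRUE if the first attempt raises an error.
fetch_with_fallback <- function(fetch, ...) {
  args <- list(...)
  tryCatch(
    do.call(fetch, args),
    error = function(e) {
      message("First attempt failed: ", conditionMessage(e))
      do.call(fetch, c(args, list(use_readLines = TRUE)))
    }
  )
}

# Stand-in for rdb(): it fails unless use_readLines is TRUE,
# mimicking the "Could not resolve host" situation.
fake_rdb <- function(ids, use_readLines = FALSE) {
  if (!use_readLines) stop("Could not resolve host: api.db.nomics.world")
  paste("fetched", ids, "with readLines")
}

result <- fetch_with_fallback(fake_rdb, ids = "AMECO/ZUTN/EA19.1.0.0.0.ZUTN")
print(result)
```

With the real package, the call would look like fetch_with_fallback(rdbnomics::rdb, ids = "AMECO/ZUTN/EA19.1.0.0.0.ZUTN").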
The rdbnomics package can interact with the Time Series Editor of DBnomics to transform time series by applying filters to them.
Available filters are listed on the filters page https://editor.nomics.world/filters.
Here is how to interpolate two annual time series to a monthly frequency using spline interpolation:
filters <- list(
code = "interpolate",
parameters = list(frequency = "monthly", method = "spline")
)
The request is then:
df <- rdb(
ids = c("AMECO/ZUTN/EA19.1.0.0.0.ZUTN", "AMECO/ZUTN/DNK.1.0.0.0.ZUTN"),
filters = filters
)
If you want to apply more than one filter, the filters argument will be a list of valid filters:
filters <- list(
list(
code = "interpolate",
parameters = list(frequency = "monthly", method = "spline")
),
list(
code = "aggregate",
parameters = list(frequency = "bi-annual", method = "end_of_period")
)
)
df <- rdb(
ids = c("AMECO/ZUTN/EA19.1.0.0.0.ZUTN", "AMECO/ZUTN/DNK.1.0.0.0.ZUTN"),
filters = filters
)
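Both forms of the filters argument rely on the same building block: a list with a code and a list of parameters. A small constructor can make this structure explicit. The helper make_filter below is hypothetical and only reproduces the lists written by hand above.

```r
# Hypothetical helper: build one filter as list(code, parameters).
make_filter <- function(code, ...) {
  list(code = code, parameters = list(...))
}

# Same two filters as in the example above.
filters <- list(
  make_filter("interpolate", frequency = "monthly", method = "spline"),
  make_filter("aggregate", frequency = "bi-annual", method = "end_of_period")
)
str(filters)
```

For a single filter, make_filter("interpolate", frequency = "monthly", method = "spline") can be passed directly as the filters argument, as in the first example.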
The data.frame (data.table or tibble) columns change a little bit when filters are used. There are two new columns:
period_middle_day: the middle day of original_period (can be useful when you graphically compare interpolated series with the original ones).
filtered (boolean): TRUE if the series is filtered, FALSE otherwise.
The contents of two columns are modified:
series_code: same as before for original series, but the suffix _filtered is added for filtered series.
series_name: same as before for original series, but the suffix (filtered) is added for filtered series.
ggplot(filter(df, !is.na(value)), aes(x = period, y = value, color = series_name)) +
geom_line(size = 1.2) +
geom_point(size = 2) +
dbnomics()
dbnomics() used in the vignette
We show the function dbnomics() for information.
dbnomics <- function(color_palette = "Set1", ...) {
  # Check if ggplot2 is installed.
  ggplot2_ok <- try(utils::packageVersion("ggplot2"), silent = TRUE)
  if (inherits(ggplot2_ok, "try-error")) {
    stop(
      "Please run install.packages('ggplot2') to use dbnomics().",
      call. = FALSE
    )
  }
  # DBnomics vignette theme
  result <- list(
    ggplot2::scale_x_date(expand = c(0, 0)),
    ggplot2::scale_y_continuous(
      labels = function(x) { format(x, big.mark = " ") }
    ),
    ggplot2::xlab(""),
    ggplot2::ylab(""),
    ggplot2::theme_bw(),
    ggplot2::theme(
      legend.position = "bottom", legend.direction = "vertical",
      legend.background = ggplot2::element_rect(
        fill = "transparent", colour = NA
      ),
      legend.key = ggplot2::element_blank(),
      panel.background = ggplot2::element_rect(
        fill = "transparent", colour = NA
      ),
      plot.background = ggplot2::element_rect(
        fill = "transparent", colour = NA
      ),
      legend.title = ggplot2::element_blank()
    ),
    ggplot2::theme(...),
    ggplot2::annotate(
      geom = "text", label = "DBnomics <https://db.nomics.world>",
      x = structure(Inf, class = "Date"), y = -Inf,
      hjust = 1.1, vjust = -0.4, col = "grey",
      fontface = "italic"
    )
  )
  if (!is.null(color_palette)) {
    result <- c(
      result,
      list(ggplot2::scale_color_brewer(palette = color_palette))
    )
  }
  result
}
Banque de France, https://github.com/s915
CEPREMAP