| Title: | Data Science Infrastructure for Global Health |
|---|---|
| Description: | Supports global health data analysis, including a publication-ready 'ggplot2' theme, a 'flextable' defaults helper, a thin pie chart wrapper, built-in regional country-code datasets with a WHO region lookup helper, a geometric mean function for indicator aggregation, and convenience clients for the World Health Organization Global Health Observatory (GHO) OData API <https://ghoapi.azureedge.net/api/> and the United Nations Sustainable Development Goals (SDG) API <https://unstats.un.org/SDGAPI/swagger/>. |
| Authors: | Shanlong Ding [aut, cre] (ORCID: <https://orcid.org/0000-0001-9048-6670>) |
| Maintainer: | Shanlong Ding <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.7.1 |
| Built: | 2026-05-23 13:23:01 UTC |
| Source: | https://github.com/shanlong-who/dsir |
Combines two or more tibbles produced by gho_clean() or
sdg_clean() into a single tibble. Because both cleaners output the
same 15-column schema, the result is a uniform table that can be
filtered, joined, or visualised without source-specific code paths;
use the source column to tell GHO rows apart from SDG rows.
bind_indicators(...)
... |
Two or more tibbles returned by gho_clean() or sdg_clean(). |
Inputs do not need to be in any particular order. NULL inputs are
silently dropped, which makes it ergonomic to write code like
bind_indicators(maybe_gho, maybe_sdg) where some sources may not
have been fetched.
A single tibble with the unified cleaned-indicator schema
(15 columns). Row order is c(input_1, input_2, ...), preserving within-input order.
gho <- gho_data("NCDMORT3070", area = wpro_cty) |> gho_clean()
sdg <- sdg_data("3.4.1", area = wpro_cty) |> sdg_clean()
bind_indicators(gho, sdg)
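As a rough illustration of the NULL-dropping behaviour described above, the base R sketch below discards NULL inputs and stacks the rest in input order. The helper `bind_sketch()` and the three-column toy tables are hypothetical stand-ins; this is not the package's implementation, which works on the full 15-column schema and may do additional validation.

```r
# Hypothetical helper: drop NULL inputs, then row-bind what is left.
bind_sketch <- function(...) {
  inputs <- Filter(Negate(is.null), list(...))
  do.call(rbind, inputs)
}

# Toy tables sharing one schema (only three columns, for brevity).
gho <- data.frame(source = "gho", id = "NCDMORT3070", year = 2019L,
                  stringsAsFactors = FALSE)
sdg <- data.frame(source = "sdg", id = "3.4.1", year = 2019L,
                  stringsAsFactors = FALSE)

# The NULL in the middle is silently ignored.
combined <- bind_sketch(gho, NULL, sdg)
combined$source  # "gho" "sdg"
```

The `source` column is what lets later code tell the two origins apart after binding.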
Applies a consistent set of flextable formatting defaults for
publication-ready tables (booktabs theme, bold headers, modest
padding). Pick any font you like — the default "" leaves the
flextable default in place so the call is safe on systems where
Cambria is not installed.
dsi_flextable_defaults(
  font_size = 12,
  font_family = "",
  font_color = "#333333",
  border_color = "black",
  padding = c(3, 3, 4, 4)
)
font_size |
Font size in points. Default 12. |
font_family |
Font family name. Default "". |
font_color |
Font color. Default "#333333". |
border_color |
Border color. Default "black". |
padding |
Numeric vector of length 1 (applied to all sides)
or length 4 ( |
Invisibly returns NULL. Called for its side effect of
mutating the flextable global defaults via
flextable::set_flextable_defaults().
dsi_flextable_defaults()
Computes the geometric mean of a numeric vector, with optional weights. Useful for aggregating multiplicative quantities such as ratio-based health indicators (e.g. UHC service-coverage tracers, where the composite index is the geometric mean of component coverage values).
geomean(x, w = NULL, na.rm = TRUE)
x |
A numeric vector. Zeros produce a result of 0. |
w |
Optional numeric vector of weights, the same length as x. |
na.rm |
Logical. Should missing values in |
Pass w to compute a weighted geometric mean, defined as
exp(weighted.mean(log(x), w)).
A numeric scalar. Returns NA_real_ when the input is
empty, when it is entirely NA, or when na.rm = FALSE and
any element is NA. Returns NaN with a warning when x
contains negative values, or when all weights are zero.
# Unweighted
geomean(c(1, 4, 16))                   # 4
geomean(c(0.6, 0.8, 0.95))             # ~0.770, typical UHC tracer aggregation
geomean(c(1, NA, 4))                   # 2
geomean(c(1, NA, 4), na.rm = FALSE)    # NA_real_
geomean(c(1, 0, 4))                    # 0
# Weighted
geomean(c(1, 4, 16), w = c(1, 1, 1))   # 4 (equal weights = unweighted)
geomean(c(1, 4, 16), w = c(1, 2, 1))   # weighted toward 4
geomean(c(0.6, 0.8, 0.95), w = c(2, 1, 1))
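The definition quoted above, exp(weighted.mean(log(x), w)), can be tried out in base R. The function `geomean_sketch()` below is a hypothetical re-implementation written for illustration only; it follows the documented return values for empty, NA, and zero inputs but does not claim to reproduce the package's warnings or argument checks.

```r
# Hypothetical sketch of a (weighted) geometric mean in base R.
geomean_sketch <- function(x, w = NULL, na.rm = TRUE) {
  if (is.null(w)) w <- rep(1, length(x))   # equal weights = unweighted
  stopifnot(length(w) == length(x))
  if (na.rm) {
    keep <- !is.na(x)
    x <- x[keep]
    w <- w[keep]
  }
  # Empty input, or NA left in place with na.rm = FALSE
  if (length(x) == 0L || anyNA(x)) return(NA_real_)
  exp(weighted.mean(log(x), w))
}

geomean_sketch(c(1, 4, 16))                  # 4
geomean_sketch(c(1, NA, 4))                  # 2
geomean_sketch(c(1, NA, 4), na.rm = FALSE)   # NA
geomean_sketch(c(1, 0, 4))                   # 0, because log(0) is -Inf
```

Working on the log scale is what makes a single zero pull the whole result to zero, which is worth keeping in mind when aggregating coverage values.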
Builds a pie chart from a data frame using one categorical column and one numeric column. Slices are labeled with the category name and percentage share.
ggpie(
  df,
  .x,
  .y,
  .offset = 1,
  .color = "white",
  .legend = FALSE,
  .label = TRUE,
  .label_size = 3.5
)
df |
A data frame. |
.x |
Column name (string) of the categorical variable used for the slices. |
.y |
Column name (string) of the numeric variable used for the slice values. |
.offset |
Numeric scalar (> 0). Controls label position
along the slice radius. Default 1. |
.color |
Border color between slices. Default "white". |
.legend |
Logical. Show the legend? Default FALSE. |
.label |
Logical. Draw |
.label_size |
Label text size in mm. Default 3.5. |
A ggplot object.
df <- data.frame(
  category = c("A", "B", "C"),
  value = c(40, 35, 25)
)
ggpie(df, "category", "value")
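To make the "category name and percentage share" labelling concrete, the base R sketch below computes each slice's share of the total and builds a label from it. The exact label layout used by ggpie() is not documented here, so the "name (share%)" format is an assumption for illustration.

```r
df <- data.frame(
  category = c("A", "B", "C"),
  value = c(40, 35, 25),
  stringsAsFactors = FALSE
)

# Share of each slice as a percentage of the column total.
share <- df$value / sum(df$value) * 100

# Assumed label layout: category name followed by the rounded share.
labels <- sprintf("%s (%.0f%%)", df$category, share)
labels  # "A (40%)" "B (35%)" "C (25%)"
```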
Selects, renames, and type-casts the most useful columns from a GHO
observation table returned by gho_data(), producing a compact
tibble in the unified DSIR cleaned-indicator schema — the same
schema produced by sdg_clean(), so the two outputs can be combined
directly with bind_indicators().
gho_clean(df)
df |
A data frame returned by gho_data(). |
The mapping (GHO source → unified column) is:
IndicatorCode → id
IndicatorCode resolved against the GHO indicator catalog →
indicator (the human-readable name; cached at session level
after the first call)
SpatialDim → location; also iso3 when it matches a WHO
Member State, otherwise iso3 = NA
TimeDim → year (integer)
Value → value (character; raw)
NumericValue → value_num (numeric)
Low, High → low, high (numeric)
Dim1, Dim2, Dim3 → dim1, dim2, dim3 (character)
The series column is always NA for GHO output (it is an SDG-only
concept). The location_name column is populated by looking up
location (an ISO3 code or a WHO region code) against the
who_countries dataset and a hardcoded set of WHO regional names;
locations that match neither (e.g. non-Member State areas) are left
as NA.
Source columns absent from df (e.g. Low / High for indicators
without confidence intervals) are filled with typed NA, so the
output always has the same 15 columns with the same column types.
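The typed-NA fill can be pictured with a small base R sketch. The helper `fill_missing()`, the five-column schema, and the toy values below are all hypothetical; the real cleaner works on the full 15-column schema and its internals may differ.

```r
# Hypothetical helper: add any schema column absent from df as a typed NA,
# then return the columns in schema order.
fill_missing <- function(df, schema) {
  for (col in names(schema)) {
    if (!col %in% names(df)) {
      df[[col]] <- rep(schema[[col]], nrow(df))
    }
  }
  df[names(schema)]
}

# Reduced schema: one typed NA per column fixes the column type.
schema <- list(
  location = NA_character_,
  year = NA_integer_,
  value_num = NA_real_,
  low = NA_real_,
  high = NA_real_
)

# Toy input without confidence bounds (values are made up).
obs <- data.frame(
  location = c("FRA", "DEU"),
  year = c(2019L, 2019L),
  value_num = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

cleaned <- fill_missing(obs, schema)
class(cleaned$low)  # "numeric", even though every value is NA
```

Using `NA_real_` rather than a bare `NA` is what keeps the column numeric, so tables cleaned from different indicators can be stacked without type conflicts.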
The GHO data endpoint (/api/{IndicatorCode}) does not return
IndicatorName; that field lives on the catalog endpoint queried by
gho_indicators(). On the first call within an R session,
gho_clean() fetches the catalog once and caches it for the rest of
the session, so the indicator column carries the full
human-readable indicator name. If the catalog cannot be fetched
(e.g. no network), gho_indicators() emits a warning and the
indicator column falls back to NA.
A tibble with 15 columns: source (always
"gho"), id, indicator, location, iso3, location_name,
year, value, value_num, low, high, series (NA),
dim1, dim2, dim3. Sorted by location then year.
Empty input returns an empty tibble with the same columns and
types.
gho_data(), sdg_clean(), bind_indicators().
gho_data("NCDMORT3070", spatial_type = "country") |> gho_clean()
Sends a $top=0&$count=true request to the WHO GHO OData API,
which returns the matching row count without transferring any
observations. Useful for sizing a download before issuing it.
gho_count(
  indicator,
  spatial_type = NULL,
  area = NULL,
  year_from = NULL,
  year_to = NULL
)
indicator |
Character scalar. The indicator code
(e.g. |
spatial_type |
Character. Spatial dimension to filter on:
one of |
area |
Character vector of country or region codes
(e.g. |
year_from |
Numeric. Start year filter (inclusive).
Default NULL. |
year_to |
Numeric. End year filter (inclusive).
Default NULL. |
An integer scalar — the number of observations the
server would return for the same filter via gho_data().
Returns NA_integer_ (with a warning) if the request fails.
gho_data(), gho_has_data(), gho_coverage().
# How many rows would gho_data() pull for France?
gho_count("WHOSIS_000001", area = "FRA")
# Compare coverage across regions
gho_count("NCDMORT3070", spatial_type = "country")
gho_count("NCDMORT3070", spatial_type = "region")
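For readers curious what a count-only request might look like on the wire, the sketch below assembles a URL with `$top=0&$count=true` in base R. The helper `gho_count_url()` is hypothetical, and the `$filter` clauses (`SpatialDim eq '...'`, `TimeDim ge ...`) are an assumption about how the filters are expressed; the package may build its query differently.

```r
# Hypothetical URL builder for a count-only GHO OData request.
gho_count_url <- function(indicator, area = NULL, year_from = NULL) {
  base <- "https://ghoapi.azureedge.net/api/"
  clauses <- character(0)
  if (!is.null(area)) {
    # Assumed filter syntax: one equality test per area, joined with "or".
    quoted <- sprintf("SpatialDim eq '%s'", area)
    clauses <- c(clauses, paste0("(", paste(quoted, collapse = " or "), ")"))
  }
  if (!is.null(year_from)) {
    clauses <- c(clauses, sprintf("TimeDim ge %d", as.integer(year_from)))
  }
  query <- "$top=0&$count=true"
  if (length(clauses) > 0) {
    # Percent-encode spaces, quotes and parentheses in the filter text.
    filter <- utils::URLencode(paste(clauses, collapse = " and "),
                               reserved = TRUE)
    query <- paste0(query, "&$filter=", filter)
  }
  paste0(base, indicator, "?", query)
}

gho_count_url("WHOSIS_000001")
# "https://ghoapi.azureedge.net/api/WHOSIS_000001?$top=0&$count=true"
gho_count_url("WHOSIS_000001", area = "FRA", year_from = 2015)
```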
Fetches only the SpatialDim and TimeDim columns for a GHO
indicator (much lighter than gho_data()) and summarises the
year range and observation count per location. Useful for
answering "which countries have data, and for what years?"
before committing to a full download.
gho_coverage(
  indicator,
  spatial_type = "country",
  area = NULL,
  year_from = NULL,
  year_to = NULL
)
indicator |
Character scalar. The indicator code
(e.g. |
spatial_type |
Character. Spatial dimension to filter on:
one of |
area |
Character vector of country or region codes
(e.g. |
year_from |
Numeric. Start year filter (inclusive).
Default NULL. |
year_to |
Numeric. End year filter (inclusive).
Default NULL. |
A tibble with one row per location and columns:
location (chr) — the SpatialDim value (typically ISO3).
year_min (int) — earliest year with data.
year_max (int) — latest year with data.
n_obs (int) — number of observations.
Sorted by location. Empty input or service failure returns
an empty tibble with the same four columns.
gho_data(), gho_has_data(), gho_count().
# Year coverage of life expectancy for three countries
gho_coverage("WHOSIS_000001", area = c("FRA", "DEU", "JPN"))
# All countries with any life-expectancy data, since 2010
gho_coverage("WHOSIS_000001", year_from = 2010)
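The per-location summary described above (earliest year, latest year, observation count) can be reproduced on a toy table in base R. `coverage_sketch()` and the sample rows are hypothetical and only illustrate the aggregation step, not the package's request logic.

```r
# Toy observations holding just the two columns the summary needs.
obs <- data.frame(
  SpatialDim = c("FRA", "FRA", "DEU", "FRA", "DEU"),
  TimeDim = c(2015L, 2019L, 2010L, 2000L, 2012L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per location, sorted by location.
coverage_sketch <- function(obs) {
  by_loc <- split(obs$TimeDim, obs$SpatialDim)
  out <- data.frame(
    location = names(by_loc),
    year_min = unname(vapply(by_loc, min, integer(1))),
    year_max = unname(vapply(by_loc, max, integer(1))),
    n_obs = unname(lengths(by_loc)),
    stringsAsFactors = FALSE
  )
  out[order(out$location), ]
}

cov <- coverage_sketch(obs)
cov$location  # "DEU" "FRA"
cov$n_obs     # 2 3
```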
Retrieves observations for a specific indicator from the WHO GHO OData API, with optional filters by spatial level, country / region and year range.
gho_data(
  indicator,
  spatial_type = NULL,
  area = NULL,
  year_from = NULL,
  year_to = NULL
)
indicator |
Character scalar. The indicator code
(e.g. |
spatial_type |
Character. Spatial dimension to filter on:
one of |
area |
Character vector of country or region codes
(e.g. |
year_from |
Numeric. Start year filter (inclusive).
Default NULL. |
year_to |
Numeric. End year filter (inclusive).
Default NULL. |
A tibble of indicator observations, or an empty tibble when the service is unreachable.
gho_indicators(), gho_dimensions().
# Country-level data for one indicator
gho_data("NCDMORT3070", spatial_type = "country")
# Specific countries and years
gho_data("WHOSIS_000001", area = c("FRA", "DEU"), year_from = 2015)
Returns the unique values of a given dimension across all
observations of a GHO indicator. Useful for discovering which
ages, sexes, regions, or other breakdowns are available before
calling gho_data().
gho_dimensions(indicator, dimension = "SpatialDimType")
indicator |
Character scalar. The indicator code
(e.g. |
dimension |
Character. Name of the dimension column in the
indicator data. Common values include |
A character vector of unique, sorted dimension values, or an empty character vector when the service is unreachable or the dimension is missing.
gho_dimensions("NCDMORT3070")
gho_dimensions("NCDMORT3070", dimension = "Dim1")
Sends a minimal request ($top=1&$select=Id) to the WHO GHO OData
API to find out whether any observations exist for the given
indicator and filter combination, without downloading the full
result set. Useful as a quick precheck before gho_data().
gho_has_data(
  indicator,
  spatial_type = NULL,
  area = NULL,
  year_from = NULL,
  year_to = NULL
)
indicator |
Character scalar. The indicator code
(e.g. |
spatial_type |
Character. Spatial dimension to filter on:
one of |
area |
Character vector of country or region codes
(e.g. |
year_from |
Numeric. Start year filter (inclusive).
Default NULL. |
year_to |
Numeric. End year filter (inclusive).
Default NULL. |
A logical scalar:
TRUE if at least one observation exists for the filter.
FALSE if the server returns an empty result.
NA if the request fails (network failure, unreachable host,
or the indicator code does not exist and the server returns an
HTTP error). A warning is emitted in the failure case.
gho_data(), gho_count(), gho_coverage().
# Does WHO have life-expectancy data for France?
gho_has_data("WHOSIS_000001", area = "FRA")
# Quickly screen a list of indicators before downloading any data
inds <- c("WHOSIS_000001", "NCDMORT3070")
vapply(inds, gho_has_data, logical(1), area = "FRA")
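The three-way return value (TRUE, FALSE, or NA with a warning) is a common pattern for network prechecks. The sketch below shows one way to get it with tryCatch(); `has_data_sketch()` and its `fetch` argument are hypothetical, with `fetch` standing in for whatever function performs the actual request.

```r
# Hypothetical sketch: `fetch` is any zero-argument function that returns
# the rows from a request, or signals an error when the request fails.
has_data_sketch <- function(fetch) {
  rows <- tryCatch(
    fetch(),
    error = function(e) {
      warning("request failed: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
  if (is.null(rows)) return(NA)   # failure: neither TRUE nor FALSE
  length(rows) > 0
}

has_data_sketch(function() list(1))   # TRUE
has_data_sketch(function() list())    # FALSE
```

Returning NA on failure keeps "no data" and "could not check" distinguishable, which matters when screening many indicators in a loop.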
Fetches the catalog of indicators from the WHO Global Health Observatory (GHO) OData API.
gho_indicators(search = NULL)
search |
Optional character. Search keywords matched against
Single quotes in any term are escaped for the OData filter. |
A tibble with columns IndicatorCode,
IndicatorName and Language. Returns an empty tibble (with
a message) when the service is unreachable.
# All indicators
inds <- gho_indicators()
# Single keyword
gho_indicators("mortality")
# Multiple keywords from one string (AND): both terms must appear
gho_indicators("child mortality")
# Or pass terms as a vector
gho_indicators(c("child", "mortality"))
Maps ISO 3166-1 alpha-3 country codes to UN M49 numeric area codes
using the who_countries dataset shipped with DSIR. Useful when
moving from data sources keyed by ISO3 (e.g. the WHO GHO API) to
sources keyed by M49 (e.g. the UN SDG API).
iso3_to_m49(iso3)
iso3 |
Character vector of ISO3 codes. Case-insensitive; values are upper-cased before lookup. |
Codes that do not correspond to a WHO Member State return NA.
This includes Associate Members (e.g. Puerto Rico) and other
non-Member areas that some indicator data sets cover.
Most users will not need to call this function directly:
sdg_data() and sdg_coverage() accept ISO3 codes for their
area argument and convert internally. This helper is exported
for cases where you want to inspect or manipulate the conversion
yourself.
A character vector the same length as iso3, with M49
codes in the same format as who_countries$m49_code
(three-character zero-padded strings, e.g. "076"). Non-Member areas
return NA.
who_countries, iso3_to_region(), sdg_data().
iso3_to_m49(c("PHL", "FRA", "JPN"))
# "608" "250" "392"
# Case-insensitive
iso3_to_m49("phl")
# "608"
# Non-Member areas return NA
iso3_to_m49(c("PRI", "PHL"))
# NA "608"
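The lookup behaviour above (upper-casing, then NA for anything not in the table) is what match() gives in base R. The four-row `lookup` table and `iso3_to_m49_sketch()` below are hypothetical stand-ins for the who_countries dataset and the exported function, using only codes that appear in this manual.

```r
# Tiny stand-in for who_countries: ISO3 and zero-padded M49 as character.
lookup <- data.frame(
  iso3 = c("BRA", "FRA", "JPN", "PHL"),
  m49_code = c("076", "250", "392", "608"),
  stringsAsFactors = FALSE
)

# Hypothetical sketch: upper-case the input, then look it up by position.
iso3_to_m49_sketch <- function(iso3) {
  lookup$m49_code[match(toupper(iso3), lookup$iso3)]
}

iso3_to_m49_sketch(c("PHL", "FRA", "JPN"))  # "608" "250" "392"
iso3_to_m49_sketch("phl")                   # "608"
iso3_to_m49_sketch(c("PRI", "PHL"))         # NA "608"
```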
Maps ISO 3166-1 alpha-3 country codes to WHO region codes using
the who_countries dataset shipped with DSIR. Stays in sync with
WHO governance changes reflected in DSIR — for example, Indonesia's
reassignment from SEAR to WPR following EB156 (May 2025).
iso3_to_region(iso3, long = FALSE)
iso3 |
Character vector of ISO3 codes. Case-sensitive
(uppercase, as in |
long |
Logical. If |
Codes that do not correspond to a WHO Member State return NA.
This includes Associate Members (Puerto Rico, Tokelau) and other
non-Member areas that some indicator data sets cover.
A character vector the same length as iso3.
iso3_to_region(c("PHL", "FRA", "USA", "COK"))
# "WPR" "EUR" "AMR" "WPR"
iso3_to_region(c("IDN", "JPN"), long = TRUE)
# "Western Pacific" "Western Pacific" (Indonesia in WPR since May 2025)
# Non-Member areas return NA
iso3_to_region(c("PRI", "TKL", "PHL"))
# NA NA "WPR"
Maps UN M49 numeric area codes to ISO 3166-1 alpha-3 country codes
using the who_countries dataset shipped with DSIR. Counterpart
to iso3_to_m49() and used internally by sdg_clean() to populate
the iso3 column on SDG output.
m49_to_iso3(m49)
m49 |
Character vector of M49 codes. |
M49 codes that do not correspond to a WHO Member State return NA.
This includes region / world aggregates (e.g. "900" for World,
"001" for World, "419" for Latin America and the Caribbean)
and codes for non-Member areas (e.g. Puerto Rico, Tokelau).
Input accepts either the zero-padded form ("076") or the bare
form ("76"); both are normalised before lookup. Non-numeric
input returns NA (with a single warning from the underlying
as.integer() coercion).
A character vector the same length as m49. Non-Member
codes (region aggregates, non-Member areas) return NA.
who_countries, iso3_to_m49(), sdg_clean().
m49_to_iso3(c("608", "250", "392"))
# "PHL" "FRA" "JPN"
# Zero-padded and bare forms both accepted
m49_to_iso3(c("076", "76"))
# "BRA" "BRA"
# Non-Member areas / aggregates return NA
m49_to_iso3(c("900", "608"))
# NA "PHL"
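The normalisation of "076" and "76" to a single form can be sketched with as.integer() and sprintf(). `normalise_m49()` is a hypothetical helper written for illustration; the manual only states that as.integer() coercion is involved, so the padding step shown here is an assumption.

```r
# Hypothetical sketch: coerce to integer, then re-pad to three characters.
# Non-numeric input becomes NA (as.integer() also emits a coercion warning).
normalise_m49 <- function(m49) {
  n <- as.integer(m49)
  ifelse(is.na(n), NA_character_, sprintf("%03d", n))
}

normalise_m49(c("076", "76"))   # "076" "076"
normalise_m49(c("8", "608"))    # "008" "608"
```

After this step both spellings hit the same row in a lookup keyed by the zero-padded code.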
Character vector of ISO3 codes for the 14 WHO Member States classified as Pacific Island Countries (PICs). Sorted alphabetically.
pic_cty
A character vector of 14 ISO 3166-1 alpha-3 codes.
All 14 PICs are within the Western Pacific Region, so
all(pic_cty %in% wpro_cty) is TRUE. Non-Member Pacific areas
(e.g. New Caledonia, French Polynesia, American Samoa, Tokelau) are
not included.
The 14 PIC Member States are: Cook Islands, Fiji, Kiribati, Marshall Islands, Micronesia (Federated States of), Nauru, Niue, Palau, Papua New Guinea, Samoa, Solomon Islands, Tonga, Tuvalu, and Vanuatu.
length(pic_cty)               # 14
all(pic_cty %in% wpro_cty)    # TRUE
Thin wrappers around ggplot2::scale_y_continuous() and
ggplot2::scale_x_continuous() that remove the default lower expansion
so that columns sit flush with the axis — the convention for WHO and
most publication-style bar charts. The upper expansion is preserved at
5% so the tallest column has breathing room above (or to the right of) it.
scale_y_dsi_col(...)
scale_x_dsi_col(...)
... |
Arguments forwarded to the underlying
|
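Use scale_y_dsi_col() when the bar values are mapped to y: vertical bars
(geom_col() / geom_bar()), including bars flipped with coord_flip(), which
swaps the drawn axes but leaves the values on the y scale. Use
scale_x_dsi_col() when the values are mapped to x, as with
geom_col(orientation = "y").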
Pass any other scale_*_continuous() argument
(labels, breaks, limits, ...) through ....
A ggplot2 Scale object, to be added to a plot with +.
library(ggplot2)
# Vertical bars
ggplot(mtcars, aes(factor(cyl))) +
  geom_bar(fill = "#0093D5") +
  scale_y_dsi_col() +
  theme_dsi()
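The effect of "no lower expansion, 5% upper expansion" is easy to work through by hand. The base R sketch below applies multiplicative padding to an axis range; `expand_range_sketch()` is a hypothetical helper. In ggplot2 terms this would correspond to an `expand` setting of `expansion(mult = c(0, 0.05))`, though whether the wrappers are implemented exactly that way is an assumption.

```r
# Hypothetical sketch of multiplicative axis expansion:
# each side is padded by a fraction of the data range.
expand_range_sketch <- function(limits, mult = c(0, 0.05)) {
  span <- diff(limits)
  c(limits[1] - mult[1] * span, limits[2] + mult[2] * span)
}

# Bars running from 0 to 40: the axis starts exactly at 0, so bars sit
# flush with it, and the top gains 5% of the range.
expand_range_sketch(c(0, 40))                      # 0 42

# The usual symmetric 5% padding for comparison, which leaves a gap
# below the bars.
expand_range_sketch(c(0, 40), mult = c(0.05, 0.05))  # -2 42
```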
Fetches the list of geographic areas available from the UN SDG database.
sdg_areas()
A tibble with area codes and names, or
NULL when the service is unreachable.
sdg_areas()
Selects, renames, and type-casts the most useful columns from an
SDG observation table returned by sdg_data(), producing a compact
tibble in the unified DSIR cleaned-indicator schema — the same
schema produced by gho_clean(), so the two outputs can be combined
directly with bind_indicators().
sdg_clean(df)
df |
A data frame returned by sdg_data(). |
The mapping (SDG source → unified column) is:
indicator (list-column, flattened) → id (e.g. "3.4.1")
seriesDescription → indicator (human-readable
label; NA if the API response does not include it)
geoAreaCode → location (UN M49 numeric,
as character); also iso3 via m49_to_iso3() for WHO Member
States — region / world aggregates and non-Member areas get
iso3 = NA
location_name is resolved by looking up iso3 against
who_countries (so a WHO Member State has the same
location_name here and in gho_clean() output), with a
fallback to the SDG API's raw geoAreaName for non-Member-State
rows (e.g. regional / world aggregates)
timePeriodStart → year (integer)
value → value (character; raw)
and value_num (numeric; NA for non-numeric entries like
"<0.1" or aggregate notes)
lowerBound, upperBound → low, high (numeric)
series → series
Three columns are always present but never populated for SDG
output: dim1, dim2, dim3 (GHO-only concepts).
A tibble with 15 columns: source (always
"sdg"), id, indicator, location, iso3, location_name,
year, value, value_num, low, high, series, dim1
(NA), dim2 (NA), dim3 (NA). Sorted by location then
year. Empty input returns an empty tibble with the same columns
and types.
sdg_data(), gho_clean(), bind_indicators(),
m49_to_iso3().
sdg_data("3.2.1", area = "156", year_from = 2015) |> sdg_clean()
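The split between the raw value column and the numeric value_num column can be illustrated with plain as.numeric() coercion. This is a sketch under the assumption that non-numeric entries are simply coerced to NA; the sample strings are made up and the package's actual parsing may be more elaborate.

```r
# Raw values as they might arrive from the API, kept as character.
value <- c("12.5", "<0.1", "7")

# Numeric companion column: entries that are not plain numbers become NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
value_num <- suppressWarnings(as.numeric(value))
value_num  # 12.5 NA 7
```

Keeping both columns means a censored entry such as "<0.1" is still visible in `value` even though it drops out of numeric summaries.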
A single SDG indicator (for example "3.4.1", NCD mortality) is
typically published as several series stratified by sex, age,
or cause. Different series may have different country and year
coverage. sdg_coverage() summarises year range and observation
count per (location, series) combination, so you can see which
series exist for an indicator and how each one is covered before
committing to a downstream analysis.
sdg_coverage(indicator, area = NULL, year_from = NULL, year_to = NULL)
indicator |
Character vector of SDG indicator codes
(e.g. |
area |
Character vector of country/area codes. Accepts either
ISO3 codes (e.g. |
year_from |
Numeric. Start year filter (inclusive).
Default NULL. |
year_to |
Numeric. End year filter (inclusive).
Default NULL. |
Unlike the GHO availability helpers, this function is a
series-exploration tool rather than a payload-saving precheck:
SDG data is generally complete enough that GHO-style
has_data() / count() helpers add little value, so they are
intentionally not provided. The SDG API also offers no
payload-reduction option (no $select equivalent), so
sdg_coverage() calls sdg_data() internally and aggregates
the result client-side.
A tibble with one row per
(location, series) and columns:
location (chr) — area code (geoAreaCode).
series (chr) — SDG series code.
year_min (int) — earliest year with data.
year_max (int) — latest year with data.
n_obs (int) — number of observations.
Sorted by location then series. Empty input or service
failure returns an empty tibble with the same five columns.
sdg_data(), sdg_indicators(), gho_coverage().
# Series available for NCD mortality in China and Brazil
sdg_coverage("3.4.1", area = c("156", "076"))
# Filter to a year range
sdg_coverage("3.4.1", area = "156", year_from = 2015)
Retrieves data for one or more SDG indicators from the UN SDG API, with optional filters by area and year.
sdg_data(
  indicator,
  area = NULL,
  year_from = NULL,
  year_to = NULL,
  page_size = 1000L
)
indicator |
Character vector of indicator codes
(e.g. |
area |
Character vector of country/area codes. Accepts either
ISO3 codes (e.g. |
year_from |
Numeric. Start year filter (inclusive).
Default NULL. |
year_to |
Numeric. End year filter (inclusive).
Default NULL. |
page_size |
Integer. Number of records per page.
Default 1000L. |
A tibble of indicator observations, or an empty tibble when the service is unreachable or there are no matching rows.
sdg_indicators(), sdg_areas(), iso3_to_m49().
# One indicator, one country: the typical entry point
sdg_data("1.1.1", area = "PHL")
# Specific area and year range (M49 code)
sdg_data("3.2.1", area = "156", year_from = 2015, year_to = 2023)
# ISO3 codes work directly, so DSIR's regional vectors can be passed in
sdg_data("3.4.1", area = c("PHL", "FRA", "JPN"))
Fetches the list of Sustainable Development Goals from the UN SDG API.
sdg_goals(include_children = FALSE)
include_children |
Logical. Include targets and indicators
nested under each goal? Default FALSE. |
A list (or tibble) of SDG goals, or
NULL when the service is unreachable.
sdg_targets(), sdg_indicators(), sdg_data().
sdg_goals()
sdg_goals(include_children = TRUE)
Fetches the list of SDG indicators from the UN SDG API, with optional keyword filtering on the indicator description.
sdg_indicators(search = NULL)
search |
Optional character. Search keywords matched against
the
The filter is applied client-side using
|
A list (or tibble) of SDG indicators,
or NULL when the service is unreachable. When search matches
no rows, an empty tibble with the same columns as the unfiltered
response is returned.
# Full list
sdg_indicators()
# Single keyword
sdg_indicators("mortality")
# Multi-keyword: AND semantics
sdg_indicators("mortality cancer")
sdg_indicators(c("maternal", "mortality"))
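The AND semantics shown in the examples can be sketched as a client-side filter in base R: split the search input into terms and keep only rows where every term appears. `match_all_terms()` is a hypothetical helper, and the case-insensitive literal matching is an assumption, since the matching function used by the package is not shown here.

```r
# Hypothetical sketch of multi-keyword AND matching on a text column.
match_all_terms <- function(text, search) {
  # Accept either one space-separated string or a vector of terms.
  terms <- unlist(strsplit(search, "\\s+"))
  terms <- terms[nzchar(terms)]
  hits <- rep(TRUE, length(text))
  for (term in terms) {
    # Assumed: case-insensitive, literal (non-regex) matching.
    hits <- hits & grepl(tolower(term), tolower(text), fixed = TRUE)
  }
  hits
}

descriptions <- c(
  "Maternal mortality ratio",
  "Mortality rate attributed to cardiovascular disease, cancer, diabetes or chronic respiratory disease",
  "Proportion of births attended by skilled health personnel"
)

match_all_terms(descriptions, "mortality cancer")           # FALSE TRUE FALSE
match_all_terms(descriptions, c("maternal", "mortality"))   # TRUE FALSE FALSE
```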
Fetches the list of SDG targets from the UN SDG API.
sdg_targets(include_children = FALSE)
include_children |
Logical. Include indicators nested under
each target? Default FALSE. |
A list (or tibble) of SDG targets, or
NULL when the service is unreachable.
sdg_goals(), sdg_indicators().
sdg_targets()
A clean, modern theme tuned for WHO and global-health publications.
Removes the panel border, draws light grid lines, and uses a muted
text colour so that the data — not the chart chrome — is the visual
focus. The grid argument controls which direction(s) the grid
lines run, so the theme works equally well for vertical bars,
horizontal bars (via coord_flip()), scatter plots, and line charts.
theme_dsi(
  base_size = 12,
  base_family = "",
  accent = "#0093D5",
  grid_color = "grey92",
  grid = c("both", "x", "y", "none"),
  legend_position = "bottom"
)
base_size |
Base font size in points. Default 12. |
base_family |
Base font family. Default "". |
accent |
Accent colour used for axis lines and as a default for
highlight elements. Default "#0093D5". |
grid_color |
Colour of the major grid. Default "grey92". |
grid |
Which direction(s) to draw major grid lines. One of
"both", "x", "y", or "none". Default "both". |
legend_position |
Position of the legend. Default "bottom". |
A ggplot2 theme object.
theme_dsi_facet() for a sibling theme tuned for faceted plots.
library(ggplot2)
# Default: grid in both directions, works under coord_flip()
ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  theme_dsi() +
  labs(
    title = "Fuel efficiency by weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders"
  )
# Minimal look: only horizontal grid lines
ggplot(mtcars, aes(wt, mpg)) +
  geom_point(size = 3, color = "#0093D5") +
  theme_dsi(grid = "y") +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
A sibling of theme_dsi() tuned for faceted plots. theme_dsi() uses
half-frame axis lines, a look that suits a single panel; repeated across
every facet, those lines look heavy and the panels run together.
theme_dsi_facet() replaces the axis lines with a light panel border,
draws grid lines on both axes, gives the strip a soft background, and
inserts whitespace between panels.
theme_dsi_facet(
  base_size = 12,
  base_family = "",
  accent = "#0093D5",
  grid_color = "grey92",
  strip_fill = "grey95",
  strip_color = "grey20",
  grid = c("both", "x", "y", "none"),
  legend_position = "bottom"
)
base_size |
Base font size in points. Default 12. |
base_family |
Base font family. Default "". |
accent |
Accent colour, kept for parity with theme_dsi(). Default "#0093D5". |
grid_color |
Colour of the major grid (both axes). Default "grey92". |
strip_fill |
Background fill colour for facet strips. Default "grey95". |
strip_color |
Text colour inside facet strips. Default "grey20". |
grid |
Which direction(s) to draw major grid lines. One of
"both", "x", "y", or "none". Default "both". |
legend_position |
Position of the legend. Default "bottom". |
Use theme_dsi() for single-panel plots and theme_dsi_facet() for
plots with facet_wrap() or facet_grid(). Shared elements (text
styles, title block, legend, plot margins) match theme_dsi() exactly,
so the two themes feel like the same family.
A ggplot2 theme object.
theme_dsi() for the single-panel sibling.
library(ggplot2)
ggplot(mtcars, aes(wt, mpg)) +
  geom_point(size = 2, color = "#0093D5") +
  facet_wrap(~ cyl, labeller = label_both) +
  theme_dsi_facet() +
  labs(
    title = "Fuel efficiency by cylinder count",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  )
A tibble of the 194 World Health Organization (WHO) Member States, with standard country identifiers and WHO/Pacific groupings used across DSIR analytical workflows.
who_countries
A tibble with 194 rows and 7 columns:
ISO 3166-1 alpha-3 code (3-letter, e.g. "PHL").
ISO 3166-1 alpha-2 code (2-letter, e.g. "PH").
UN M49 numeric code, stored as a 3-character string with
leading zeros where present (e.g. "008" for Albania). Stored as
character because some downstream APIs (notably the UN SDG API) expect
the leading-zero form.
WHO official English name (e.g.
"Iran (Islamic Republic of)"). Use for formal documents and reports.
A shorter form suitable for charts and tables (e.g.
"Iran"). Equal to name_official for countries whose official name
is already concise. See Details.
WHO region code: one of "AFR", "AMR", "SEAR",
"EUR", "EMR", "WPR".
Logical. TRUE for the 14 Pacific Island Country (PIC)
Member States in WPR, FALSE otherwise.
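Because the M49 code is stored as a zero-padded 3-character string, numeric codes from other sources need padding before they can be matched against it. A minimal base R sketch, where `pad_m49()` is a hypothetical helper and not part of the package:

```r
# Hypothetical helper, not part of dsir: convert numeric M49 codes to the
# 3-character, zero-padded form described above (e.g. 8 becomes "008").
pad_m49 <- function(x) {
  sprintf("%03d", as.integer(x))
}

pad_m49(8)              # Albania's code from the description above
pad_m49(c(8, 50, 608))  # vectorised over several codes
```

Padding on the way in keeps joins on the M49 column from silently dropping codes below 100.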
Scope. WHO has 194 Member States plus 2 Associate Members (Puerto Rico, Tokelau) and reports on additional non-Member areas (e.g. West Bank and Gaza Strip). This dataset includes only the 194 Member States. Cook Islands and Niue are full WHO Member States and are included even though they are not UN member states.
Region coverage (as of May 2025, after WHO EB156 reassignment of Indonesia from SEAR to WPR):
AFR (African Region): 47
AMR (Region of the Americas): 35
SEAR (South-East Asia Region): 10
EUR (European Region): 53
EMR (Eastern Mediterranean Region): 21
WPR (Western Pacific Region): 28
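As a quick consistency check, the six regional counts listed above should add up to the 194 Member States in the table. A small sketch using only the numbers from this page:

```r
# Regional Member State counts as listed above
region_counts <- c(AFR = 47, AMR = 35, SEAR = 10, EUR = 53, EMR = 21, WPR = 28)

total <- sum(region_counts)
total  # 194

# With the package loaded, the same counts would be expected from
# table(who_countries$who_region); that call is not run here.
```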
Short names. name_short differs from name_official for 13
countries whose official name includes a parenthetical descriptor or is
otherwise long. Examples: "DPR Korea", "DR Congo", "Lao PDR",
"United Kingdom". These short forms follow conventions used in WHO
regional reports and OECD Health at a Glance: Asia/Pacific.
PICs. The Pacific Island Countries flag (is_pic) marks the 14
WHO Member States in the Pacific sub-region: Cook Islands, Fiji,
Kiribati, Marshall Islands, Micronesia (Federated States of), Nauru,
Niue, Palau, Papua New Guinea, Samoa, Solomon Islands, Tonga, Tuvalu,
and Vanuatu. Non-Member Pacific areas (e.g. New Caledonia, French
Polynesia, American Samoa) are not included in this dataset.
WHO official region listing: https://www.who.int/countries
ISO 3166-1 codes and UN M49 numeric codes: UN Statistics Division https://unstats.un.org/unsd/methodology/m49/
# All WPR Member States, sorted alphabetically by short name
wpr <- who_countries[who_countries$who_region == "WPR", ]
wpr <- wpr[order(wpr$name_short), c("iso3", "name_short")]
head(wpr)

# Filter a data frame to PIC member states
# df_pic <- subset(my_data, country_iso3 %in% who_countries$iso3[who_countries$is_pic])
Convenience character vectors of ISO3 codes for the WHO Member States
in each region. Each vector is the regional subset of
who_countries's iso3 column, sorted alphabetically.
afro_cty
amro_cty
searo_cty
euro_cty
emro_cty
wpro_cty
Character vectors of ISO 3166-1 alpha-3 codes.
An object of class character of length 47.
An object of class character of length 35.
An object of class character of length 10.
An object of class character of length 53.
An object of class character of length 21.
An object of class character of length 28.
These are derived directly from who_countries and exist for
ergonomic filtering, e.g.
dplyr::filter(data, iso3 %in% wpro_cty).
If you need to update group membership, edit
data-raw/who_countries.R and re-run that script; the vectors are
regenerated from the master table.
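The derivation described above (regional subset of the iso3 column, sorted alphabetically) can be sketched in base R. The table `toy` and the helper `region_vector()` are hypothetical stand-ins for illustration; the actual script in data-raw/ is not shown in this documentation and may differ.

```r
# Toy stand-in for who_countries with only the two columns needed here
toy <- data.frame(
  iso3       = c("PHL", "FJI", "JPN", "FRA", "KEN"),
  who_region = c("WPR", "WPR", "WPR", "EUR", "AFR"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: ISO3 codes for one region, sorted alphabetically
region_vector <- function(tbl, region) {
  sort(tbl$iso3[tbl$who_region == region])
}

toy_wpro <- region_vector(toy, "WPR")
toy_wpro

# A vector like this can then be used for filtering
toy[toy$iso3 %in% toy_wpro, ]
```

Deriving every vector from one master table means a change in regional membership only has to be made in one place.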
afro_cty: 47 Member States in the WHO African Region.
amro_cty: 35 Member States in the WHO Region of the Americas.
searo_cty: 10 Member States in the WHO South-East Asia Region.
euro_cty: 53 Member States in the WHO European Region.
emro_cty: 21 Member States in the WHO Eastern Mediterranean Region.
wpro_cty: 28 Member States in the WHO Western Pacific Region. Includes Indonesia from May 2025 (per WHO EB156).
length(wpro_cty)      # 28
"IDN" %in% wpro_cty   # TRUE (Indonesia in WPR since May 2025)