Population Denominators from the Census with healthbR
Overview
The Censo Demográfico (Demographic Census) is the main source of population denominators in Brazil, essential for calculating mortality rates, disease incidence, and other epidemiological indicators.
The healthbR package provides direct access to Census
population data via the IBGE SIDRA API, covering:
| Function | Description | Years |
|---|---|---|
| censo_populacao() | Population by sex, age, race, situation | 1970-2022 |
| censo_estimativa() | Intercensal population estimates | 2001-2021 |
| censo_sidra_data() | Any Census SIDRA table | All available |
Getting started
Check available years
censo_years()
#> [1] "1970" "1980" "1991" "2000" "2010" "2022"
Survey information
censo_info(2022)
Population by state
The most common use case is retrieving population by state to serve as the denominator for health indicators.
# total population by state, Census 2022
pop_state <- censo_populacao(year = 2022, territorial_level = "state")
pop_state
Population by sex
# population by sex, Brazil level
pop_sex <- censo_populacao(
year = 2022,
variables = "sex",
territorial_level = "brazil"
)
pop_sex
Age pyramids
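An age pyramid places one sex to the left of zero and the other to the right, one bar per age group. The sketch here shows that reshaping step on toy counts in base R. The column names (age_group, sex, population) and all numbers are assumptions made for illustration and are not taken from healthbR output.

```r
# Toy counts, invented for illustration only.
ages <- c("0-14", "15-64", "65+")
toy <- data.frame(
  age_group  = rep(ages, times = 2),
  sex        = rep(c("male", "female"), each = 3),
  population = c(300, 900, 100, 290, 920, 140)
)

# Negate male counts so their bars extend to the left of zero.
# This assumes both sexes list the age groups in the same order.
male   <- -toy$population[toy$sex == "male"]
female <-  toy$population[toy$sex == "female"]

# Symmetric axis limit so both sides share one scale.
limit <- max(toy$population)

# Draw only in an interactive session; scripts skip the plot.
if (interactive()) {
  barplot(male, horiz = TRUE, names.arg = ages,
          xlim = c(-limit, limit), las = 1)
  barplot(female, horiz = TRUE, add = TRUE, axes = FALSE)
}
```

The call that follows retrieves the real counts; the column names in its result may differ from the toy ones used here.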
# population by age and sex
pop_age_sex <- censo_populacao(
year = 2022,
variables = "age_sex",
territorial_level = "brazil"
)
pop_age_sex
Population by race/color
# population by race, 2022
pop_race <- censo_populacao(
year = 2022,
variables = "race",
territorial_level = "state"
)
pop_race
Intercensal estimates
For years between censuses, IBGE publishes annual population estimates that serve as denominators:
# population estimates 2015-2021
estimates <- censo_estimativa(
year = 2015:2021,
territorial_level = "state"
)
estimates
Example: calculating a mortality rate
A typical epidemiological workflow combines death counts from the Mortality Information System (SIM) with Census denominators.
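The rate arithmetic can be checked on toy values before any real data is downloaded. This is a minimal base R sketch: the state labels, death counts, and populations are invented for illustration, and the column names (state, n, population) are assumptions that mirror the names used in this example rather than confirmed healthbR output.

```r
# Toy inputs: every number here is invented for illustration only.
pop <- data.frame(
  state      = c("A", "B", "C"),
  population = c(2000000, 500000, 1250000)
)
deaths <- data.frame(
  state = c("A", "B", "C"),
  n     = c(12000, 3500, 10000)
)

# Attach the denominator to each state's death count.
# all.x = TRUE keeps states with deaths even if no population matched.
rates <- merge(deaths, pop, by = "state", all.x = TRUE)

# Crude mortality rate: deaths per 100,000 inhabitants.
rates$rate_per_100k <- rates$n / rates$population * 100000
rates
```

A state left without a matching population row ends up with NA as its rate, which makes join problems visible instead of silently dropping rows. The steps that follow fetch the real denominator with censo_populacao().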
# step 1: get population denominator
pop_2010 <- censo_populacao(
year = 2010,
variables = "total",
territorial_level = "state"
)
# step 2: suppose you have death counts by state (from SIM or another source);
# the commented lines below assume dplyr is attached
# deaths_by_state <- sim_data(year = 2010) |> count(state)
# step 3: calculate the crude mortality rate per 100,000 inhabitants
# mortality_rate <- deaths_by_state |>
#   left_join(pop_2010, by = "state") |>
#   mutate(rate_per_100k = (n / population) * 100000)
Exploring Census tables
The Census module includes a catalog of SIDRA tables organized by theme:
# list all available tables
censo_sidra_tables()
# filter by theme
censo_sidra_tables(theme = "disability")
censo_sidra_tables(theme = "indigenous")
# search by keyword
censo_sidra_search("quilombola")
censo_sidra_search("saneamento")
Custom SIDRA queries
For full flexibility, use censo_sidra_data() to query any Census table directly by its SIDRA table number.
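A query of this kind corresponds to a path on the public SIDRA API, where each argument becomes one path segment. The sketch here assembles such a path to show the mapping. sidra_url() is a hypothetical helper written for this illustration; it is not part of healthbR and is not meant to describe how healthbR builds its requests internally. The path layout and the level codes (n1 for Brazil, n3 for states) are assumed from the public SIDRA API conventions.

```r
# Hypothetical helper: builds a SIDRA "values" path from query pieces.
sidra_url <- function(table, level, year, variable, classifications = list()) {
  # Assumed SIDRA territorial codes: n1 = Brazil, n3 = state (UF).
  level_code <- c(brazil = "n1", state = "n3")[[level]]

  parts <- c(
    "https://apisidra.ibge.gov.br/values",
    "t", table,          # table number
    level_code, "all",   # territorial level, all units
    "v", variable,       # variable code
    "p", year            # period
  )

  # Each classification adds a "c<id>/<categories>" pair.
  for (id in names(classifications)) {
    parts <- c(parts, paste0("c", id), classifications[[id]])
  }

  paste(parts, collapse = "/")
}

url <- sidra_url(9605, "state", 2022, 93, list("86" = "allxt"))
url
```

Reading the path segment by segment is a quick way to check which table, variable, and classification a query will actually request.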
# population by race from table 9605
pop_race_raw <- censo_sidra_data(
table = 9605,
territorial_level = "state",
year = 2022,
variable = 93,
classifications = list("86" = "allxt")
)
pop_race_raw
Historical comparisons
# compare population across census years
pop_2010 <- censo_populacao(year = 2010, territorial_level = "brazil")
pop_2022 <- censo_populacao(year = 2022, territorial_level = "brazil")
# or use estimates for intercensal years
pop_series <- censo_estimativa(
year = 2001:2021,
territorial_level = "brazil"
)
pop_series
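Once two census totals are available, the comparison usually comes down to the absolute change and the average annual growth rate between the counts. The sketch here uses placeholder totals rather than real census figures. annual_growth_rate() is a hypothetical helper that is not part of healthbR, and it assumes geometric (compound) growth between the two dates.

```r
# Hypothetical helper: average annual growth under compound growth.
annual_growth_rate <- function(pop_start, pop_end, years) {
  (pop_end / pop_start)^(1 / years) - 1
}

# Placeholder totals, not real census counts.
pop_start <- 1000000       # total at the earlier census
pop_end   <- 1100000       # total at the later census
years     <- 2022 - 2010   # time elapsed between the two counts

abs_change <- pop_end - pop_start
pct_change <- abs_change / pop_start * 100
growth     <- annual_growth_rate(pop_start, pop_end, years)

data.frame(abs_change, pct_change, annual_growth_pct = growth * 100)
```

The annualized rate makes periods of different length comparable, which matters in this series because the gap between censuses is not always ten years.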