Analyzing Health Data from POF with healthbR
Source: vignettes/pof-health-data.Rmd
Overview
The POF (Pesquisa de Orçamentos Familiares) is a household budget survey conducted by IBGE that investigates household expenditures, living conditions, and nutritional profiles of the Brazilian population. It is conducted in partnership with the Ministry of Health.
The healthbR package provides access to POF microdata
with a focus on health-related data:
| Module | Description | Available editions |
|---|---|---|
| Food Security (EBIA) | Brazilian Food Insecurity Scale | 2017-2018 |
| Food Consumption | Detailed personal food intake | 2008-2009, 2017-2018 |
| Anthropometry | Weight, height, BMI | 2008-2009 |
| Health Expenses | Medications, insurance, consultations | All editions |
Getting started
Check available editions
pof_years()
#> [1] "2002-2003" "2008-2009" "2017-2018"
Survey information
Use pof_info() to see which health modules are available
for each edition:
pof_info("2017-2018")
List available registers
Each POF edition contains multiple data registers. Use
pof_registers() to see them:
# all registers
pof_registers("2017-2018")
# only health-related registers
pof_registers("2017-2018", health_only = TRUE)
Explore variables
Before downloading data, you can browse available variables:
# list all variables in the domicilio register
pof_variables("2017-2018", "domicilio")
# search for food security variables
pof_variables("2017-2018", search = "ebia")
# search for weight-related variables
pof_variables("2017-2018", "morador", search = "peso")
Food Security Analysis (EBIA)
The EBIA (Escala Brasileira de Insegurança
Alimentar) is available in the 2017-2018 edition through the
domicilio register. The variable V6199
contains the food security classification.
Download domicilio data
domicilio <- pof_data("2017-2018", "domicilio")
EBIA classification
The EBIA classifies households into four levels:
| Code | Classification |
|---|---|
| 1 | Food security |
| 2 | Mild food insecurity |
| 3 | Moderate food insecurity |
| 4 | Severe food insecurity |
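As a rough illustration of how these codes are typically used, the sketch below labels a toy vector of codes and compares unweighted and weighted shares in base R. The data frame `toy` and its `weight` column are hypothetical and made up for this example; they are not POF values, and the real weight variable in the register may be named differently.

```r
# Toy data: EBIA codes (as stored in V6199) and a hypothetical weight column.
# All values are invented for illustration; they are not POF results.
toy <- data.frame(
  V6199  = c(1, 1, 2, 3, 4, 1, 2, NA),
  weight = c(100, 300, 150, 50, 50, 250, 100, 80)  # hypothetical weights
)

ebia_labels <- c(
  "Food security",
  "Mild food insecurity",
  "Moderate food insecurity",
  "Severe food insecurity"
)
toy$ebia <- factor(toy$V6199, levels = 1:4, labels = ebia_labels)

# Keep only households with a non-missing classification
ok <- !is.na(toy$ebia)

# Unweighted share of sampled households in each level
unweighted_share <- prop.table(table(toy$ebia[ok]))

# Weighted share: sum of weights per level divided by the total weight
weighted_share <- tapply(toy$weight[ok], toy$ebia[ok], sum) / sum(toy$weight[ok])

round(cbind(unweighted = unweighted_share, weighted = weighted_share), 3)
```

In this toy data the weighted share of food-secure households is 0.65, while the unweighted share is 3/7 (about 0.43), which shows how far the two can diverge. The sketch gives point estimates only; standard errors and confidence intervals require the full survey design.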
Weighted estimates with survey design
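POF is a sample survey, so raw counts from the microdata do not represent the population. For population estimates, load the data as a survey design object with as_survey = TRUE and analyze it with srvyr: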
library(dplyr)
library(srvyr)
domicilio_svy <- pof_data("2017-2018", "domicilio", as_survey = TRUE)
# add EBIA categories
domicilio_svy <- domicilio_svy |>
  mutate(
    ebia = factor(
      V6199,
      levels = 1:4,
      labels = c(
        "Food security",
        "Mild insecurity",
        "Moderate insecurity",
        "Severe insecurity"
      )
    )
  )
# weighted prevalence
domicilio_svy |>
  group_by(ebia) |>
  summarize(
    prevalence = survey_mean(na.rm = TRUE, vartype = "ci"),
    n = unweighted(n())
  )
EBIA by state (UF)
# food insecurity by state
domicilio_svy |>
  group_by(UF, ebia) |>
  summarize(
    prevalence = survey_mean(na.rm = TRUE, vartype = "ci"),
    n = unweighted(n())
  ) |>
  filter(ebia == "Severe insecurity") |>
  arrange(desc(prevalence))
Food Consumption Analysis
The consumo_alimentar register contains detailed
personal food intake data from a subsample. This data is available for
the 2008-2009 and 2017-2018 editions.
Download food consumption data
consumo <- pof_data("2017-2018", "consumo_alimentar")
Key variables
| Variable | Description |
|---|---|
| V9001 | Food item code |
| V9005 | Amount consumed |
| V9007 | Unit of measure |
| ENERGIA_KCAL | Energy (kcal) |
| PROTEINA | Protein (g) |
| CARBOIDRATO | Carbohydrate (g) |
| LIPIDIO | Total lipids (g) |
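The nutrient columns are recorded per reported food item, so they need to be summed within each person before any average is taken. The base R sketch below shows that aggregation step on a made-up data frame. `toy_consumo` and all of its values are hypothetical, and the one-row-per-item layout is an assumption about the register based on the variables listed above.

```r
# Toy item-level data: one row per food item reported by a person.
# All values are invented; the real register has many more columns.
toy_consumo <- data.frame(
  COD_UPA        = c(1, 1, 1, 1, 2, 2),
  NUM_DOM        = c(1, 1, 1, 1, 1, 1),
  NUM_UC         = c(1, 1, 1, 1, 1, 1),
  COD_INFORMANTE = c(1, 1, 2, 2, 1, 1),
  ENERGIA_KCAL   = c(250, 400, 300, NA, 500, 150),
  PROTEINA       = c(10, 20, 15, 5, 30, 5)
)

# Sum nutrients within each person, identified by the four key columns.
# na.action = na.pass keeps rows with missing nutrients so that
# sum(..., na.rm = TRUE) can skip only the missing values.
per_person <- aggregate(
  cbind(ENERGIA_KCAL, PROTEINA) ~ COD_UPA + NUM_DOM + NUM_UC + COD_INFORMANTE,
  data = toy_consumo,
  FUN = function(x) sum(x, na.rm = TRUE),
  na.action = na.pass
)

# Unweighted mean intake across the three toy persons
colMeans(per_person[, c("ENERGIA_KCAL", "PROTEINA")])
```

Passing `na.action = na.pass` matters here: the default formula interface drops any row with a missing value, which would also discard the non-missing protein value on that row.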
Average caloric intake
library(dplyr)
# total caloric intake per person, summed over all reported food items
# (if a person reported more than one day, this is a total across days)
consumo |>
  group_by(COD_UPA, NUM_DOM, NUM_UC, COD_INFORMANTE) |>
  summarize(
    total_kcal = sum(ENERGIA_KCAL, na.rm = TRUE),
    total_protein = sum(PROTEINA, na.rm = TRUE),
    total_carb = sum(CARBOIDRATO, na.rm = TRUE),
    total_fat = sum(LIPIDIO, na.rm = TRUE),
    .groups = "drop"
  ) |>
  # unweighted sample means across persons, not population estimates
  summarize(
    mean_kcal = mean(total_kcal, na.rm = TRUE),
    mean_protein = mean(total_protein, na.rm = TRUE),
    mean_carb = mean(total_carb, na.rm = TRUE),
    mean_fat = mean(total_fat, na.rm = TRUE)
  )
Health Expenses
The despesa_individual register contains individual
expenses, including health-related spending such as medications, health
insurance, and medical consultations.
Download expense data
despesas <- pof_data("2017-2018", "despesa_individual")
Combining registers
For many analyses you need to combine data from multiple registers.
Use the household identifier variables (COD_UPA,
NUM_DOM, NUM_UC) to merge:
# download morador (demographic data) and domicilio (household data)
morador <- pof_data("2017-2018", "morador")
domicilio <- pof_data("2017-2018", "domicilio")
# merge: add household-level EBIA to individual-level data
morador_ebia <- morador |>
  left_join(
    domicilio |> select(COD_UPA, NUM_DOM, NUM_UC, V6199),
    by = c("COD_UPA", "NUM_DOM", "NUM_UC")
  ) |>
  mutate(
    ebia = factor(
      V6199,
      levels = 1:4,
      labels = c(
        "Food security",
        "Mild insecurity",
        "Moderate insecurity",
        "Severe insecurity"
      )
    )
  )
# food insecurity by age group (unweighted sample percentages)
# right = FALSE gives intervals [0,5), [5,12), ... so that age 0 is included
morador_ebia |>
  mutate(age_group = cut(V0403, breaks = c(0, 5, 12, 18, 30, 60, Inf), right = FALSE)) |>
  count(age_group, ebia) |>
  group_by(age_group) |>
  mutate(pct = n / sum(n) * 100)
Comparing editions
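The POF has been conducted in different years, and the data structure may
vary between editions. Call pof_info() with an edition label, for example
pof_info("2008-2009") or pof_info("2017-2018"), to check what is available in each one.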
Cache management
POF data files are large. healthbR caches downloaded files locally so you only download once:
# check cached files
pof_cache_status()
# clear cache if needed
pof_clear_cache()
If the arrow package is installed, data is cached in
Parquet format for faster loading:
# install arrow for optimized caching (recommended)
install.packages("arrow")
Additional resources
- POF official page (www.ibge.gov.br/estatisticas/sociais/saude/24786-pesquisa-de-orcamentos-familiares-2)
- POF 2017-2018 Food Security publication (biblioteca.ibge.gov.br)
- POF 2017-2018 Food Consumption publication (biblioteca.ibge.gov.br)
- srvyr package documentation