
Downloads POF microdata from the IBGE FTP server and returns it as a tibble. Downloaded files are cached locally to avoid repeated downloads.

Usage

pof_data(
  year = "2017-2018",
  register = "morador",
  vars = NULL,
  cache_dir = NULL,
  as_survey = FALSE,
  refresh = FALSE,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)

Arguments

year

Character. POF edition (e.g., "2017-2018"). Default is "2017-2018".

register

Character. Which register to download. Use pof_registers() to see available options. Default is "morador".

vars

Character vector. Optional: specific variables to select. If NULL, returns all variables from the register. Default is NULL.

cache_dir

Character. Directory for caching downloaded files. Default uses tools::R_user_dir("healthbR", "cache").

as_survey

Logical. If TRUE, returns a survey design object (tbl_svy). Requires the srvyr package. Default is FALSE.

refresh

Logical. If TRUE, re-downloads the data even if the file already exists in the cache. Default is FALSE.

lazy

Logical. If TRUE, returns a lazy query object instead of a tibble. Requires the arrow package. The lazy object supports dplyr verbs (filter, select, mutate, etc.), which are pushed down to the query engine before the data is collected into memory. Call dplyr::collect() to materialize the result. Default is FALSE.

backend

Character. Backend for lazy evaluation: "arrow" (default) or "duckdb". Only used when lazy = TRUE. DuckDB backend requires the duckdb package.
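To check what is already cached before deciding whether refresh = TRUE is needed, the default cache location can be inspected directly. This is a minimal sketch that only resolves the default path given under cache_dir; the names of the files stored there are not documented here, so the listing is shown as-is.

```r
# Resolve the default cache directory named in the cache_dir argument.
# tools::R_user_dir() is available in R >= 4.0.0.
cache_path <- tools::R_user_dir("healthbR", which = "cache")

# List cached files, if the directory has been created yet.
# File names inside the cache are not specified by this page.
cached <- if (dir.exists(cache_path)) {
  list.files(cache_path, recursive = TRUE)
} else {
  character(0)
}

print(cache_path)
print(cached)
```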

Value

A tibble with microdata, a tbl_svy if as_survey = TRUE, or a lazy query object if lazy = TRUE.

Details

The POF (Pesquisa de Orçamentos Familiares) is a household survey conducted by IBGE that investigates household budgets, living conditions, and nutritional profiles of the Brazilian population.

The POF contains several health-related modules:

  • EBIA (Food Security Scale): Available in 2017-2018, variable V6199 in the domicilio register

  • Food Consumption: Detailed food consumption data in the consumo_alimentar register (2008-2009, 2017-2018)

  • Health Expenses: Spending on medications, health insurance, and consultations in the despesa_individual register

  • Anthropometry: Weight, height, and BMI in the morador register (2008-2009 only)

Survey design

For proper statistical analysis that accounts for the complex survey design, use as_survey = TRUE, which creates a survey design object with:

  • Weight variable: PESO_FINAL

  • Stratum variable: ESTRATO_POF

  • PSU variable: COD_UPA
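The design listed above can be approximated by hand with srvyr, which may help when the data has already been loaded as a plain tibble. The sketch below uses a small made-up data frame rather than real POF data; the column idade and all values are hypothetical, and nest = TRUE is an assumption, so the call is not necessarily identical to what as_survey = TRUE does internally.

```r
library(dplyr)
library(srvyr)

# Toy stand-in for a POF register: two strata, two PSUs per stratum.
# All values are invented for illustration only.
toy <- tibble(
  COD_UPA     = c(1, 1, 2, 2, 3, 3, 4, 4),
  ESTRATO_POF = c(1, 1, 1, 1, 2, 2, 2, 2),
  PESO_FINAL  = c(10, 10, 20, 20, 15, 15, 5, 5),
  idade       = c(30, 40, 25, 35, 50, 60, 20, 45)  # hypothetical analysis variable
)

# Declare PSU, stratum, and weight using the variable names documented above.
# nest = TRUE is an assumption, not confirmed by this page.
design <- toy %>%
  as_survey_design(
    ids     = COD_UPA,
    strata  = ESTRATO_POF,
    weights = PESO_FINAL,
    nest    = TRUE
  )

# Design-based weighted mean with its standard error.
est <- design %>%
  summarise(media = survey_mean(idade))

print(est)
```

Estimates computed on the design object use the weights and produce standard errors that reflect the strata and PSUs, which an unweighted mean() on the raw tibble would ignore.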

Data source

Data is downloaded from the IBGE FTP server: https://ftp.ibge.gov.br/Orcamentos_Familiares/

Examples

if (interactive()) {
# basic usage - download morador register
morador <- pof_data("2017-2018", "morador", cache_dir = tempdir())

# download domicilio register (includes EBIA)
domicilio <- pof_data("2017-2018", "domicilio", cache_dir = tempdir())

# select specific variables
df <- pof_data(
  "2017-2018", "morador",
  vars = c("COD_UPA", "ESTRATO_POF", "PESO_FINAL", "V0403"),
  cache_dir = tempdir()
)

# with survey design (requires srvyr package)
morador_svy <- pof_data("2017-2018", "morador", as_survey = TRUE,
                         cache_dir = tempdir())
}
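
The lazy workflow described under the lazy and backend arguments is not covered by the examples above. The following sketch imitates it with a small in-memory Arrow table standing in for the object that pof_data(..., lazy = TRUE) would presumably return; the toy values and the numeric filter on V0403 are assumptions made for illustration, and the real object's class may differ.

```r
library(dplyr)
library(arrow)

# Toy stand-in for a lazily loaded register (values are invented).
# With real data, the starting point would presumably be something like:
#   lazy_tbl <- pof_data("2017-2018", "morador", lazy = TRUE,
#                        cache_dir = tempdir())
toy <- tibble(
  COD_UPA    = c(1, 1, 2, 2),
  PESO_FINAL = c(10, 10, 20, 20),
  V0403      = c(12, 34, 56, 7)
)
lazy_tbl <- arrow::arrow_table(toy)

# dplyr verbs build a query; nothing is pulled into R memory yet.
query <- lazy_tbl %>%
  filter(V0403 >= 18) %>%
  select(COD_UPA, PESO_FINAL, V0403)

# collect() runs the query and returns an ordinary tibble.
result <- collect(query)
print(result)
```

Filtering and selecting before collect() keeps only the needed rows and columns in memory, which is the main reason to prefer lazy = TRUE for the larger registers.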