Live Birth Data from SINASC with healthbR
Source: vignettes/sinasc-live-births.Rmd
Overview
SINASC (Sistema de Informações sobre Nascidos Vivos) is Brazil’s national live birth information system, managed by the Ministry of Health through DATASUS. It records individual birth certificates (Declaração de Nascido Vivo) with maternal, delivery, and newborn characteristics.
| Feature | Details |
|---|---|
| Coverage | Per federative unit (UF): all 27 (26 states and the Federal District) |
| Years | 1996–2024 |
| Unit | One row per live birth certificate |
| Format | .dbc files from DATASUS FTP |
Getting started
Check available years
library(healthbR)
sinasc_years()
# include preliminary data
sinasc_years(status = "all")
Downloading data
Basic download
births <- sinasc_data(year = 2022, uf = "AC")
Multiple states and years
births <- sinasc_data(year = 2020:2022, uf = c("SP", "RJ"))
Filter by congenital anomaly
Pass an ICD-10 (CID-10 in Portuguese) code prefix to keep only births with a matching congenital anomaly:
# Down syndrome (Q90)
down <- sinasc_data(year = 2022, uf = "SP", anomaly = "Q90")
# All congenital anomalies (Chapter XVII)
anomalies <- sinasc_data(year = 2022, uf = "SP", anomaly = "Q")
Select variables
births <- sinasc_data(
  year = 2022,
  uf = "SP",
  vars = c("DTNASC", "SEXO", "PESO", "IDADEMAE", "GESTACAO",
           "PARTO", "CONSULTAS", "CODMUNRES")
)
Key variables
| Variable | Description |
|---|---|
| DTNASC | Birth date |
| SEXO | Sex (1=Male, 2=Female, 0=Unknown) |
| PESO | Birth weight (grams) |
| IDADEMAE | Mother’s age (years) |
| GESTACAO | Gestational age (weeks, categorized) |
| PARTO | Delivery type (1=Vaginal, 2=Cesarean) |
| CONSULTAS | Prenatal consultations (categorized) |
| CODANOMAL | Congenital anomaly (ICD-10 code) |
| CODMUNRES | Municipality of mother’s residence (6-digit IBGE code) |
| ESCMAE | Mother’s education level |
| RACACOR | Newborn’s race/color |
| APGAR1, APGAR5 | Apgar score at 1 and 5 minutes |
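Several of these variables are stored as codes rather than labels. As a minimal sketch of turning the codes from the table into readable labels, the snippet below uses a small made-up data frame in place of real SINASC output and assumes the codes arrive as character strings. The object `toy` and the new column names are illustrative only and are not part of healthbR.

```r
# Made-up records using the SEXO and PARTO codes listed in the table above.
toy <- data.frame(
  SEXO  = c("1", "2", "2", "0"),
  PARTO = c("2", "1", "2", "1"),
  stringsAsFactors = FALSE
)

# Map raw codes to labels; any code not listed in `levels` becomes NA.
toy$sex <- factor(toy$SEXO,
                  levels = c("1", "2", "0"),
                  labels = c("Male", "Female", "Unknown"))
toy$delivery <- factor(toy$PARTO,
                       levels = c("1", "2"),
                       labels = c("Vaginal", "Cesarean"))

# Cross-tabulate sex by delivery type.
table(toy$sex, toy$delivery)
```

Keeping the original coded columns next to the labeled ones makes it easy to check the mapping against the data dictionary.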
Data dictionary
sinasc_dictionary()
sinasc_dictionary("PARTO")
sinasc_dictionary("GESTACAO")
Explore variables
sinasc_variables()
sinasc_variables(search = "mae")
sinasc_variables(search = "peso")
Example: Low birth weight by state
library(dplyr)
births <- sinasc_data(year = 2022, uf = c("SP", "RJ", "MG", "BA", "RS"))
lbw <- births |>
  mutate(weight = as.numeric(PESO)) |>
  # drop missing and zero weights before computing the share
  filter(!is.na(weight), weight > 0) |>
  # low birth weight: under 2500 g
  mutate(low_weight = weight < 2500) |>
  group_by(uf_source) |>
  summarise(
    total = n(),
    low_weight_n = sum(low_weight),
    low_weight_pct = low_weight_n / total * 100
  )
Example: Teen pregnancy
births <- sinasc_data(year = 2022, uf = "SP")
teen <- births |>
  filter(!is.na(IDADEMAE)) |>
  mutate(
    mother_age = as.integer(IDADEMAE),
    teen_mother = mother_age < 20
  ) |>
  summarise(
    total = n(),
    teen_n = sum(teen_mother, na.rm = TRUE),
    teen_pct = teen_n / total * 100
  )
Smart type parsing
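The calls in this section return typed columns by default and all-character columns with `parse = FALSE`. As a rough illustration of what that conversion involves, the sketch below parses two made-up character columns by hand. It assumes that raw `DTNASC` values are day-month-year strings without separators (ddmmyyyy). The objects `raw` and `parsed` are illustrative only, and this is not a description of how healthbR parses internally.

```r
# Made-up raw values, mimicking all-character columns.
raw <- data.frame(
  DTNASC = c("01022022", "15112022"),
  PESO   = c("3200", "2450"),
  stringsAsFactors = FALSE
)

# Assumed layout: ddmmyyyy for dates, whole grams for weight.
parsed <- data.frame(
  DTNASC = as.Date(raw$DTNASC, format = "%d%m%Y"),
  PESO   = as.integer(raw$PESO)
)

class(parsed$DTNASC)
class(parsed$PESO)
```

Values that do not match the assumed date layout come back as `NA`, so counting `NA`s after conversion is a quick sanity check.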
# parsed types (default)
births <- sinasc_data(year = 2022, uf = "AC")
class(births$DTNASC) # Date
class(births$PESO) # integer
# all character
births_raw <- sinasc_data(year = 2022, uf = "AC", parse = FALSE)
Cache and lazy evaluation
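library(dplyr)
# inspect the local cache
sinasc_cache_status()
# clear the local cache
sinasc_clear_cache()
# lazy query: filter first, then collect() to load the result into memory
lazy <- sinasc_data(year = 2022, uf = "SP", lazy = TRUE)
lazy |>
  filter(PARTO == "2") |>
  collect()
Further reading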
- SINASC on DATASUS (datasus.saude.gov.br)
- SIM vignette for mortality data
- Census vignette for population denominators