Format Health Outcome Data for geoExposeR
Formats and validates health outcome data to ensure compatibility with geoExposeR's requirements.
Usage
format_health_data(
  health_data,
  id_col = "FIPS",
  outcome_cols,
  covariate_cols = NULL,
  geoid_digits = 5,
  validate = TRUE,
  output_file = NULL,
  output_format = c("txt", "csv")
)
Arguments
- health_data
A data frame containing health outcomes and covariates.
- id_col
Character. Column name containing geographic identifiers (e.g., FIPS codes). Default is "FIPS".
- outcome_cols
Character vector. Column names of health outcome variables to include.
- covariate_cols
Optional character vector. Column names of covariate variables to include.
- geoid_digits
Integer. Number of digits for FIPS code formatting. Default is 5 (county level).
- validate
Logical. Whether to perform validation checks. Default TRUE.
- output_file
Optional file path at which to save the formatted data. Default is NULL.
- output_format
Character. Output format: "txt" (tab-delimited) or "csv". Default is "txt".
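The `c("txt", "csv")` default in Usage follows the common R idiom in which the first choice is used when the argument is not supplied. The sketch below illustrates that idiom with base R only. `write_formatted()` is a hypothetical helper, not part of geoExposeR, and the package's actual writer may differ.

```r
# Hypothetical helper for illustration; not geoExposeR's implementation.
write_formatted <- function(data, output_file, output_format = c("txt", "csv")) {
  # match.arg() returns the first choice ("txt") when the argument is not supplied
  output_format <- match.arg(output_format)
  # "txt" is tab-delimited, "csv" is comma-delimited
  sep <- if (output_format == "txt") "\t" else ","
  utils::write.table(
    data,
    file = output_file,
    sep = sep,
    row.names = FALSE,
    quote = output_format == "csv",  # quote fields only in CSV output
    na = "NA"
  )
  invisible(output_format)
}

tmp <- tempfile(fileext = ".txt")
used_format <- write_formatted(data.frame(FIPS = "06001", BWT = 3200), tmp)

# Read identifiers back as character so leading zeros are preserved
round_trip <- utils::read.delim(tmp, colClasses = c(FIPS = "character"))
```

Reading the identifier column as character matters when loading such a file again, since a numeric read would drop the leading zero that the formatting step added.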
Details
The function performs the following operations:
- Formats FIPS codes to the specified number of digits with zero-padding
- Validates that outcome columns exist and contain valid data
- Standardizes missing value representation to NA
- Optionally validates data ranges for common health outcomes
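The first and third operations can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code: `pad_geoid()` and `standardize_missing()` are hypothetical helpers, and the list of sentinel values treated as missing is assumed for the example.

```r
# Hypothetical helpers for illustration; not part of geoExposeR.

# Left-pad geographic identifiers with zeros to a fixed width.
pad_geoid <- function(x, digits = 5) {
  missing <- is.na(x)
  if (is.numeric(x)) {
    # sprintf avoids scientific notation when converting numeric codes to text
    x <- sprintf("%.0f", x)
  } else {
    x <- trimws(as.character(x))
  }
  # Number of zeros each code still needs (never negative)
  n_pad <- pmax(digits - nchar(x), 0)
  out <- paste0(strrep("0", n_pad), x)
  # Keep missing identifiers missing rather than turning them into text
  out[missing] <- NA_character_
  out
}

# Replace assumed sentinel codes for "missing" with NA, keeping the column type.
standardize_missing <- function(x, sentinels = c("", "NA", "N/A", ".", "-999")) {
  x[trimws(as.character(x)) %in% sentinels] <- NA
  x
}

padded <- pad_geoid(c(6001, 6003, 6005))
cleaned <- standardize_missing(c(3200, -999, 3150))
```

With `digits = 5`, the numeric code 6001 becomes the character string "06001", which is why formatted identifiers should be stored as character rather than numeric.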
Examples
if (FALSE) { # \dontrun{
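  # FIPS codes stored as numbers lose their leading zero (06001 becomes 6001)
  health <- data.frame(
    FIPS = c(6001, 6003, 6005),
    BWT = c(3200, 3150, NA),  # NA marks a missing outcome value
    OEGEST = c(39.2, 38.8, 40.1),
    MAGE = c(28, 32, 25)
  )
  # geoid_digits defaults to 5, so codes are zero-padded to county-level width;
  # output_format defaults to "txt" (tab-delimited)
  formatted <- format_health_data(
    health_data = health,
    id_col = "FIPS",
    outcome_cols = c("BWT", "OEGEST"),
    covariate_cols = c("MAGE"),
    output_file = "health_outcomes.txt"
  )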
} # }