Prepare Modeled Probability Data from GeoTIFFs
prepare_prob_data.Rd
Extracts exposure probability estimates from GeoTIFF rasters and aggregates them by geographic unit (e.g., county) to create the CSV format required by geoExposeR.
Arguments
- prob_rasters
A named list of file paths to GeoTIFF rasters containing exceedance probabilities. Names should indicate thresholds (e.g., list(gt1 = "prob_gt1.tif", gt5 = "prob_gt5.tif", gt10 = "prob_gt10.tif")).
- boundaries
An sf object containing polygon boundaries (e.g., counties) for spatial aggregation.
- id_col
Character. Column name in boundaries containing the geographic identifier. Supports any identifier format (FIPS codes, census tract IDs, ZIP codes, custom IDs). Default is "GEOID".
- pop_data
Optional data frame with population data. Should have columns for geographic ID and private well population.
- pop_id_col
Character. Column name in pop_data for geographic ID. Default is "GEOID".
- pop_well_col
Character. Column name in pop_data for private well population. Default is "private_well_pop".
- extraction_method
Character. Method for extracting raster values: "centroid" (faster) or "mean" (more accurate). Default is "mean".
- cutoffs
Numeric vector of length 2 specifying concentration cutoffs in ug/L for creating categories. Default is c(5, 10) for categories: <5, 5-10, >=10.
- output_file
Optional file path to save the resulting CSV.
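The two extraction_method options can be illustrated with terra on synthetic data. This is only a sketch of the general idea: the page does not say how prepare_prob_data implements either method, and the raster, polygons, and object names below are made up for illustration.

```r
library(terra)

# Synthetic probability raster: 5 rows x 10 columns on a 0-10 grid
# (values are illustrative, not real model output)
r <- rast(nrows = 5, ncols = 10, xmin = 0, xmax = 10, ymin = 0, ymax = 10,
          crs = "EPSG:4326")
values(r) <- seq(0, 1, length.out = ncell(r))

# Two rectangles standing in for county polygons
units <- vect(
  c("POLYGON ((0 0, 5 0, 5 10, 0 10, 0 0))",
    "POLYGON ((5 0, 10 0, 10 10, 5 10, 5 0))"),
  crs = "EPSG:4326"
)

# "centroid"-style: read the single cell under each polygon's centroid
at_centroid <- extract(r, centroids(units))

# "mean"-style: average the cells whose centers fall inside each polygon
area_mean <- extract(r, units, fun = mean, na.rm = TRUE)

print(at_centroid)
print(area_mean)
```

The centroid approach reads one cell per unit, which is why it is faster but can misrepresent large or irregular units. When exactextractr is installed, exactextractr::exact_extract(r, sf_polygons, "mean") gives a coverage-weighted mean instead of the cell-center rule used here.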
Value
A data.table with columns for geographic ID and multinomial probabilities for each exposure category.
Details
The function converts exceedance probabilities (P(conc > threshold)) to multinomial probabilities that sum to 1:
Category 1: P(conc < lower_cutoff) = 1 - P(conc > lower_cutoff)
Category 2: P(lower_cutoff <= conc < upper_cutoff) = P(conc > lower_cutoff) - P(conc > upper_cutoff)
Category 3: P(conc >= upper_cutoff) = P(conc > upper_cutoff)
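The conversion can be sketched in base R as follows. The helper exceedance_to_multinomial and the column names p_low, p_mid, and p_high are hypothetical and not part of geoExposeR. The pmin() step is an added assumption for the case where independently modeled rasters give P(conc > upper_cutoff) larger than P(conc > lower_cutoff); the package may handle that case differently.

```r
# Hypothetical helper (not part of geoExposeR): turns two exceedance
# probabilities into three category probabilities that sum to 1.
exceedance_to_multinomial <- function(p_gt_lower, p_gt_upper) {
  stopifnot(
    length(p_gt_lower) == length(p_gt_upper),
    all(p_gt_lower >= 0 & p_gt_lower <= 1, na.rm = TRUE),
    all(p_gt_upper >= 0 & p_gt_upper <= 1, na.rm = TRUE)
  )
  # Assumption: cap the upper exceedance at the lower one so the
  # middle category can never be negative.
  p_gt_upper <- pmin(p_gt_upper, p_gt_lower)
  data.frame(
    p_low  = 1 - p_gt_lower,           # Category 1
    p_mid  = p_gt_lower - p_gt_upper,  # Category 2
    p_high = p_gt_upper                # Category 3
  )
}

# Two example units with made-up exceedance probabilities
probs <- exceedance_to_multinomial(
  p_gt_lower = c(0.40, 0.10),
  p_gt_upper = c(0.15, 0.02)
)
print(probs)
```

Each row of probs sums to 1 because the three terms telescope: (1 - a) + (a - b) + b = 1.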
Required Packages
This function requires the terra and sf packages. For the "mean"
extraction method, exactextractr is also recommended for better accuracy.
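A quick way to check these dependencies before calling the function is sketched below, using only base R. The variable names are illustrative, and reporting with message() rather than stopping is a choice made for this sketch.

```r
# Packages the documentation lists as required
required <- c("terra", "sf")

# requireNamespace() returns FALSE instead of erroring when a package is absent
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Install missing packages: ", paste(missing_pkgs, collapse = ", "))
}

# exactextractr is optional; it is only recommended for the "mean" method
has_exact <- requireNamespace("exactextractr", quietly = TRUE)
if (!has_exact) {
  message("exactextractr not found; it is recommended for extraction_method = \"mean\"")
}
```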
Examples
if (FALSE) { # \dontrun{
# Load required packages
library(terra)
library(sf)
# Define raster paths
rasters <- list(
  gt5 = "path/to/prob_gt5.tif",
  gt10 = "path/to/prob_gt10.tif"
)
# Load county boundaries
counties <- st_read("path/to/counties.shp")
# Extract and format probability data
prob_data <- prepare_prob_data(
  prob_rasters = rasters,
  boundaries = counties,
  id_col = "GEOID",
  cutoffs = c(5, 10),
  output_file = "prob_model_output.csv"
)
} # }