
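Downloads the per-trip predicted fishing track files produced by [predict_pds_tracks()] and aggregates them into an H3 hexagonal grid of cumulative fishing effort. Runs are incremental: only files not yet recorded in the manifest are downloaded, and the new results are merged with the previously stored grid. The merged grid is uploaded as a versioned Parquet file to the country-level cloud storage bucket.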

Usage

aggregate_pds_effort(
  log_threshold = logger::DEBUG,
  h3_res = 9L,
  package = "coasts"
)

Arguments

log_threshold

The logging threshold to use. Default is `logger::DEBUG`.

h3_res

Integer (0–15). H3 resolution level for the output grid. Default is `9` (~174 m edge length). Different resolutions write to separate cloud prefixes.

package

Name of the package whose `inst/conf.yml` to read. Defaults to `"coasts"`. Pass your own package name when calling from a downstream package with a compatible configuration.

Value

Invisibly returns the merged H3 grid data frame (columns: `h3_index`, `year`, `fishing_hours`, `unique_trips`, `n_active_days`, `first_active_date`, `last_active_date`, `avg_fidelity_sum`, `n_trips_for_fidelity`, `fishing_pings`), or `NULL` if there was nothing to process.

Details

Predicted track files contain fishing-only GPS points (columns: `trip`, `timestamp`, `latitude`, `longitude`). This function:

1. Lists all files under `conf$pds$pds_tracks_predicted$file_prefix` in the PDS bucket.
2. Downloads only **new** files (incremental via manifest) in parallel using `furrr`.
3. Prepares each track with [prepare_tracks_for_effort()], which computes per-ping time intervals (`dt_hours`), assigns H3 cell indices, and records per-trip total hours for fidelity computation.
4. Runs a **two-pass aggregation**: first a trip × cell summary to compute the fidelity components (`avg_fidelity_sum`, `n_trips_for_fidelity`), then a cell-level summary for effort totals.
5. Merges with the previously stored grid and uploads the updated version.
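The two-pass aggregation in step 4 can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy pings, the placeholder cell labels standing in for H3 indices, the 1-hour cap `max_gap_hours`, and the convention that the first ping of a trip gets a zero interval are all assumptions made for illustration. The `year` column and the date columns are omitted for brevity.

```r
# Toy fishing-only pings for two trips (hypothetical data).
pings <- data.frame(
  trip = c("A", "A", "A", "A", "B", "B", "B"),
  timestamp = as.POSIXct(
    c("2024-03-01 06:00:00", "2024-03-01 06:10:00", "2024-03-01 06:20:00",
      "2024-03-01 09:20:00", "2024-03-02 05:00:00", "2024-03-02 05:30:00",
      "2024-03-02 06:00:00"),
    tz = "UTC"
  ),
  # Placeholder labels; a real run would carry H3 cell indices here.
  h3_index = c("cell_1", "cell_1", "cell_2", "cell_2",
               "cell_1", "cell_1", "cell_1"),
  stringsAsFactors = FALSE
)

# Assumed cap on inter-ping intervals; the real value is not documented here.
max_gap_hours <- 1

# Per-ping interval: hours since the previous ping of the same trip, capped.
pings <- pings[order(pings$trip, pings$timestamp), ]
gap_hours <- ave(as.numeric(pings$timestamp), pings$trip,
                 FUN = function(x) c(0, diff(x))) / 3600
pings$dt_hours <- pmin(gap_hours, max_gap_hours)

# Pass 1: trip x cell summary, then each trip's share of its own total hours.
trip_cell <- aggregate(dt_hours ~ trip + h3_index, data = pings, FUN = sum)
trip_total <- ave(trip_cell$dt_hours, trip_cell$trip, FUN = sum)
trip_cell$fidelity <- trip_cell$dt_hours / trip_total

# Pass 2: cell-level totals built from the trip x cell table.
cells <- sort(unique(trip_cell$h3_index))
by_cell <- function(x, f) tapply(x, trip_cell$h3_index, f)[cells]
grid <- data.frame(
  h3_index = cells,
  fishing_hours = as.numeric(by_cell(trip_cell$dt_hours, sum)),
  avg_fidelity_sum = as.numeric(by_cell(trip_cell$fidelity, sum)),
  n_trips_for_fidelity = as.integer(by_cell(trip_cell$trip, length)),
  fishing_pings = as.integer(table(pings$h3_index)[cells])
)
print(grid)
```

Fidelity has to be computed at the trip × cell level because it is a per-trip fraction; summing it directly over pings would weight trips by their ping count rather than treating each trip equally.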

## Grid schema

The grid includes a `year` column for temporal effort maps (see [plot_effort_map()]). An all-time aggregate is obtained by summing over `year`. Primary effort columns:

- `fishing_hours`: accumulated fishing time (sum of capped inter-ping intervals). This is the primary effort metric.
- `unique_trips`: count of distinct trips contributing to the cell.
- `n_active_days`: count of distinct calendar days with fishing activity.
- `first_active_date` / `last_active_date`: date range for inferring the study period length (`n_total_days`) downstream.
- `avg_fidelity_sum`: sum of per-trip fidelity values (fraction of each trip's total fishing hours spent in this cell). Divide by `n_trips_for_fidelity` to get `avg_fidelity` ∈ [0, 1].
- `n_trips_for_fidelity`: number of trips contributing to `avg_fidelity_sum`.
- `fishing_pings`: raw GPS point count (retained for QA; not used as a primary metric because ping frequency is irregular).
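As a rough illustration of how these columns combine, the base R sketch below merges a stored grid with a new batch, derives `avg_fidelity`, and collapses `year` into an all-time view. The data, the helper `sum_by()`, and the assumption that the merge simply sums the additive columns are hypothetical. Columns whose merge rules are not described here (`unique_trips`, `n_active_days`, the date columns) are left out.

```r
# Hypothetical stored grid and newly aggregated batch.
stored <- data.frame(
  h3_index = c("cell_1", "cell_1", "cell_2"),
  year = c(2023L, 2024L, 2024L),
  fishing_hours = c(10, 4, 2.5),
  avg_fidelity_sum = c(3.0, 1.5, 0.5),
  n_trips_for_fidelity = c(6L, 2L, 2L)
)
new_batch <- data.frame(
  h3_index = c("cell_1", "cell_3"),
  year = c(2024L, 2024L),
  fishing_hours = c(1, 0.5),
  avg_fidelity_sum = c(0.5, 1.0),
  n_trips_for_fidelity = c(2L, 1L)
)

# Columns assumed to be safely additive across batches and years.
additive <- c("fishing_hours", "avg_fidelity_sum", "n_trips_for_fidelity")

# Hypothetical helper: sum the additive columns within each key combination.
sum_by <- function(df, keys) {
  aggregate(df[additive], by = df[keys], FUN = sum)
}

# Merge: stack both grids, then sum within each cell x year.
merged <- sum_by(rbind(stored, new_batch), c("h3_index", "year"))
merged$avg_fidelity <- merged$avg_fidelity_sum / merged$n_trips_for_fidelity

# All-time view: sum over year, then derive the average again.
all_time <- sum_by(merged, "h3_index")
all_time$avg_fidelity <- all_time$avg_fidelity_sum / all_time$n_trips_for_fidelity
print(all_time)
```

Keeping the sum and the trip count as separate columns is what makes this work: an average cannot be merged exactly across batches, whereas its numerator and denominator can simply be added before dividing.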

**Multi-resolution support:** passing different `h3_res` values writes to separate cloud prefixes (e.g. `predicted-pds-h3_grid_r9`, `predicted-pds-h3_grid_r7`), so grids at multiple resolutions can coexist. Use [rollup_h3_resolution()] to derive coarser views from a stored fine grid, or pass a coarser `h3_res` directly to recompute from raw tracks. [derive_fishing_grounds()] can further roll up to any resolution before extracting contiguous fishing ground polygons.

See also

[predict_pds_tracks()], [derive_fishing_grounds()], [rollup_h3_resolution()], [plot_effort_map()]