Skip to contents

coasts 4.3.0

  • FIX export_pds_spatial() per-cell metrics — avg_hours_per_day and avg_visits_per_day were divided by the whole study period (n_total_days, ~850+ days), producing values near zero. They now divide by n_active_days (the number of days the cell was actually visited), so the metric matches its name: average fishing hours / trips on days the cell was active. constancy (fraction of study period the cell was active) still uses n_total_days and is unchanged. Same fix applied to derive_fishing_grounds() per-cell metrics.
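
In essence, the corrected per-cell arithmetic looks like the base-R sketch below. This is illustrative only: variable names such as `active_dates` and `n_total_days` follow the wording above, not necessarily the package's actual internals.

```r
# Toy per-cell record: a cell visited on 3 distinct days of a ~850-day study period.
active_dates  <- as.Date(c("2024-01-05", "2024-01-09", "2024-02-11"))
fishing_hours <- 12   # total fishing hours recorded in the cell
n_visits      <- 6    # total trips that touched the cell
n_total_days  <- 850  # length of the whole study period

# Count each calendar day once, however many trips visited it.
n_active_days <- length(unique(active_dates))

# Fixed metrics: averages over the days the cell was actually active ...
avg_hours_per_day  <- fishing_hours / n_active_days  # 12 / 3 = 4 h per active day
avg_visits_per_day <- n_visits / n_active_days       # 6 / 3 = 2 trips per active day

# ... while constancy still uses the full study period, as before.
constancy <- n_active_days / n_total_days
```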

  • FIX aggregate_pds_effort() - n_active_days was double-counted on incremental merge whenever the same calendar day was visited by trips from different aggregation runs (common: many boats fishing the same cell daily produce one parquet per trip, batched separately). The grid now stores active_dates as a list-column of Date per cell-year; merges take the unique union, and n_active_days is recomputed from it. derive_fishing_grounds() applies the same union semantics when collapsing years, rolling up to coarser resolutions, and aggregating cells into ground polygons.

coasts 4.2.1

  • IMPROVEMENT Filter out NA entries from the countries taxa summary (in export_portal()) to reduce storage use and loading time

coasts 4.2.0

  • FIX Critical bug in downloading versioned files

coasts 4.1.0

  • IMPROVEMENT Kenya matched trips now combine surveys from all Kenyan sources, not just KEFS — giving a more complete picture of fishing activity in the country.
  • FIX Restored the fishing-effort aggregation step of the automated pipeline, which had stopped running on the server due to a missing system component.

coasts 4.0.0

Spatial CPUE Model Pipeline

  • NEW model_cpue() - Estimates spatial Catch Per Unit Effort (CPUE) by joining matched survey trips with predicted PDS tracks. Supports two estimation methods: "weighted" (direct catch-to-effort ratio, robust for sparse data) and "nnls" (non-negative least squares, for denser datasets). Uploads results as a versioned parquet to cloud storage.
  • NEW run_weighted_cpue() - Computes CPUE as sum(catch_kg) / sum(fishing_hours) per H3 cell and country.
  • NEW run_nnls_cpue() - Solves a non-negative least squares system min ||Xq - y||² s.t. q ≥ 0 across all H3 cells simultaneously.
  • NEW join_effort_catch() - Builds the effort-catch matrix linking per-trip H3 effort vectors with catch records.
  • NEW load_matched_trips() - Downloads the trips-matched parquet and returns validated catch records for matched PDS trips.
  • NEW download_predicted_tracks() - Downloads predicted track files for a set of matched trip IDs from the PDS bucket.
  • NEW prepare_tracks_for_effort() - Projects predicted fishing points into an H3 effort matrix (fishing hours and pings per cell).
  • NEW get_combined_tbl() - Combines effort and catch into a single analysis table for CPUE modelling.
  • NEW build_catch_wide() - Pivots catch records to a wide matrix (trips × species) for the NNLS solver.
  • NEW .finalise_cpue() - Post-processes raw CPUE estimates: adds centroid coordinates, filters cells below min_trips, and attaches country labels.
  • NEW .top_species() - Selects the top-N species by total catch weight to focus CPUE estimation.
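
The "weighted" estimator reduces to a catch-to-effort ratio per cell, roughly as in this base-R sketch (cell IDs and column names are made up for illustration and are not the package's internals). The "nnls" method instead solves min ||Xq - y||² subject to q ≥ 0, e.g. via the nnls package's Lawson-Hanson solver.

```r
# Per-trip effort and catch already allocated to H3 cells (toy data).
trips <- data.frame(
  h3_cell       = c("cellA", "cellA", "cellB"),
  catch_kg      = c(10, 20, 9),
  fishing_hours = c(2, 3, 3)
)

# Weighted CPUE: total catch over total effort per cell --
# robust when individual trips carry few observations.
cpue <- aggregate(cbind(catch_kg, fishing_hours) ~ h3_cell, data = trips, FUN = sum)
cpue$cpue_kg_per_hour <- cpue$catch_kg / cpue$fishing_hours
# cellA: 30 kg / 5 h = 6 kg/h;  cellB: 9 kg / 3 h = 3 kg/h
```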

Web-Ready Spatial Export

  • NEW export_pds_spatial() - Reads H3 effort grid and CPUE parquet files from cloud storage, derives fishing grounds, and uploads three web-ready files for the DeckGL portal: H3 effort JSON, CPUE JSON, and fishing grounds GeoJSON.

  • NEW derive_fishing_grounds() - Converts an H3 effort grid to a GeoJSON FeatureCollection of discrete fishing ground polygons, enriched with area, constancy, and activity metrics.

  • NEW aggregate_trip_effort() - Aggregates per-trip H3 effort vectors into a cumulative effort grid across all trips.

  • NEW plot_effort_map() / plot_cpue_map() - Interactive Leaflet maps for visualising effort and CPUE grids during exploratory analysis.

Taxa Enrichment

  • NEW enrich_taxa() - Augments catch records with FishBase and SeaLifeBase taxonomic backbone data (class, order, family, genus) for all species in the matched trips dataset.

  • NEW get_taxa_backbone() - Queries the GBIF taxonomic backbone to resolve species names to canonical taxonomy.

  • NEW expand_taxonomic_info() - Expands the taxa lookup table with full higher classification.

Bug Fixes

  • FIX aggregate_pds_effort() - Manifest was silently uploaded to a temp-dir GCS path instead of the correct {grid_prefix}/aggregated_manifest.rds key, causing incremental processing to always rebuild the entire grid from scratch. Fixed by passing name = manifest_name explicitly to upload_cloud_file() in both the main and early-return paths.
  • FIX model_cpue() - Removed dead code left from an earlier refactor (map_effort, map_cpue, out_dir block) that caused an R error at runtime: “object ‘map_effort’ not found”.
  • FIX export_pds_spatial() - No longer crashes with a cryptic 404 when the effort grid parquet does not yet exist in GCS (e.g. first run or after manual deletion). The function now logs a warning and returns early, matching the existing behaviour for the CPUE file.
  • FIX HTTP/2 PROTOCOL_ERROR failures on GCS uploads in CI — upload_cloud_file() now calls cloud_storage_authenticate(force = TRUE) unconditionally before every upload. Service-account tokens expire after 1 hour; long upstream jobs (e.g. predict_pds_tracks) can exhaust this window, causing gargle (which uses httr2) to attempt a mid-flight token refresh over a stale HTTP/2 connection. Forcing fresh re-auth before the upload avoids this path entirely.

CI / Workflow

  • Merged predict-pds-tracks and aggregate-pds-effort pipeline jobs into a single job — they are always sequential and sharing a container saves startup overhead.
  • Deleted superseded model-tracks.yaml workflow (its steps are fully covered by data-pipeline.yaml).
  • Fixed pkgdown.yaml deploy step: added required environment: name: github-pages block (needed by actions/deploy-pages@v4); bumped actions/upload-pages-artifact to @v4 (native Node 24 support); added Changelog to pkgdown navbar.

Naming & Versioning Coherence

  • CPUE parquet files are now stored under pds-cpue_r{h3_res} (e.g. pds-cpue_r9) to match the effort grid naming convention (predicted-pds-h3_grid_r9). This ensures that running the pipeline at different H3 resolutions never silently mixes effort and CPUE data from different resolutions.
  • Portal CPUE JSON files follow the same pattern: pds-cpue-r{h3_res}__timestamp__json.
  • inst/conf.yml portal.cpue.file_prefix updated from pds-cpue to pds-cpue-r.
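
The resolution-tagged keys can be built with a simple sprintf(), so effort and CPUE artifacts always carry the same r{h3_res} suffix. The prefixes below mirror the convention described above; the guard is a sketch, not the package's actual check.

```r
h3_res <- 9
cpue_prefix   <- sprintf("pds-cpue_r%d", h3_res)
effort_prefix <- sprintf("predicted-pds-h3_grid_r%d", h3_res)

# A cheap guard against silently mixing resolutions:
# both keys must end in the same resolution tag.
same_res <- endsWith(cpue_prefix, sprintf("_r%d", h3_res)) &&
  endsWith(effort_prefix, sprintf("_r%d", h3_res))
```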

Documentation & Website

  • Vignettes (pipeline.Rmd, metrics-and-models.Rmd) moved from project root to vignettes/ so pkgdown can discover them correctly.
  • pkgdown CI workflow (pkgdown.yaml) fixed: system dependencies (GDAL, GEOS, PROJ, udunits2) now installed before r-lib/actions/setup-r-dependencies@v2.
  • _pkgdown.yml articles section re-enabled now that vignettes are in the correct location.

coasts 3.0.1

Align export functions with the countries API schema

coasts 3.0.0

Fishing Activity Prediction Pipeline

A new end-to-end pipeline for classifying GPS boat tracks into fishing and non-fishing activity using the ssfaitk statistical model, and aggregating the results into spatial effort maps.

New Workflow Functions

  • NEW predict_pds_tracks() - Downloads GPS tracks for all active vessels, applies the ssfaitk fishing activity model to each trip, and uploads fishing-only point files to cloud storage. Implements version-aware incremental processing: trips already classified with the current model version are skipped, and files from outdated model versions are automatically replaced when the model is updated.

  • NEW aggregate_pds_effort() - Consolidates all classified fishing tracks into a single H3 hexagonal grid representing cumulative fishing effort across the fleet. Counts fishing pings and unique trips per cell and uploads the grid as a versioned parquet file ready for portal consumption.
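
Conceptually, the grid consolidation is a group-by over classified fishing points, as in this toy base-R sketch (cell IDs and column names are illustrative, not the function's internals):

```r
# Classified fishing pings, each already assigned to an H3 cell and a trip.
pings <- data.frame(
  h3_cell = c("cellA", "cellA", "cellA", "cellB"),
  trip_id = c("t1",    "t1",    "t2",    "t3")
)

# Cumulative effort grid: fishing pings and unique trips per cell.
grid <- aggregate(trip_id ~ h3_cell, data = pings,
                  FUN = function(x) c(n_pings = length(x),
                                      n_trips = length(unique(x))))
grid <- do.call(data.frame, grid)  # flatten the matrix column
names(grid) <- c("h3_cell", "n_pings", "n_trips")
```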

New Spatial Analysis Utilities

Automated Pipeline

  • NEW GitHub Actions workflow (model-tracks.yaml) - Runs the full fishing activity prediction and effort aggregation pipeline every two days. Always fetches the latest ssfaitk model version at runtime, so improvements to the underlying model are picked up automatically without rebuilding the Docker image.

Infrastructure

  • ENHANCED Docker image - Added Python environment support required by the ssfaitk model, including automatic Python path configuration for reticulate

coasts 2.2.7

  • IMPROVEMENT Use scientific names rather than FAO alpha3 codes for dashboard data

coasts 2.2.6

  • FIX Update PDS ingestion and preprocessing according to new config paths

coasts 2.2.5

coasts 2.2.4

coasts 2.2.3

  • FIX Refine PDS ingestion functions to improve compatibility with country pipelines

coasts 2.2.2

  • NEW Upgrade and optimize all functions related to PDS ingestion and preprocessing to be compatible with the current country pipelines. Country data flows now follow the same data-processing steps, improving manageability and data consistency

coasts 2.2.1

Minor fix

Add “version” argument to download_parquet_from_cloud()

coasts 2.2.0

New Features

  • NEW Upgrade and optimize all functions related to storage (Google Cloud and MongoDB authentication, download, and upload). These replace the existing storage functions across all country pipelines for improved manageability and centralization of common processes

coasts 2.1.0

New Features

  • NEW get_kobo_data() - upgraded function to pull data from KoboToolbox following Kobo API changes. The new function replaces the existing data-pulling process across all pipelines for improved manageability and centralization of common processes

  • FIX Automatic generation of credentials for the Peskas Tracks App

coasts 2.0.0

Refactoring

Optimize the package to export and process data from country pipelines

coasts 1.5.0

New Features

Survey & Fleet Analysis Pipeline

  • NEW summarize_data() - End-to-end summarization of WorldFish survey data into five output tables (monthly, taxa, district, gear, grid summaries) uploaded to cloud storage as versioned parquet files
  • NEW calculate_fishery_metrics() - Transforms catch-level records into normalized fishery indicators (site-level CPUE/RPUE, predominant gear, species composition) in long format for portal consumption
  • NEW generate_fleet_analysis() - Orchestrates full fleet activity estimation pipeline and uploads aggregated results to cloud storage
  • NEW prepare_boat_registry() - Constructs a boat registry from asset metadata for scaling GPS-tracked data to fleet-wide estimates
  • NEW process_trip_data() - Processes PDS API trip records by device IMEI into per-trip summaries
  • NEW calculate_monthly_trip_stats() - Aggregates trip data to monthly statistics per district
  • NEW estimate_fleet_activity() - Scales GPS-sampled trips to fleet-wide activity estimates using boat registry sampling rates
  • NEW calculate_district_totals() - Joins fleet estimates with survey summaries to produce district-level catch and revenue totals
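
The scaling step in estimate_fleet_activity() amounts to weighting tracked activity by the inverse sampling rate from the boat registry, along these lines (toy data; column names are illustrative, not the package's schema):

```r
# Boat registry: total boats per district vs. boats carrying GPS trackers.
registry <- data.frame(
  district  = c("north", "south"),
  n_boats   = c(100, 40),
  n_tracked = c(20, 10)
)
registry$sampling_rate <- registry$n_tracked / registry$n_boats

# Monthly trips observed among tracked boats only.
observed <- data.frame(district = c("north", "south"), trips_tracked = c(60, 25))

# Fleet-wide estimate: divide observed trips by the sampling rate.
est <- merge(observed, registry[, c("district", "sampling_rate")], by = "district")
est$trips_fleet <- est$trips_tracked / est$sampling_rate
# north: 60 / 0.20 = 300 trips; south: 25 / 0.25 = 100 trips
```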

Data Export

  • NEW export_portal() - Downloads WorldFish summary datasets from cloud storage, joins modelled aggregate estimates, pivots to long format, and uploads all tables to MongoDB portal collections

Enhancements

Multi-Package Architecture

Automated Workflows

  • ENHANCED app-usage-report.yaml, sync-devices-users.yaml, tracks-backup.yaml - All jobs now carry an explicit if: github.ref == 'refs/heads/main' guard, ensuring workflows triggered via workflow_dispatch on non-main branches are safely skipped

Package Infrastructure

  • ENHANCED DESCRIPTION - Migrated Author/Maintainer fields to Authors@R: person(...) format (fixes R CMD check WARNING); removed spurious LazyData: true (no data/ directory); added URL and BugReports fields pointing to the GitHub repository
  • ENHANCED _pkgdown.yml - Added new “Survey & Fleet Analysis” reference section; added export_portal() to “Data Export & Storage”; removed two non-exported internal helpers that would have caused build errors

coasts 1.4.0

New Features

GPS-Survey Trip Matching

  • NEW merge_survey_trips() - Downloads matched GPS and survey trip data across regions, harmonizes columns, and combines into a single dataset

Automated Workflows

  • ENHANCED GitHub Actions data pipeline with new match-trips job

Multi-Bucket Regional Storage

  • ENHANCED download_parquet_from_cloud() and upload_parquet_to_cloud() - Added bucket_name parameter to download/upload from regional buckets (Kenya, Mozambique, Zanzibar)
  • ENHANCED inst/conf.yml - Regional bucket configuration with environment-specific bucket names (dev vs prod)

Code Organization

  • Refactored PDS ingestion and API functions into dedicated files (R/ingestion-pds.R, R/pds-api.R)
  • Updated pkgdown reference index with tracks app and preprocessing sections

coasts 1.3.0

Fisher Performance Analytics

  • NEW export_fishers_stats() - Comprehensive fisher performance analysis and export
    • Integrates catch events from tracks-app with GPS tracking data from PDS API
    • Matches fisher-reported landings with automated trip tracking by date and device
    • Calculates fishing efficiency metrics: CPUE (kg/hour, kg/km), search efficiency ratios
    • Estimates fuel consumption and catch per liter efficiency
    • Categorizes trips by distance (nearshore, mid-range, offshore)
    • Exports aggregated fisher statistics and trip-level performance metrics to MongoDB
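
The efficiency metrics listed above boil down to simple ratios plus a distance binning, e.g. as below. The distance thresholds here are made up for illustration; the package's actual cutoffs are not documented in this entry.

```r
# One matched trip: reported landing joined to its GPS track summary.
trip <- data.frame(catch_kg = 15, hours = 5, distance_km = 12)

# Core efficiency ratios.
trip$cpue_kg_per_hour <- trip$catch_kg / trip$hours        # 15 / 5  = 3 kg/h
trip$cpue_kg_per_km   <- trip$catch_kg / trip$distance_km  # 15 / 12 = 1.25 kg/km

# Hypothetical distance categories (thresholds are configuration-specific).
trip$range <- cut(trip$distance_km,
                  breaks = c(0, 5, 20, Inf),
                  labels = c("nearshore", "mid-range", "offshore"))
```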

Automated Workflows

  • ENHANCED GitHub Actions data pipeline workflow
    • Added export-fishers-stats job to automated pipeline
    • Runs after track preprocessing to ensure data availability
    • Automatically exports fisher performance data on every pipeline run

Development Experience

  • NEW .Rprofile - Interactive environment switching for local development
    • Added helper functions: use_prod(), use_local(), use_default()
    • Visual environment indicator on R session startup
    • Quick commands reference displayed in interactive sessions
    • Simplified testing across different configuration profiles

Configuration Updates

MongoDB Collections

  • ENHANCED tracks-app MongoDB configuration in inst/conf.yml
    • Added fishers-stats collection for aggregated fisher summaries
    • Added fishers-performance collection for trip-level efficiency metrics
    • Improved data organization for analytics and reporting

coasts 1.2.0

Breaking Changes

Configuration System Migration

  • BREAKING CHANGE - Migrated from auth folder to .env-based credentials management
    • Removed local configuration profile from inst/conf.yml
    • All environments now use environment variables loaded via .env file in local development
    • Added dotenv package dependency for automatic .env file loading
    • Created .env.example template with all required environment variables
    • Updated .gitignore to properly handle .env files while tracking .env.example
    • Migration Guide: Copy .env.example to .env and fill in credentials (see updated README)

New Features

Asset Management

  • ENHANCED ingest_assets() - Comprehensive fisheries asset metadata ingestion
    • Added log_threshold parameter for configurable logging
    • Now includes PDS device metadata from Airtable (pds_devices table)
    • Retrieves 6 asset types: taxa, gear, vessels, landing sites, forms, and devices
    • Changed output format from parquet to RDS for better R object serialization
    • Added complete roxygen documentation following package standards

Data Ingestion Improvements

  • ENHANCED ingest_pds_trips() - Improved trip data ingestion workflow
    • Now downloads device metadata from cloud storage instead of Google Sheets
    • Filters devices by last_seen date (>= 2023-01-01) for active devices only
    • Enhanced PDS API calls with deviceInfo and withLastSeen parameters
    • Client-side IMEI filtering for reliable data retrieval
    • Updated documentation with detailed configuration examples and notes
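
The active-device filter described above is essentially a date comparison plus client-side IMEI matching, e.g. (toy data; field names are illustrative):

```r
# Device metadata downloaded from cloud storage.
devices <- data.frame(
  imei      = c("111", "222", "333"),
  last_seen = as.Date(c("2022-06-01", "2023-03-15", "2024-01-02"))
)

# Keep only devices seen since the start of 2023.
active <- devices[devices$last_seen >= as.Date("2023-01-01"), ]

# Client-side IMEI filtering of trips returned by the PDS API.
trips <- data.frame(imei = c("111", "222", "333", "333"))
trips_active <- trips[trips$imei %in% active$imei, , drop = FALSE]
```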

Documentation

Package Documentation

  • ENHANCED README with .env-based configuration instructions
    • Added step-by-step local development setup guide
    • Documented all required environment variables with descriptions
    • Separated local and production deployment instructions
  • UPDATED CLAUDE.md with new configuration system details
    • Revised Configuration System section to explain .env approach
    • Updated Configuration Requirements with clear setup steps
    • Added dotenv to Key Dependencies section
  • NEW .env.example - Template file for local development credentials
    • Includes all 12 required environment variables with helpful comments
    • Proper formatting examples for complex values (JSON keys, connection strings)

Function Documentation

Technical Improvements

Configuration Loading

  • ENHANCED read_config() function in R/utils.R
    • Automatic detection and loading of .env file if present
    • Seamless integration with existing config::get() workflow
    • Informative logging when .env file is loaded
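
A minimal base-R version of that behaviour might look like the sketch below. The real implementation reportedly uses the dotenv package; load_env() here is a hypothetical helper, not the package's read_config().

```r
# Hypothetical sketch: load KEY=value pairs from a .env file into the session.
load_env <- function(path = ".env") {
  if (!file.exists(path)) return(invisible(FALSE))
  lines <- readLines(path, warn = FALSE)
  # Drop blanks, comments, and lines without a KEY=value separator.
  lines <- lines[nzchar(trimws(lines)) &
                   !startsWith(trimws(lines), "#") &
                   grepl("=", lines, fixed = TRUE)]
  # Split each line at the first "=" only, so values may contain "=".
  kv <- regmatches(lines, regexpr("=", lines), invert = TRUE)
  for (pair in kv) do.call(Sys.setenv, setNames(list(pair[2]), pair[1]))
  message("Loaded ", length(kv), " variables from ", path)
  invisible(TRUE)
}
```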

Development Experience

  • Simplified credentials management for local development
  • Consistent approach with other peskas packages (e.g., peskas.kenya.data.pipeline)
  • Improved security with proper .gitignore configuration
  • Easier onboarding for new developers with template file

coasts 1.1.0

  • NEW - Integrate (Beta) Cabo Delgado (Mozambique) estimates
  • NEW - Developing code to integrate catch-event records from tracks-app

coasts 1.0.0

Major New Features

Airtable Integration System

Enhanced PDS API Integration

Automated Workflows

  • NEW GitHub Actions workflow: ingest-pelagic-boats.yaml (runs every 15 days)
  • NEW GitHub Actions workflow: sync-device-users.yaml (runs every 10 days)
  • Enhanced main data pipeline workflow with improved container management

Configuration System Improvements

  • BREAKING CHANGE Restructured MongoDB configuration to support dual databases:
    • mongodb.coasts_portal - For main coasts geospatial data
    • mongodb.tracks_app - For tracks application user data
  • BREAKING CHANGE Enhanced Airtable configuration with separate base IDs:
    • airtable.frame - For device and country metadata
    • airtable.tracks_app - For user management
  • Updated environment variable requirements for production deployments

Documentation and Development

  • NEW Professional pkgdown website with enhanced theming and navigation
  • Enhanced README with status badges and improved structure
  • Fixed pkgdown configuration issues with pipe operators and tidy evaluation functions
  • Updated function documentation with detailed examples and use cases

Bug Fixes and Improvements

Data Processing

  • Fixed KES to USD conversion units in export_geos()
  • Improved MongoDB collection references to use new dual-database configuration
  • Enhanced error handling in data ingestion functions
  • Better logging and progress tracking across all functions

API and Authentication

  • Robust token refresh mechanisms for long-running processes
  • Improved error messages for authentication failures
  • Server-side filtering for PDS API calls to reduce data transfer

Workflow and Deployment

  • Streamlined Docker image build process with better caching
  • Enhanced GitHub Actions workflows with proper credential management
  • Improved container registry integration

Technical Improvements

  • Password generation system for new users with reproducible seeding
  • Comprehensive data validation and duplicate handling
  • Enhanced country mapping for global fisheries data (13 countries supported)
  • Improved spatial data processing with WGS84 coordinate system standardization
  • Advanced MongoDB operations with geospatial indexing (2dsphere)

Geographic Coverage Expansion

  • Enhanced support for multi-country deployments
  • Improved regional data harmonization
  • Currency conversion support for multiple regions (KES, TZS to USD)

coasts 0.1.0

  • Initial release of the coastal fisheries data pipeline for Western Indian Ocean region.

New Features

Data Ingestion

  • ingest_pds_trips() - Automated ingestion of GPS boat trip data from Pelagic Data Systems (PDS) API
  • ingest_pds_tracks() - Parallel processing of detailed GPS track data with batch processing capabilities
  • get_metadata() - Retrieval of fishery metadata from Google Sheets

Data Preprocessing

  • preprocess_pds_tracks() - Spatial gridding and summarization of fishing activity patterns
  • Multi-scale spatial analysis support (100m, 250m, 500m, 1000m grid cells)
  • Parallel processing for efficient handling of large datasets
  • preprocess_track_data() - Core function for converting GPS tracks to spatial grid summaries

Data Export and Storage

  • export_geos() - Comprehensive export of geospatial data and regional metrics to MongoDB
  • MongoDB integration with 2dsphere geospatial indexing
  • Currency conversion for Kenya (KES to USD) and Zanzibar (TZS to USD) economic indicators
  • Support for regional boundary data and time series metrics

Cloud Storage Integration

Database Operations

API Integration

  • get_trips() - PDS API integration for trip data retrieval
  • get_trip_points() - Detailed GPS point data from PDS API
  • Authentication and token management for external APIs

Automation and Workflow

  • GitHub Actions workflow for automated data pipeline execution
  • Runs every 2 days with complete data processing pipeline
  • Docker containerization for reproducible execution environment
  • Configuration management through conf.yml files

Geographic Coverage

  • Kenya coastal fisheries data processing
  • Zanzibar fisheries data integration
  • Regional harmonization and standardization

Technical Features

  • Parallel processing using future and furrr packages
  • Efficient data formats using Apache Arrow/Parquet
  • Comprehensive logging with configurable thresholds
  • Error handling and recovery mechanisms
  • Versioned data management system