Package index
Data Pipeline Workflow
Core functions that execute each step in the data pipeline, from data ingestion through validation, analysis, and export to MongoDB and cloud storage.
- calculate_district_totals() - Calculate District-Level Total Catch and Revenue
- calculate_monthly_trip_stats() - Calculate Monthly Trip Statistics by District
- estimate_fleet_activity() - Estimate Fleet-Wide Activity from Sample Data
- export_wf_data() - Export WorldFish Summary Data to MongoDB
- generate_fleet_analysis() - Generate Complete Fleet Activity Analysis Pipeline
- get_validation_status() - Get Validation Status from KoboToolbox
- ingest_pds_tracks() - Ingest Pelagic Data Systems (PDS) Track Data
- ingest_pds_trips() - Ingest Pelagic Data Systems (PDS) Trip Data
- ingest_surveys() - Ingest WCS and WF Catch Survey Data
- prepare_boat_registry() - Prepare Boat Registry Data from Metadata
- preprocess_ba_surveys() - Pre-process Blue Alliance Surveys
- preprocess_pds_tracks() - Preprocess Pelagic Data Systems (PDS) Track Data
- preprocess_wcs_surveys() - Pre-process Zanzibar WCS Surveys
- preprocess_wf_surveys() - Pre-process and Combine WorldFish Surveys - Both Versions
- process_trip_data() - Process Trip Data with District Information
- summarize_data() - Summarize WorldFish Survey Data
- sync_validation_submissions() - Synchronize Validation Statuses with KoboToolbox
- update_validation_status() - Update Validation Status in KoboToolbox
- validate_ba_surveys() - Validate Blue Alliance (BA) Surveys Data
- validate_wcs_surveys() - Validate WCS Surveys Data
- validate_wf_surveys() - Validate WorldFish (WF) Survey Data
 
Data Ingestion
Functions for pulling data from external sources (KoboToolbox, Pelagic Data Systems) and transforming it into standardized formats.
- get_trip_points() - Get Trip Points from Pelagic Data Systems API
- get_trips() - Retrieve Trip Details from Pelagic Data API
- ingest_pds_tracks() - Ingest Pelagic Data Systems (PDS) Track Data
- ingest_pds_trips() - Ingest Pelagic Data Systems (PDS) Trip Data
- ingest_surveys() - Ingest WCS and WF Catch Survey Data
- retrieve_surveys() - Retrieve Surveys from KoboToolbox
 
Cloud Storage Management
Functions for authenticating to storage backends (Google Cloud Storage, MongoDB) and for uploading, downloading, and managing data files in various formats.
- cloud_object_name() - Retrieve Full Name of Versioned Cloud Object
- cloud_storage_authenticate() - Authenticate to a Cloud Storage Provider
- download_cloud_file() - Download Object from Cloud Storage
- download_parquet_from_cloud() - Download Parquet File from Cloud Storage
- get_metadata() - Get metadata tables
- get_preprocessed_surveys() - Download Preprocessed Surveys
- get_validated_surveys() - Download Validated Surveys
- mdb_collection_pull() - Retrieve Data from MongoDB
- mdb_collection_push() - Upload Data to MongoDB and Overwrite Existing Content
- upload_cloud_file() - Upload File to Cloud Storage
- upload_parquet_to_cloud() - Upload Processed Data to Cloud Storage
 
Data Preprocessing
Functions for cleaning, transforming, and structuring raw data into standardized formats ready for analysis, including data nesting, reshaping, and trip processing.
- calculate_catch() - Calculate Catch Weight from Length-Weight Relationships or Bucket Measurements
- calculate_fishery_metrics() - Calculate Fishery Metrics
- generate_track_summaries() - Generate Grid Summaries for Track Data
- getLWCoeffs() - Get Length-Weight Coefficients and Morphological Data for Species
- get_fao_groups() - Extract and Format FAO Taxonomic Groups
- get_length_weight_batch() - Get Length-Weight and Morphological Parameters for Species (Batch Version)
- get_species_areas_batch() - Get FAO Areas for Species (Batch Version)
- load_taxa_databases() - Load Taxa Data from FishBase and SeaLifeBase
- match_species_from_taxa() - Match Species from Taxa Databases
- prepare_boat_registry() - Prepare Boat Registry Data from Metadata
- preprocess_ba_surveys() - Pre-process Blue Alliance Surveys
- preprocess_pds_tracks() - Preprocess Pelagic Data Systems (PDS) Track Data
- preprocess_track_data() - Preprocess Track Data into Spatial Grid Summary
- preprocess_wcs_surveys() - Pre-process Zanzibar WCS Surveys
- preprocess_wf_surveys() - Pre-process and Combine WorldFish Surveys - Both Versions
- process_species_list() - Process Species List with Taxonomic Information
- process_trip_data() - Process Trip Data with District Information
- reshape_catch_data() - Reshape Catch Data with Length Groupings
- reshape_catch_data_v2() - Reshape Catch Data with Length Groupings - Version 2
- reshape_species_groups() - Reshape Species Groups from Wide to Long Format
 
Data Mining & Summarization
Functions for enriching fisheries data with scientific information, taxonomic classification, biological parameters, and creating summary datasets for analysis.
- calculate_catch() - Calculate Catch Weight from Length-Weight Relationships or Bucket Measurements
- expand_taxa() - Expand Taxonomic Vectors into a Data Frame
- getLWCoeffs() - Get Length-Weight Coefficients and Morphological Data for Species
- get_fao_groups() - Extract and Format FAO Taxonomic Groups
- get_length_weight_batch() - Get Length-Weight and Morphological Parameters for Species (Batch Version)
- get_species_areas_batch() - Get FAO Areas for Species (Batch Version)
- load_taxa_databases() - Load Taxa Data from FishBase and SeaLifeBase
- match_species_from_taxa() - Match Species from Taxa Databases
- process_species_list() - Process Species List with Taxonomic Information
- summarize_data() - Summarize WorldFish Survey Data
 
Data Modeling & Analysis
Functions for statistical modeling, fleet activity estimation, and scaling sample-based GPS data to fleet-wide estimates using boat registry information.
- estimate_fleet_activity() - Estimate Fleet-Wide Activity from Sample Data
- calculate_district_totals() - Calculate District-Level Total Catch and Revenue
- calculate_monthly_trip_stats() - Calculate Monthly Trip Statistics by District
- generate_fleet_analysis() - Generate Complete Fleet Activity Analysis Pipeline
 
Data Validation
Functions for validating fisheries data through quality checks, statistical outlier detection, and applying domain-specific validation rules.
- add_validation_flags() - Add validation flags to catch data
- aggregate_survey_data() - Aggregate survey data and calculate metrics
- calculate_catch_revenue() - Calculate catch revenue from validated data
- extract_trips_info() - Extract trip information from preprocessed surveys
- get_catch_bounds() - Get catch bounds for survey data
- get_length_bounds() - Get length bounds for survey data
- get_validation_status() - Get Validation Status from KoboToolbox
- process_catch_data() - Process catch data from surveys
- sync_validation_submissions() - Synchronize Validation Statuses with KoboToolbox
- update_validation_status() - Update Validation Status in KoboToolbox
- validate_ba_surveys() - Validate Blue Alliance (BA) Surveys Data
- validate_catches() - Validate catches using quality flags
- validate_prices() - Validate market prices
- validate_wcs_surveys() - Validate WCS Surveys Data
- validate_wf_surveys() - Validate WorldFish (WF) Survey Data
 
Data Export & Visualization
Functions for exporting processed data to MongoDB collections, creating geographic visualizations, and preparing data for portals and reporting.
- create_geos() - Generate Geographic Regional Summaries of Fishery Data
- create_geos_v1() - Generate Geographic Regional Summaries of Fishery Data (Version 1)
- export_wf_data() - Export WorldFish Summary Data to MongoDB
- kepler_mapper() - Generate a Kepler.gl map
 
Pipeline Orchestration
High-level functions that orchestrate complete analysis pipelines, combining multiple processing steps into integrated workflows.
- generate_fleet_analysis() - Generate Complete Fleet Activity Analysis Pipeline
- summarize_data() - Summarize WorldFish Survey Data
 
Helper Functions
Utility functions that support the main pipeline operations, providing common data manipulation and processing capabilities.
- add_version() - Add timestamp and sha string to a file name
- read_config() - Read configuration file