Package index
Data Pipeline Workflow
Core functions that execute each step in the data pipeline, from data ingestion through validation, analysis, and export to MongoDB and cloud storage.
-
calculate_district_totals() - Calculate District-Level Total Catch and Revenue
-
calculate_monthly_trip_stats() - Calculate Monthly Trip Statistics by District
-
estimate_fleet_activity() - Estimate Fleet-Wide Activity from Sample Data
-
export_wf_data() - Export WorldFish Summary Data to MongoDB
-
generate_fleet_analysis() - Generate Complete Fleet Activity Analysis Pipeline
-
get_validation_status() - Get Validation Status from KoboToolbox
-
ingest_pds_tracks() - Ingest Pelagic Data Systems (PDS) Track Data
-
ingest_pds_trips() - Ingest Pelagic Data Systems (PDS) Trip Data
-
ingest_surveys() - Ingest WCS and WF Catch Survey Data
-
prepare_boat_registry() - Prepare Boat Registry Data from Metadata
-
preprocess_ba_surveys() - Pre-process Blue Alliance Surveys
-
preprocess_pds_tracks() - Preprocess Pelagic Data Systems (PDS) Track Data
-
preprocess_wcs_surveys() - Pre-process Zanzibar WCS Surveys
-
preprocess_wf_surveys() - Pre-process and Combine WorldFish Surveys - Both Versions
-
process_trip_data() - Process Trip Data with District Information
-
summarize_data() - Summarize WorldFish Survey Data
-
sync_validation_submissions() - Synchronize Validation Statuses with KoboToolbox
-
update_validation_status() - Update Validation Status in KoboToolbox
-
validate_ba_surveys() - Validate Blue Alliance (BA) Surveys Data
-
validate_wcs_surveys() - Validate WCS Surveys Data
-
validate_wf_surveys() - Validate WorldFish (WF) Survey Data
Data Ingestion
Functions for pulling data from external sources (KoboToolbox, Pelagic Data Systems) and transforming it into standardized formats.
-
get_trip_points() - Get Trip Points from Pelagic Data Systems API
-
get_trips() - Retrieve Trip Details from Pelagic Data API
-
ingest_pds_tracks() - Ingest Pelagic Data Systems (PDS) Track Data
-
ingest_pds_trips() - Ingest Pelagic Data Systems (PDS) Trip Data
-
ingest_surveys() - Ingest WCS and WF Catch Survey Data
-
retrieve_surveys() - Retrieve Surveys from KoboToolbox
Cloud Storage Management
Functions for interacting with Google Cloud Storage and MongoDB: authenticating, uploading and downloading objects, and managing data files in various formats.
-
cloud_object_name() - Retrieve Full Name of Versioned Cloud Object
-
cloud_storage_authenticate() - Authenticate to a Cloud Storage Provider
-
download_cloud_file() - Download Object from Cloud Storage
-
download_parquet_from_cloud() - Download Parquet File from Cloud Storage
-
get_metadata() - Get metadata tables
-
get_preprocessed_surveys() - Download Preprocessed Surveys
-
get_validated_surveys() - Download Validated Surveys
-
mdb_collection_pull() - Retrieve Data from MongoDB
-
mdb_collection_push() - Upload Data to MongoDB and Overwrite Existing Content
-
upload_cloud_file() - Upload File to Cloud Storage
-
upload_parquet_to_cloud() - Upload Processed Data to Cloud Storage
Data Preprocessing
Functions for cleaning, transforming, and structuring raw data into standardized formats ready for analysis, including data nesting, reshaping, and trip processing.
-
calculate_catch() - Calculate Catch Weight from Length-Weight Relationships or Bucket Measurements
-
calculate_fishery_metrics() - Calculate Fishery Metrics
-
generate_track_summaries() - Generate Grid Summaries for Track Data
-
getLWCoeffs() - Get Length-Weight Coefficients and Morphological Data for Species
-
get_fao_groups() - Extract and Format FAO Taxonomic Groups
-
get_length_weight_batch() - Get Length-Weight and Morphological Parameters for Species (Batch Version)
-
get_species_areas_batch() - Get FAO Areas for Species (Batch Version)
-
load_taxa_databases() - Load Taxa Data from FishBase and SeaLifeBase
-
match_species_from_taxa() - Match Species from Taxa Databases
-
prepare_boat_registry() - Prepare Boat Registry Data from Metadata
-
preprocess_ba_surveys() - Pre-process Blue Alliance Surveys
-
preprocess_pds_tracks() - Preprocess Pelagic Data Systems (PDS) Track Data
-
preprocess_track_data() - Preprocess Track Data into Spatial Grid Summary
-
preprocess_wcs_surveys() - Pre-process Zanzibar WCS Surveys
-
preprocess_wf_surveys() - Pre-process and Combine WorldFish Surveys - Both Versions
-
process_species_list() - Process Species List with Taxonomic Information
-
process_trip_data() - Process Trip Data with District Information
-
reshape_catch_data() - Reshape Catch Data with Length Groupings
-
reshape_catch_data_v2() - Reshape Catch Data with Length Groupings - Version 2
-
reshape_species_groups() - Reshape Species Groups from Wide to Long Format
Data Mining & Summarization
Functions for enriching fisheries data with scientific information, taxonomic classification, biological parameters, and creating summary datasets for analysis.
-
calculate_catch() - Calculate Catch Weight from Length-Weight Relationships or Bucket Measurements
-
expand_taxa() - Expand Taxonomic Vectors into a Data Frame
-
getLWCoeffs() - Get Length-Weight Coefficients and Morphological Data for Species
-
get_fao_groups() - Extract and Format FAO Taxonomic Groups
-
get_length_weight_batch() - Get Length-Weight and Morphological Parameters for Species (Batch Version)
-
get_species_areas_batch() - Get FAO Areas for Species (Batch Version)
-
load_taxa_databases() - Load Taxa Data from FishBase and SeaLifeBase
-
match_species_from_taxa() - Match Species from Taxa Databases
-
process_species_list() - Process Species List with Taxonomic Information
-
summarize_data() - Summarize WorldFish Survey Data
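Several entries above deal with length-weight relationships. As a rough illustration of the idea only, the sketch below applies the conventional allometric form W = a * L^b (weight in grams, length in centimetres) to turn a fish count at a given length into a catch weight. It is written in Python for brevity and is not the package's implementation; the function name `catch_weight_kg` and the coefficient values are hypothetical.

```python
def catch_weight_kg(length_cm: float, n_fish: int, a: float, b: float) -> float:
    """Estimate catch weight in kg for n_fish individuals of one length.

    Uses the allometric relationship W = a * L**b, with W in grams and
    L in centimetres. The coefficients a and b are species-specific and
    must be supplied by the caller; none are hard-coded here.
    """
    if length_cm <= 0 or n_fish < 0:
        raise ValueError("length must be positive and count non-negative")
    weight_per_fish_g = a * length_cm ** b
    return n_fish * weight_per_fish_g / 1000.0


# Hypothetical coefficients chosen only to keep the arithmetic easy to follow.
estimate = catch_weight_kg(length_cm=20.0, n_fish=10, a=0.01, b=3.0)
```

With these made-up coefficients each 20 cm fish weighs about 80 g, so ten fish come to roughly 0.8 kg.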
Data Modeling & Analysis
Functions for statistical modeling, fleet activity estimation, and scaling sample-based GPS data to fleet-wide estimates using boat registry information.
-
estimate_fleet_activity() - Estimate Fleet-Wide Activity from Sample Data
-
calculate_district_totals() - Calculate District-Level Total Catch and Revenue
-
calculate_monthly_trip_stats() - Calculate Monthly Trip Statistics by District
-
generate_fleet_analysis() - Generate Complete Fleet Activity Analysis Pipeline
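The scaling step described for this group can be pictured with a minimal sketch: take the mean number of trips per GPS-tracked boat and multiply it by the number of boats in the registry. This is an assumption about the general approach, written in Python for illustration; the function `estimate_fleet_trips` and its inputs are hypothetical and do not reflect the signature of `estimate_fleet_activity()`.

```python
from statistics import mean


def estimate_fleet_trips(sample_trips_per_boat: list[int], registered_boats: int) -> float:
    """Scale trips observed on sampled boats up to the whole fleet.

    sample_trips_per_boat: monthly trip counts, one entry per tracked boat.
    registered_boats: total boats in the registry for the same district.
    Assumes tracked boats are representative of the registered fleet.
    """
    if not sample_trips_per_boat:
        raise ValueError("at least one sampled boat is required")
    if registered_boats < 0:
        raise ValueError("registered_boats must be non-negative")
    return mean(sample_trips_per_boat) * registered_boats


# Three tracked boats averaging 12 trips, scaled to a registry of 200 boats.
fleet_trips = estimate_fleet_trips([10, 14, 12], registered_boats=200)
```

The estimate is only as good as the representativeness assumption: if tracked boats fish more often than the rest of the fleet, the fleet-wide figure will be biased upward.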
Data Validation
Functions for validating fisheries data through quality checks, statistical outlier detection, and applying domain-specific validation rules.
-
add_validation_flags() - Add validation flags to catch data
-
aggregate_survey_data() - Aggregate survey data and calculate metrics
-
calculate_catch_revenue() - Calculate catch revenue from validated data
-
extract_trips_info() - Extract trip information from preprocessed surveys
-
get_catch_bounds() - Get catch bounds for survey data
-
get_length_bounds() - Get length bounds for survey data
-
get_validation_status() - Get Validation Status from KoboToolbox
-
process_catch_data() - Process catch data from surveys
-
sync_validation_submissions() - Synchronize Validation Statuses with KoboToolbox
-
update_validation_status() - Update Validation Status in KoboToolbox
-
validate_ba_surveys() - Validate Blue Alliance (BA) Surveys Data
-
validate_catches() - Validate catches using quality flags
-
validate_prices() - Validate market prices
-
validate_wcs_surveys() - Validate WCS Surveys Data
-
validate_wf_surveys() - Validate WorldFish (WF) Survey Data
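To make "statistical outlier detection" concrete, here is one common way to derive an upper bound for catch values and flag records above it: the median plus k scaled median absolute deviations (MAD). This is a generic technique shown in Python for illustration; the index does not say which method the package uses, and the names `upper_bound_mad` and `flag_outliers` are hypothetical.

```python
from statistics import median

# Scale factor that makes the MAD comparable to a standard deviation
# under a normal distribution.
MAD_SCALE = 1.4826


def upper_bound_mad(values: list[float], k: float = 3.0) -> float:
    """Return median + k * scaled MAD as an upper plausibility bound."""
    if not values:
        raise ValueError("values must not be empty")
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med + k * MAD_SCALE * mad


def flag_outliers(values: list[float], k: float = 3.0) -> list[bool]:
    """Flag each value that exceeds the MAD-based upper bound."""
    bound = upper_bound_mad(values, k)
    return [v > bound for v in values]


# Made-up catch weights (kg); the last record is implausibly large.
catches = [10, 12, 11, 13, 12, 95]
flags = flag_outliers(catches)
```

A median-based bound is used here because a single extreme record barely shifts it, whereas a mean and standard deviation would be pulled toward the outlier.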
Data Export & Visualization
Functions for exporting processed data to MongoDB collections, creating geographic visualizations, and preparing data for portals and reporting.
-
create_geos() - Generate Geographic Regional Summaries of Fishery Data
-
create_geos_v1() - Generate Geographic Regional Summaries of Fishery Data (Version 1)
-
export_wf_data() - Export WorldFish Summary Data to MongoDB
-
kepler_mapper() - Generate a Kepler.gl map
Pipeline Orchestration
High-level functions that orchestrate complete analysis pipelines, combining multiple processing steps into integrated workflows.
-
generate_fleet_analysis() - Generate Complete Fleet Activity Analysis Pipeline
-
summarize_data() - Summarize WorldFish Survey Data
Helper Functions
Utility functions that support the main pipeline operations, providing common data manipulation and processing capabilities.
-
add_version() - Add timestamp and sha string to a file name
-
read_config() - Read configuration file
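As a rough picture of what versioning a file name involves, the sketch below inserts a UTC timestamp and a short commit SHA before the file extension. It is a Python illustration under stated assumptions: the `stem__timestamp_sha__.ext` layout and the function name `add_version_sketch` are hypothetical, and the actual format produced by `add_version()` may differ.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def add_version_sketch(filename: str, sha: str, when: datetime) -> str:
    """Return a bare file name with a timestamp and short SHA before the extension.

    filename: file name such as "surveys.parquet" (any directory part is dropped).
    sha: full or abbreviated commit hash; only the first 7 characters are used.
    when: timezone-aware datetime, converted to UTC before formatting.
    """
    if when.tzinfo is None:
        raise ValueError("when must be timezone-aware")
    path = PurePosixPath(filename)
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{path.stem}__{stamp}_{sha[:7]}__{path.suffix}"


versioned = add_version_sketch(
    "surveys.parquet",
    sha="a1b2c3d4e5f6",
    when=datetime(2024, 1, 15, 8, 30, 0, tzinfo=timezone.utc),
)
```

Embedding both a timestamp and a SHA lets a stored object be traced back to when it was produced and which code revision produced it.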