Package index
Data Pipeline Workflow
Core functions that execute each step in the data pipeline, from data ingestion to validation and export.
-
export_data()
- Export Processed Fisheries Data to MongoDB
-
export_wf_data()
- Export WorldFish Survey Data
-
get_validation_status()
- Get Validation Status from KoboToolbox
-
ingest_pds_tracks()
- Ingest Pelagic Data Systems (PDS) Track Data
-
ingest_pds_trips()
- Ingest Pelagic Data Systems (PDS) Trip Data
-
ingest_surveys()
- Ingest WCS and WF Catch Survey Data
-
preprocess_ba_surveys()
- Pre-process Blue Alliance Surveys
-
preprocess_pds_tracks()
- Preprocess Pelagic Data Systems (PDS) Track Data
-
preprocess_wcs_surveys()
- Pre-process Zanzibar WCS Surveys
-
preprocess_wf_surveys()
- Pre-process WorldFish Surveys
-
sync_validation_submissions()
- Synchronize Validation Statuses with KoboToolbox
-
update_validation_status()
- Update Validation Status in KoboToolbox
-
validate_ba_surveys()
- Validate Blue Alliance (BA) Surveys Data
-
validate_wcs_surveys()
- Validate WCS Surveys Data
-
validate_wf_surveys()
- Validate WorldFish (WF) Survey Data
Data Ingestion
Functions for pulling data from external sources (KoboToolbox, Pelagic Data Systems) and transforming it into standardized formats.
-
get_trip_points()
- Get Trip Points from Pelagic Data Systems API
-
get_trips()
- Retrieve Trip Details from Pelagic Data Systems (PDS) API
-
ingest_pds_tracks()
- Ingest Pelagic Data Systems (PDS) Track Data
-
ingest_pds_trips()
- Ingest Pelagic Data Systems (PDS) Trip Data
-
ingest_surveys()
- Ingest WCS and WF Catch Survey Data
-
retrieve_surveys()
- Retrieve Surveys from KoboToolbox
Cloud Storage Management
Functions for interacting with cloud storage and database services (Google Cloud Storage, MongoDB): uploading, downloading, and managing data files in various formats.
-
cloud_object_name()
- Retrieve Full Name of Versioned Cloud Object
-
cloud_storage_authenticate()
- Authenticate to a Cloud Storage Provider
-
download_cloud_file()
- Download Object from Cloud Storage
-
download_parquet_from_cloud()
- Download Parquet File from Cloud Storage
-
get_metadata()
- Get metadata tables
-
get_preprocessed_surveys()
- Download Preprocessed Surveys
-
get_validated_surveys()
- Download Validated Surveys
-
mdb_collection_pull()
- Retrieve Data from MongoDB
-
mdb_collection_push()
- Upload Data to MongoDB and Overwrite Existing Content
-
upload_cloud_file()
- Upload File to Cloud Storage
-
upload_parquet_to_cloud()
- Upload Processed Data to Cloud Storage
Data Preprocessing
Functions for cleaning, transforming, and structuring raw data into standardized formats ready for analysis, including data nesting and reshaping.
-
calculate_catch()
- Calculate Catch Weight from Length-Weight Relationships or Bucket Measurements
-
generate_track_summaries()
- Generate Grid Summaries for Track Data
-
getLWCoeffs()
- Get Length-Weight Coefficients and Morphological Data for Species
-
get_fao_groups()
- Extract and Format FAO Taxonomic Groups
-
get_length_weight_batch()
- Get Length-Weight and Morphological Parameters for Species (Batch Version)
-
get_species_areas_batch()
- Get FAO Areas for Species (Batch Version)
-
load_taxa_databases()
- Load Taxa Data from FishBase and SeaLifeBase
-
match_species_from_taxa()
- Match Species from Taxa Databases
-
preprocess_ba_surveys()
- Pre-process Blue Alliance Surveys
-
preprocess_pds_tracks()
- Preprocess Pelagic Data Systems (PDS) Track Data
-
preprocess_track_data()
- Preprocess Track Data into Spatial Grid Summary
-
preprocess_wcs_surveys()
- Pre-process Zanzibar WCS Surveys
-
preprocess_wf_surveys()
- Pre-process WorldFish Surveys
-
process_species_list()
- Process Species List with Taxonomic Information
-
reshape_catch_data()
- Reshape Catch Data with Length Groupings
-
reshape_species_groups()
- Reshape Species Groups from Wide to Long Format
Data Mining & Enrichment
Functions for enriching fisheries data with scientific information, taxonomic classification, and biological parameters (length-weight relationships).
-
calculate_catch()
- Calculate Catch Weight from Length-Weight Relationships or Bucket Measurements
-
expand_taxa()
- Expand Taxonomic Vectors into a Data Frame
-
getLWCoeffs()
- Get Length-Weight Coefficients and Morphological Data for Species
-
get_fao_groups()
- Extract and Format FAO Taxonomic Groups
-
get_length_weight_batch()
- Get Length-Weight and Morphological Parameters for Species (Batch Version)
-
get_species_areas_batch()
- Get FAO Areas for Species (Batch Version)
-
load_taxa_databases()
- Load Taxa Data from FishBase and SeaLifeBase
-
match_species_from_taxa()
- Match Species from Taxa Databases
-
process_species_list()
- Process Species List with Taxonomic Information
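Several entries in this section refer to length-weight relationships. As a rough illustration, the conventional allometric form is W = a * L^b, with L in centimetres and W in grams when FishBase-style coefficients are used. The sketch below is not the package's `calculate_catch()` implementation; `estimate_weight_g()` is a hypothetical helper and the coefficients are made-up values chosen only to keep the arithmetic easy to follow.

```r
# Hypothetical helper (not part of the package): allometric length-weight
# relationship W = a * L^b, with length in cm and weight in g.
estimate_weight_g <- function(length_cm, a, b) {
  stopifnot(is.numeric(length_cm), all(length_cm > 0, na.rm = TRUE))
  a * length_cm^b
}

# Illustrative coefficients only; real values would come from a taxa database.
lengths_cm <- c(20, 25, 30)
weights_g <- estimate_weight_g(lengths_cm, a = 0.01, b = 3)

# Total catch weight in kg for the measured individuals.
total_kg <- sum(weights_g) / 1000
```

In practice `a` and `b` differ by species, so a per-species lookup would normally precede this step.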
Data Validation
Functions for validating fisheries data through quality checks, statistical outlier detection, and applying domain-specific validation rules.
-
add_validation_flags()
- Add validation flags to catch data
-
aggregate_survey_data()
- Aggregate survey data and calculate metrics
-
calculate_catch_revenue()
- Calculate catch revenue from validated data
-
extract_trips_info()
- Extract trip information from preprocessed surveys
-
get_catch_bounds()
- Get catch bounds for survey data
-
get_length_bounds()
- Get length bounds for survey data
-
get_validation_status()
- Get Validation Status from KoboToolbox
-
process_catch_data()
- Process catch data from surveys
-
sync_validation_submissions()
- Synchronize Validation Statuses with KoboToolbox
-
update_validation_status()
- Update Validation Status in KoboToolbox
-
validate_ba_surveys()
- Validate Blue Alliance (BA) Surveys Data
-
validate_catches()
- Validate catches using quality flags
-
validate_prices()
- Validate market prices
-
validate_wcs_surveys()
- Validate WCS Surveys Data
-
validate_wf_surveys()
- Validate WorldFish (WF) Survey Data
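The section description mentions statistical outlier detection and bounds on catch and length. The index does not say which method the package uses, so the sketch below substitutes Tukey's interquartile-range rule purely as an example of how upper and lower bounds can be derived and turned into flags. `iqr_bounds()` and the sample data are hypothetical.

```r
# Hypothetical helper: Tukey's IQR rule, used here as a stand-in for
# whatever bounds method the package actually applies.
iqr_bounds <- function(x, k = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]
  c(lower = q[1] - k * spread, upper = q[2] + k * spread)
}

# Made-up catch weights (kg) with one implausibly large record.
catch_kg <- c(2, 3, 4, 5, 6, 7, 8, 120)
bounds <- iqr_bounds(catch_kg)

# TRUE marks records that fall outside the bounds.
flagged <- catch_kg < bounds["lower"] | catch_kg > bounds["upper"]
```

Flagging rather than dropping records keeps the decision reviewable, which fits a workflow where validation statuses are later synchronized with KoboToolbox.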
Data Export & Visualization
Functions for exporting processed data to various formats and creating visualizations for reporting and analysis.
-
create_geos()
- Generate Geographic Regional Summaries of Fishery Data
-
export_data()
- Export Processed Fisheries Data to MongoDB
-
export_wf_data()
- Export WorldFish Survey Data
-
kepler_mapper()
- Generate a Kepler.gl map
Helper Functions
Utility functions that support the main pipeline operations, providing common data manipulation and processing capabilities.
-
add_version()
- Add timestamp and sha string to a file name
-
read_config()
- Read configuration file
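To make the idea behind `add_version()` concrete, the following is a minimal sketch of appending a timestamp and a short commit SHA to a file name. It is not the package's implementation: `add_version_sketch()`, its arguments, and the separator layout are all assumptions made for illustration.

```r
# Hypothetical sketch: build "<stem>__<timestamp>_<sha>.<ext>".
# The exact layout used by the package may differ.
add_version_sketch <- function(filename, sha, time = Sys.time()) {
  stem <- tools::file_path_sans_ext(filename)
  ext <- tools::file_ext(filename)
  stamp <- format(time, "%Y%m%d%H%M%S", tz = "UTC")
  versioned <- paste0(stem, "__", stamp, "_", sha)
  # Re-attach the extension only if the original name had one.
  if (nzchar(ext)) paste0(versioned, ".", ext) else versioned
}

# A fixed time keeps the example reproducible.
fixed_time <- as.POSIXct("2024-01-15 10:30:00", tz = "UTC")
versioned_name <- add_version_sketch("surveys.parquet", sha = "abc1234", time = fixed_time)
```

Embedding both a timestamp and a SHA lets a stored object be traced back to when it was produced and to the code revision that produced it.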