
Downloads raw structured data from cloud storage services and pre-processes it into a binary format that is easier to work with in R.

Usage

preprocess_landings_step_2(log_threshold = logger::DEBUG)

Arguments

log_threshold

The (standard Apache log4j) log level used as a threshold for the logging infrastructure. See logger::log_levels for more details.

Value

No outputs. This function is used for its side effects.

Details

To avoid exceeding memory limits in Docker containers, the preprocessing of raw landings data was split across two containers (two separate jobs in GitHub Actions). This function processes the second half of the raw data, while preprocess_landings_step_1 processes the first half.
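
The two-step arrangement can be pictured with a minimal base R sketch. How the package actually divides the raw data is not described here, so the helper split_in_halves() and the sample file names are hypothetical and serve only to illustrate the idea of giving each job one half of the work.

```r
# Hypothetical helper (not part of the package): divide a vector of work
# items into two halves so that each half can be handled by a separate job.
split_in_halves <- function(items) {
  n <- length(items)
  midpoint <- ceiling(n / 2)
  list(
    first = items[seq_len(midpoint)],
    second = items[seq_len(n - midpoint) + midpoint]
  )
}

# Made-up file names, for illustration only
raw_files <- paste0("raw_part_", 1:5, ".csv")
halves <- split_in_halves(raw_files)

halves$first   # items a "step 1" job would process
halves$second  # items a "step 2" job would process
```

With an odd number of items the first half receives the extra one, which is an arbitrary choice made for this sketch.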

This function downloads the landings data for a given version (specified in the config file conf.yml). The parameters needed are:

surveys:
  landings:
    api:
    survey_id:
    token:
    file_prefix:
  version:
    preprocess:
storage:
  storage_name:
    key:
    options:
      project:
      bucket:
      service_account_key:
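
As a rough illustration of the structure above, the following base R sketch checks that a parsed configuration contains every listed key. It assumes the YAML has already been read into a nested list; the placeholder values, the has_path() helper, and the literal use of storage_name as an entry name are assumptions made for this example and are not part of the package.

```r
# Placeholder configuration mirroring the keys listed above.
# All values are made up for illustration.
conf <- list(
  surveys = list(
    landings = list(
      api = "<api>",
      survey_id = "<survey-id>",
      token = "<token>",
      file_prefix = "<file-prefix>"
    ),
    version = list(preprocess = "<version>")
  ),
  storage = list(
    storage_name = list(
      key = "<key>",
      options = list(
        project = "<project>",
        bucket = "<bucket>",
        service_account_key = "<service-account-key>"
      )
    )
  )
)

# Hypothetical helper: TRUE if every key in `path` exists, walking down
# the nested list one level at a time.
has_path <- function(x, path) {
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) {
      return(FALSE)
    }
    x <- x[[key]]
  }
  TRUE
}

required <- list(
  c("surveys", "landings", "api"),
  c("surveys", "landings", "survey_id"),
  c("surveys", "landings", "token"),
  c("surveys", "landings", "file_prefix"),
  c("surveys", "version", "preprocess"),
  c("storage", "storage_name", "key"),
  c("storage", "storage_name", "options", "project"),
  c("storage", "storage_name", "options", "bucket"),
  c("storage", "storage_name", "options", "service_account_key")
)

# Flag which required keys are present and name any that are missing
present <- vapply(required, function(p) has_path(conf, p), logical(1))
missing_keys <- vapply(required[!present], paste, character(1), collapse = "$")
missing_keys
```

A check like this, run before any download starts, reports a missing key by name rather than letting it surface later as a failed request.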

Progress through the function is tracked using the logger package.
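
The effect of a log threshold can be seen in a small standalone sketch using the logger package. The messages below are invented for illustration and are not the ones this function emits. The records are written to a temporary file only so they can be inspected afterwards.

```r
library(logger)

# Send log records to a temporary file so they can be inspected afterwards
log_file <- tempfile(fileext = ".log")
log_appender(appender_file(log_file))

# With an INFO threshold, DEBUG messages are dropped
log_threshold(INFO)
log_debug("Reading raw landings chunk")  # suppressed
log_info("Preprocessing started")        # recorded

# Lowering the threshold to DEBUG records DEBUG messages as well
log_threshold(DEBUG)
log_debug("Reading raw landings chunk")  # recorded

readLines(log_file)
```

Passing a less verbose level, for example logger::INFO, as log_threshold should therefore reduce the amount of progress output.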