Outlier identification based on Cook's distance
validate_price_weight.Rd
This function adds an additional alert to both the price and catch alert dataframes when the relationship between price and weight takes abnormal values for a given species. Because the relationship between weight and price is mostly linear, the function identifies the survey IDs whose Cook's distance is higher than cook_dist * mean_cook, where cook_dist is a multiplicative coefficient and mean_cook is the average Cook's distance for each species.
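The thresholding rule described above can be illustrated with a minimal sketch for a single species. This is not the package's implementation: the `surveys` data frame, its column names, and the multiplier value of 4 are assumptions made up for this example, and only base R's `stats::lm()` and `stats::cooks.distance()` are used.

```r
# Hypothetical toy data for one species: price grows roughly linearly with
# weight, except for survey 10, whose price is implausibly high.
surveys <- data.frame(
  survey_id = 1:10,
  weight = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  price = c(2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2, 17.9, 60)
)

# Assumed multiplier; in the real function this is the cook_dist argument.
cook_dist <- 4

# Fit the linear price ~ weight relationship and compute each survey's
# Cook's distance.
fit <- stats::lm(price ~ weight, data = surveys)
cooksd <- stats::cooks.distance(fit)
mean_cook <- mean(cooksd)

# Flag surveys whose Cook's distance exceeds cook_dist * mean_cook.
flagged_ids <- surveys$survey_id[cooksd > cook_dist * mean_cook]
print(flagged_ids)
```

In this toy data only survey 10 exceeds the threshold. With several species, the same steps would be repeated within each species so that mean_cook is computed per species, as the description states.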
Usage
validate_price_weight(
catch_alerts = NULL,
price_alerts = NULL,
non_regular_ids = NULL,
cook_dist = NULL,
price_weight_min = NULL,
price_weight_max = NULL
)
Arguments
- catch_alerts
The dataframe of catch alerts.
- price_alerts
The dataframe of price alerts.
- non_regular_ids
The dataframe of landings regularity alerts.
- cook_dist
A number used as the multiplier in the formula cook_dist * mean(cooksd).
- price_weight_min
Minimum threshold for the price per weight value.
- price_weight_max
Maximum threshold for the price per weight value.
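The page does not spell out how price_weight_min and price_weight_max are applied. The sketch below shows one plausible reading, in which a survey is flagged when its price divided by its weight falls outside the two bounds. The `catches` data frame, its column names, and the threshold values are hypothetical and serve only as an illustration.

```r
# Hypothetical catch records; columns and values are invented for this sketch.
catches <- data.frame(
  survey_id = 1:5,
  weight = c(2, 4, 5, 10, 8),
  price = c(6, 12, 200, 25, 0.4)
)

# Assumed bounds on the price per weight ratio.
price_weight_min <- 0.5
price_weight_max <- 10

# Price per unit of weight for each survey.
price_per_weight <- catches$price / catches$weight

# Assumption: a survey is out of range when the ratio is below the minimum
# or above the maximum.
out_of_range <- price_per_weight < price_weight_min |
  price_per_weight > price_weight_max
out_of_range_ids <- catches$survey_id[out_of_range]
print(out_of_range_ids)
```

Here surveys 3 and 5 are flagged, with ratios of 40 and 0.05 respectively. Whether the actual function treats the bounds as inclusive or exclusive is not stated on this page.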