eogrow.pipelines.features

Implements a pipeline to construct features for training/prediction.

pydantic model eogrow.pipelines.features.ValidityFiltering[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
field cloud_mask_feature_name: str | None = None

Name of the cloud mask feature. Setting it enables additional filtering by cloud coverage.

field valid_data_feature_name: str [Required]

Name of the valid-data mask to use for filtering.

field validity_threshold: float | None = None

Threshold for removing frames: frames whose share of valid data is lower than the threshold are removed.
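
To illustrate what thresholding on valid data could look like, the sketch below computes the fraction of valid pixels per frame with NumPy and keeps only the frames that reach the threshold. The helper name `filter_frames_by_validity`, the `(time, height, width, 1)` mask layout, and the choice to keep frames exactly at the threshold are assumptions made for illustration. They are not taken from the eo-grow implementation.

```python
import numpy as np


def filter_frames_by_validity(data, valid_mask, validity_threshold):
    """Hypothetical helper: keep frames whose valid-pixel fraction is >= threshold."""
    # Fraction of valid pixels per frame, averaged over all non-temporal axes.
    axes = tuple(range(1, valid_mask.ndim))
    valid_fraction = valid_mask.astype(float).mean(axis=axes)
    keep = valid_fraction >= validity_threshold
    return data[keep], valid_mask[keep], keep


# Three 2x2 frames with 100 %, 50 % and 0 % valid pixels (made-up data).
valid_mask = np.zeros((3, 2, 2, 1), dtype=bool)
valid_mask[0] = True
valid_mask[1, 0] = True
data = np.arange(3 * 2 * 2 * 1, dtype=np.float32).reshape(3, 2, 2, 1)

filtered_data, filtered_mask, kept = filter_frames_by_validity(data, valid_mask, 0.5)
```

With a threshold of 0.5 the first two frames are kept and the fully invalid third frame is dropped.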

class eogrow.pipelines.features.FeaturesPipeline(config, raw_config=None)[source]

Bases: Pipeline

A pipeline to calculate and prepare features for ML

Parameters:
  • config (Schema) – A dictionary with configuration parameters

  • raw_config (RawConfig | None) – The configuration parameters pre-validation, for logging purposes only

pydantic model Schema[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
  • bands_feature_name (str)

  • data_preparation (eogrow.pipelines.features.ValidityFiltering)

  • dtype (numpy.dtype | None)

  • input_folder_key (str)

  • ndis (Dict[str, Tuple[int, int]])

  • output_feature_name (str)

  • output_folder_key (str)

field bands_feature_name: str [Required]

Name of data feature containing band data

field data_preparation: ValidityFiltering [Required]

field dtype: np.dtype | None = None

The dtype under which the concatenated features should be saved

Validated by:
  • optional_parse_dtype

field input_folder_key: str [Required]

The storage manager key pointing to the input folder for the features pipeline.

Validated by:
  • validate_storage_key

field ndis: Dict[str, Tuple[int, int]] [Optional]

A dictionary of the form {feature_name: (id1, id2)} that specifies which NDIs to calculate. Each NDI is computed from the bands with indices id1 and id2 in the bands feature and saved under feature_name.
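
As a minimal sketch of what one `ndis` entry describes, the code below evaluates the standard normalized difference `(a - b) / (a + b)` for two band indices. The function name `normalized_difference`, the NaN handling for a zero denominator, and the example entry are assumptions for illustration. The pipeline's own NDI task may treat edge cases differently.

```python
import numpy as np


def normalized_difference(bands, id1, id2):
    """NDI of the bands at indices id1 and id2 along the last axis."""
    band_a = bands[..., id1].astype(np.float32)
    band_b = bands[..., id2].astype(np.float32)
    denominator = band_a + band_b
    # Where the sum of the bands is zero the index is undefined; use NaN there.
    ndi = np.full(denominator.shape, np.nan, dtype=np.float32)
    np.divide(band_a - band_b, denominator, out=ndi, where=denominator != 0)
    # Keep a trailing axis so the result has the same rank as the input.
    return ndi[..., np.newaxis]


ndis = {"NDVI": (7, 3)}  # hypothetical entry: B08 and B04 of a 13-band stack
bands = np.zeros((1, 1, 1, 13), dtype=np.float32)
bands[..., 7] = 0.6
bands[..., 3] = 0.2
features = {name: normalized_difference(bands, *ids) for name, ids in ndis.items()}
```

Each key of `ndis` becomes the name of one single-channel feature in `features`.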

field output_feature_name: str [Required]

Name of output data feature encompassing bands and NDIs

field output_folder_key: str [Required]

The storage manager key pointing to the output folder for the features pipeline.

Validated by:
  • validate_storage_key
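
The fields above could be gathered into a configuration along the following lines. Only the key names come from the schema. Every value (folder keys, feature names, threshold, dtype string) is a hypothetical placeholder, and any fields inherited from the base schema are omitted, so this is a sketch of the shape rather than a complete, validated config.

```python
# Hypothetical values throughout; only the key names are taken from the schema.
features_config = {
    "input_folder_key": "data",
    "output_folder_key": "features",
    "bands_feature_name": "BANDS",
    "output_feature_name": "FEATURES",
    "data_preparation": {
        "valid_data_feature_name": "dataMask",
        "cloud_mask_feature_name": "CLM",
        "validity_threshold": 0.8,
    },
    # Optional: one NDI computed from band indices 7 and 3.
    "ndis": {"NDVI": [7, 3]},
    # Optional: assumed here to be given as a dtype name.
    "dtype": "float32",
}

# Fields marked [Required] in the schema above.
required = {
    "bands_feature_name",
    "data_preparation",
    "input_folder_key",
    "output_feature_name",
    "output_folder_key",
}
missing = required - features_config.keys()
```

Checking `missing` before handing the dictionary to the pipeline is a cheap way to catch an incomplete config early.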

config: Schema

filter_patch_list(patch_list)[source]

EOPatches are filtered according to the existence of the specified output features.

Parameters:

patch_list (List[Tuple[str, BBox]]) –

Return type:

List[Tuple[str, BBox]]

build_workflow()[source]

Creates a workflow:

  1. Loads and prepares a ‘bands_feature’ and ‘valid_data_feature’

  2. Temporally regularizes bands and NDIs

  3. Calculates NDIs based on ‘bands_feature’

  4. Applies post-processing, which prepares all output features

  5. Saves all relevant features (specified in _get_output_features)

Return type:

EOWorkflow

get_data_preparation_node()[source]

Nodes that load, filter, and prepare a feature containing all bands

Returns:

A node with preparation tasks and feature for masking invalid data

Return type:

EONode

get_temporal_regularization_node(previous_node)[source]

Builds a node that adds temporal regularization to the workflow.

Parameters:

previous_node (EONode) –

Return type:

EONode

get_ndi_node(previous_node)[source]

Builds a node for constructing Normalized Difference Indices

Parameters:

previous_node (EONode) –

Return type:

EONode

get_postprocessing_node(previous_node)[source]

Tasks performed after temporal regularization. Should also prepare features for the saving step

Parameters:

previous_node (EONode) –

Return type:

EONode

pydantic model eogrow.pipelines.features.MosaickingSpecifications[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
field max_ndi_indices: Tuple[int, int] | None = None

When omitted, median value mosaicking is used. If set, max NDI mosaicking is used, based on the NDI of the bands at the specified indices. For example, to use max NDVI with all 13 bands of L1C, set the parameter to [7, 3] (B08 and B04).
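
The difference between the two mosaicking modes can be sketched with NumPy as follows. The function name `mosaic`, the `(time, height, width, bands)` layout, and the handling of a zero denominator are assumptions for illustration and do not reproduce the tasks the pipeline actually uses. The max NDI branch also assumes the two selected bands contain no NaN values.

```python
import numpy as np


def mosaic(frames, max_ndi_indices=None):
    """Collapse a (time, height, width, bands) stack into one (height, width, bands) mosaic."""
    if max_ndi_indices is None:
        # Median value mosaicking: per-pixel, per-band median over time, ignoring NaNs.
        return np.nanmedian(frames, axis=0)

    id1, id2 = max_ndi_indices
    band_a, band_b = frames[..., id1], frames[..., id2]
    total = band_a + band_b
    # Frames with an undefined NDI (zero denominator) are ranked last.
    ndi = np.full(total.shape, -np.inf)
    np.divide(band_a - band_b, total, out=ndi, where=total != 0)
    # Max NDI mosaicking: for each pixel take all bands from the frame with the highest NDI.
    best = np.argmax(ndi, axis=0)
    return np.take_along_axis(frames, best[np.newaxis, ..., np.newaxis], axis=0)[0]


# Two frames, one pixel, two bands (made-up values).
frames = np.array([[[[0.2, 0.4]]], [[[0.1, 0.7]]]])
median_mosaic = mosaic(frames)
max_ndi_mosaic = mosaic(frames, max_ndi_indices=(1, 0))
```

In the median case every band is reduced independently, whereas in the max NDI case all bands of a pixel come from the same acquisition.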

field n_mosaics: int [Required]

field time_period: Tuple[date, date] [Required]

Validated by:

class eogrow.pipelines.features.MosaickingFeaturesPipeline(config, raw_config=None)[source]

Bases: FeaturesPipeline

A pipeline to calculate and prepare features for ML including mosaicking

Parameters:
  • config (Schema) – A dictionary with configuration parameters

  • raw_config (RawConfig | None) – The configuration parameters pre-validation, for logging purposes only

pydantic model Schema[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
  • mosaicking (MosaickingSpecifications)

field mosaicking: MosaickingSpecifications [Required]

Fine-tuning of mosaicking parameters. If not set, the interpolation will work on current timestamps

config: Schema

get_data_preparation_node()[source]

Nodes that load, filter, and prepare a feature containing all bands

Returns:

A node with preparation tasks and feature for masking invalid data

Return type:

EONode

get_temporal_regularization_node(previous_node)[source]

Builds a node that adds temporal regularization to the workflow.

Parameters:

previous_node (EONode) –

Return type:

EONode