eogrow.pipelines.merge_samples
Implements a pipeline for merging sampled features into numpy arrays fit for training models.
- class eogrow.pipelines.merge_samples.MergeSamplesPipeline(config, raw_config=None)
Bases:
Pipeline
Pipeline to merge sampled data into joined numpy arrays
- Parameters:
config (Schema) – A dictionary with configuration parameters
raw_config (RawConfig | None) – The configuration parameters pre-validation, for logging purposes only
- pydantic model Schema
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
features_to_merge (List[Tuple[eolearn.core.constants.FeatureType, str]])
id_filename (str | None)
input_folder_key (str)
num_threads (int)
output_folder_key (str)
skip_existing (Literal[False])
suffix (str)
- field features_to_merge: List[Feature] [Required]
List of all features for which samples are to be merged, each given as a (feature type, feature name) pair.
- field id_filename: str | None = None
Filename of the array holding the patch ID of each entry in the concatenated features. The patch ID is the index of the patch in the final patch list, so any filtering of the patch list will affect the results.
- field input_folder_key: str [Required]
The storage manager key pointing to the input folder for the merge samples pipeline.
- Validated by:
validate_storage_key
- field num_threads: int = 1
Number of threads used to load data from EOPatches in parallel.
- field output_folder_key: str [Required]
The storage manager key pointing to the output folder for the merge samples pipeline.
- Validated by:
validate_storage_key
- field skip_existing: Literal[False] = False
- field suffix: str = ''
String to append to array filenames
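As an illustration of how the fields above fit together, here is a minimal sketch of a configuration written as a Python dict. Only the field names and defaults come from this page. The storage keys ("samples", "training_data"), the feature names, the ID filename, and the suffix are hypothetical values chosen for the example. Fields inherited from the base Schema are not listed in this section and are omitted, so a real configuration will likely need more entries.

```python
# Hypothetical configuration for MergeSamplesPipeline.
# Field names follow the Schema above; all values are assumptions for illustration.
merge_samples_config = {
    "input_folder_key": "samples",          # hypothetical storage manager key
    "output_folder_key": "training_data",   # hypothetical storage manager key
    # Each feature is a (feature type, feature name) pair; names are made up.
    "features_to_merge": [("data", "FEATURES"), ("mask_timeless", "LABELS")],
    "id_filename": "PATCH_IDS",             # optional, defaults to None
    "suffix": "_train",                     # appended to array filenames, defaults to ''
    "num_threads": 4,                       # defaults to 1
    "skip_existing": False,                 # the Schema only accepts False
}

# The three fields marked [Required] above must always be present.
required_fields = {"features_to_merge", "input_folder_key", "output_folder_key"}
missing_fields = required_fields - set(merge_samples_config)
```

Both folder keys are checked by validate_storage_key, so they are expected to match keys known to the storage manager.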
- run_procedure()
Procedure which merges data from EOPatches into ML-ready numpy arrays
- Return type:
tuple[list[str], list[str]]
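The idea of merging per-patch samples and recording a patch ID for each sample (see id_filename) can be sketched with plain numpy. This is not the pipeline's actual implementation: the helper name merge_sampled_arrays, the array shapes, and the choice to concatenate along the first axis are assumptions made for the illustration.

```python
import numpy as np


def merge_sampled_arrays(patch_arrays):
    """Stack per-patch sample arrays and record which patch each sample came from.

    Hypothetical helper: the patch ID is simply the position of the patch
    in the input list.
    """
    merged = np.concatenate(patch_arrays, axis=0)
    patch_ids = np.concatenate(
        [np.full(len(array), index, dtype=np.int64) for index, array in enumerate(patch_arrays)]
    )
    return merged, patch_ids


# Three hypothetical patches with 2, 3, and 1 samples of 4 values each.
patches = [np.zeros((2, 4)), np.ones((3, 4)), np.full((1, 4), 2.0)]
merged, patch_ids = merge_sampled_arrays(patches)

# Dropping the first patch shifts the IDs of the remaining patches.
_, filtered_ids = merge_sampled_arrays(patches[1:])
```

In this sketch the patch that had ID 1 in the full list gets ID 0 once the first patch is filtered out, which is why patch IDs are only meaningful relative to the exact patch list used for the run.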