eogrow.pipelines.sampling
Implements different pipelines for sampling from data.
- class eogrow.pipelines.sampling.BaseSamplingPipeline(config, raw_config=None)[source]
Bases:
Pipeline
Pipeline to run sampling on EOPatches
- Parameters:
config (Schema) – A dictionary with configuration parameters
raw_config (RawConfig | None) – The configuration parameters pre-validation, for logging purposes only
- pydantic model Schema[source]
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
apply_to (Dict[str, Dict[eolearn.core.constants.FeatureType, List[str]]])
mask_of_samples_name (str | None)
output_folder_key (str)
sampled_suffix (str | None)
- field apply_to: Dict[str, Dict[FeatureType, List[str]]] [Required]
A dictionary defining which features to sample. Its structure is {folder_key: {feature_type: [feature_name]}}.
- field mask_of_samples_name: str | None = None
Name of the output mask timeless feature that records which pixels were sampled and how many times.
- field output_folder_key: str [Required]
The storage manager key pointing to the pipeline output folder.
- Validated by:
validate_storage_key
- field sampled_suffix: str | None = None
If provided, sampled features are saved with this suffix, e.g. with suffix SAMPLED the sampled feature FEATURES is saved as FEATURES_SAMPLED.
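As an illustration of the fields above, here is a minimal sketch of the sampling-related part of a configuration, written as a Python dictionary. The folder key `data`, the feature names, and the helper `sampled_name` are hypothetical. The feature type is written as the string `"data"` on the assumption that it is parsed into a FeatureType during validation. The helper only mirrors the naming rule described for sampled_suffix and is not part of eogrow.

```python
from typing import Optional

# Hypothetical configuration fragment: folder keys and feature names are made up.
sampling_config = {
    # {folder_key: {feature_type: [feature_name]}}
    "apply_to": {"data": {"data": ["FEATURES"]}},
    "output_folder_key": "data",
    "mask_of_samples_name": "MASK_OF_SAMPLES",
    "sampled_suffix": "SAMPLED",
}


def sampled_name(feature_name: str, suffix: Optional[str]) -> str:
    """Illustrates the documented naming rule: FEATURES -> FEATURES_SAMPLED."""
    return feature_name if suffix is None else f"{feature_name}_{suffix}"


output_name = sampled_name("FEATURES", sampling_config["sampled_suffix"])
```

With no suffix (the default, None), the sketch leaves the feature name unchanged.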
- class eogrow.pipelines.sampling.BaseRandomSamplingPipeline(*args, **kwargs)[source]
Bases:
BaseSamplingPipeline
A base class for all sampling pipelines that work on random selection of samples
- Parameters:
config – A dictionary with configuration parameters
raw_config – The configuration parameters pre-validation, for logging purposes only
args (Any) –
kwargs (Any) –
- pydantic model Schema[source]
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
seed (Optional[int])
- field seed: int | None = 42
A random generator seed used to obtain the same results on every pipeline run.
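The effect of a fixed seed can be sketched with the standard library. This is only an illustration of reproducible random selection. How the pipeline consumes the seed internally is not shown in this reference, and `pick_pixels` is a hypothetical helper.

```python
import random
from typing import List, Optional, Tuple


def pick_pixels(
    seed: Optional[int], height: int = 4, width: int = 4, count: int = 3
) -> List[Tuple[int, int]]:
    """Randomly pick `count` pixel coordinates from a height x width grid."""
    rng = random.Random(seed)  # a seed of None gives a different result on each run
    pixels = [(row, col) for row in range(height) for col in range(width)]
    return rng.sample(pixels, count)


# Two runs with the same seed select the same pixels.
first_run = pick_pixels(seed=42)
second_run = pick_pixels(seed=42)
```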
- class eogrow.pipelines.sampling.FractionSamplingPipeline(*args, **kwargs)[source]
Bases:
BaseRandomSamplingPipeline
A pipeline to sample per-class with different distributions
- Parameters:
config – A dictionary with configuration parameters
raw_config – The configuration parameters pre-validation, for logging purposes only
args (Any) –
kwargs (Any) –
- pydantic model Schema[source]
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
erosion_dict (Optional[Dict[int, List[int]]])
exclude_values (List[int])
fraction_of_samples (Union[float, Dict[int, float]])
sampling_feature_name (str)
- field erosion_dict: Dict[int, List[int]] | None = None
A dictionary mapping the disc radius of an erosion operation to the list of label IDs it is applied to.
- field exclude_values: List[int] [Optional]
Values to be excluded from sampling
- field fraction_of_samples: float | Dict[int, float] [Required]
A fraction or a dictionary of per-class fractions of valid pixels to sample from the sampling feature.
- field sampling_feature_name: str [Required]
Name of the MASK_TIMELESS feature used to create sample points.
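The interplay of fraction_of_samples and exclude_values can be sketched in plain Python. This is an illustrative approximation, not the library's algorithm. The helper `sample_by_fraction`, the truncation of the per-class sample count with `int()`, and the treatment of labels missing from the dictionary as fraction 0 are all assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple, Union


def sample_by_fraction(
    mask: List[List[int]],
    fraction_of_samples: Union[float, Dict[int, float]],
    exclude_values: Sequence[int] = (),
    seed: int = 42,
) -> Dict[int, List[Tuple[int, int]]]:
    """Sample pixel coordinates per label value from a 2D label mask."""
    rng = random.Random(seed)

    # Group pixel coordinates by label, skipping excluded values.
    by_label = defaultdict(list)
    for row_idx, row in enumerate(mask):
        for col_idx, label in enumerate(row):
            if label not in exclude_values:
                by_label[label].append((row_idx, col_idx))

    sampled = {}
    for label, pixels in sorted(by_label.items()):
        if isinstance(fraction_of_samples, dict):
            # Assumption: labels without an entry are not sampled.
            fraction = fraction_of_samples.get(label, 0.0)
        else:
            fraction = fraction_of_samples
        n_samples = int(len(pixels) * fraction)  # assumption: truncate
        sampled[label] = rng.sample(pixels, n_samples)
    return sampled


mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [2, 2, 1, 1],
    [2, 2, 2, 2],
]
# Label 1 has 7 pixels, label 2 has 6 pixels, label 0 is excluded.
result = sample_by_fraction(mask, {1: 0.5, 2: 1.0}, exclude_values=[0])
```

In this sketch a single float applies the same fraction to every non-excluded class, while a dictionary sets the fraction class by class.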
- class eogrow.pipelines.sampling.BlockSamplingPipeline(*args, **kwargs)[source]
Bases:
BaseRandomSamplingPipeline
A pipeline to randomly sample blocks
- Parameters:
config – A dictionary with configuration parameters
raw_config – The configuration parameters pre-validation, for logging purposes only
args (Any) –
kwargs (Any) –
- pydantic model Schema[source]
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
fraction_of_samples (Optional[float])
number_of_samples (Optional[int])
sample_size (Tuple[int, int])
- field fraction_of_samples: float | None = None
The fraction of samples to take. Exactly one of the parameters fraction_of_samples and number_of_samples has to be given.
- Validated by:
cannot_be_used_with_number_of_samples
- field number_of_samples: int | None = None
The number of samples to take. Exactly one of the parameters fraction_of_samples and number_of_samples has to be given.
- field sample_size: Tuple[int, int] [Required]
The height and width of each block in pixels.
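The "exactly one of" rule for fraction_of_samples and number_of_samples can be expressed as a small check. The function below is a hypothetical stand-alone sketch of that rule. It is not the schema's actual validator, whose implementation is not shown in this reference.

```python
from typing import Optional


def check_block_sampling_options(
    fraction_of_samples: Optional[float], number_of_samples: Optional[int]
) -> None:
    """Raise ValueError unless exactly one of the two options is given."""
    given = [value is not None for value in (fraction_of_samples, number_of_samples)]
    if sum(given) != 1:
        raise ValueError(
            "Exactly one of fraction_of_samples and number_of_samples has to be given."
        )


# Valid: only one of the two options is set.
check_block_sampling_options(fraction_of_samples=0.1, number_of_samples=None)
check_block_sampling_options(fraction_of_samples=None, number_of_samples=100)
```

Passing both values, or neither, raises ValueError in this sketch.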
- class eogrow.pipelines.sampling.GridSamplingPipeline(config, raw_config=None)[source]
Bases:
BaseSamplingPipeline
A pipeline to sample blocks in a regular grid
- Parameters:
config (Schema) – A dictionary with configuration parameters
raw_config (RawConfig | None) – The configuration parameters pre-validation, for logging purposes only
- pydantic model Schema[source]
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
sample_size (Tuple[int, int])
stride (Tuple[int, int])
- field sample_size: Tuple[int, int] [Required]
The height and width of each block in pixels.
- field stride: Tuple[int, int] [Required]
A tuple describing the distance between the upper left corners of two consecutive sampled blocks. The first number is the vertical distance and the second number is the horizontal distance. If stride is smaller than sample_size in any dimension, the sampled blocks will overlap.
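The relationship between sample_size and stride can be sketched by listing the upper left corners of the blocks. The helper `grid_corners` is hypothetical. It assumes that only blocks fitting entirely inside the image are produced, which may differ from how the pipeline treats image edges.

```python
from typing import List, Tuple


def grid_corners(
    image_shape: Tuple[int, int],
    sample_size: Tuple[int, int],
    stride: Tuple[int, int],
) -> List[Tuple[int, int]]:
    """List (row, column) upper left corners of blocks on a regular grid."""
    height, width = image_shape
    block_height, block_width = sample_size
    vertical_step, horizontal_step = stride
    return [
        (row, col)
        # Assumption: keep only blocks that fit fully inside the image.
        for row in range(0, height - block_height + 1, vertical_step)
        for col in range(0, width - block_width + 1, horizontal_step)
    ]


# The stride (2, 2) is smaller than the sample_size (4, 4), so the blocks overlap.
overlapping = grid_corners(image_shape=(6, 6), sample_size=(4, 4), stride=(2, 2))

# A stride equal to the sample_size gives blocks that tile the image without overlap.
tiled = grid_corners(image_shape=(6, 6), sample_size=(3, 3), stride=(3, 3))
```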