eogrow.pipelines.rasterize
Implements a pipeline for rasterizing vector datasets.
- pydantic model eogrow.pipelines.rasterize.Preprocessing[source]
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
- field reproject_crs: int | None = None
An EPSG code of the CRS to which vector_input data will be reprojected once loaded. Mandatory if the input vector contains multiple layers in different CRSs.
- class eogrow.pipelines.rasterize.RasterizePipeline(*args, **kwargs)[source]
Bases:
Pipeline
A pipeline module for rasterizing vector datasets.
- Parameters:
config – A dictionary with configuration parameters
raw_config – The configuration parameters pre-validation, for logging purposes only
args (Any) –
kwargs (Any) –
- pydantic model Schema[source]
Bases:
Schema
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Fields:
dataset_layer (str | None)
dtype (numpy.dtype)
input_folder_key (str)
no_data_value (int)
output_feature (Tuple[eolearn.core.constants.FeatureType, str])
output_folder_key (str)
overlap_value (float | None)
polygon_buffer (float)
preprocess_dataset (eogrow.pipelines.rasterize.Preprocessing | None)
raster_shape (Tuple[int, int] | None)
raster_value (float | None)
raster_values_column (str | None)
resolution (float | None)
vector_input (Tuple[eolearn.core.constants.FeatureType, str] | str)
- field dataset_layer: str | None = None
In case of multi-layer files, name of the layer with data to be rasterized.
- field dtype: np.dtype = dtype('int32')
Numpy dtype of the output feature.
- Validated by:
- field input_folder_key: str [Required]
The storage manager key pointing to the input folder.
- Validated by:
validate_storage_key
- field no_data_value: int = 0
The no_data_value argument passed to VectorToRasterTask.
- field output_feature: Feature [Required]
A feature which should contain the newly rasterized data.
- Validated by:
_check_temporal_nature_match
- field output_folder_key: str [Required]
The storage manager key pointing to the output folder.
- Validated by:
validate_storage_key
- field overlap_value: float | None = None
Value to write over the areas where polygons overlap.
- field polygon_buffer: float = 0
The size of polygon buffering to be applied before rasterization.
- field preprocess_dataset: Preprocessing | None = None
Parameters used by the self.preprocess_dataset method. Preprocessing is skipped if set to None.
- field raster_shape: Tuple[int, int] | None = None
Shape of resulting raster image. Cannot be used with resolution.
- Validated by:
cannot_be_used_with_resolution
- field raster_value: float | None = None
Value to be used for all rasterized polygons. Cannot be used with raster_values_column.
- field raster_values_column: str | None = None
GeoPandas column for reading per-geometry rasterization values. Cannot be used with raster_value.
- Validated by:
cannot_be_used_with_raster_value
- field resolution: float | None = None
Rendering resolution in meters. Cannot be used with raster_shape.
- field vector_input: Feature | str [Required]
An input filename or a feature containing vector data.
- Validated by:
_check_vector_input
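The fields above can be combined into a configuration dictionary. The following is a minimal sketch, not the library's own validation code: the key names come from the schema, while every value (storage keys, filename, feature name, column name, EPSG code) is a hypothetical placeholder, and `find_conflicts` is an illustrative helper that only mirrors the documented "cannot be used with" constraints.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical values; only the key names are taken from the schema above.
rasterize_config: Dict[str, Any] = {
    "input_folder_key": "input_data",               # assumed storage key
    "output_folder_key": "reference",               # assumed storage key
    "vector_input": "fields.gpkg",                  # assumed filename
    "dataset_layer": None,
    "output_feature": ("mask_timeless", "FIELDS"),  # assumed feature type and name
    "raster_values_column": "crop_id",              # assumed column name
    "raster_value": None,
    "resolution": 10.0,
    "raster_shape": None,
    "overlap_value": None,
    "no_data_value": 0,
    "polygon_buffer": 0,
    "dtype": "int32",
    "preprocess_dataset": {"reproject_crs": 32633},  # assumed EPSG code
}

# Pairs documented above as "cannot be used with" each other.
EXCLUSIVE_PAIRS: List[Tuple[str, str]] = [
    ("raster_shape", "resolution"),
    ("raster_value", "raster_values_column"),
]


def find_conflicts(config: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return the exclusive pairs for which both options are set (not None)."""
    return [
        (first, second)
        for first, second in EXCLUSIVE_PAIRS
        if config.get(first) is not None and config.get(second) is not None
    ]
```

Running `find_conflicts` before handing a dictionary to the pipeline gives an early hint about option pairs that the schema validators are documented to reject.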
- filter_patch_list(patch_list)[source]
Specifies which EOPatches should be skipped when skip_existing is enabled.
- Parameters:
patch_list (List[Tuple[str, BBox]]) –
- Return type:
List[Tuple[str, BBox]]
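The skip-existing idea can be sketched in plain Python. This is an illustration under assumptions, not the method's actual implementation: bounding boxes are represented as plain tuples instead of BBox objects, and `existing_names` is a hypothetical set of patch names whose output is assumed to be present already (how the pipeline determines this is not described here).

```python
from typing import List, Set, Tuple

# Stand-in for a bounding box: (min_x, min_y, max_x, max_y).
BBoxTuple = Tuple[float, float, float, float]
PatchList = List[Tuple[str, BBoxTuple]]


def filter_existing(patch_list: PatchList, existing_names: Set[str]) -> PatchList:
    """Keep only the patches whose name is not in the set of existing outputs."""
    return [(name, bbox) for name, bbox in patch_list if name not in existing_names]
```

The input order is preserved, so the remaining patches are processed in the same sequence as in the original list.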
- run_procedure()[source]
Execution procedure of the pipeline. Can be overridden if needed.
By default, it builds the workflow using the build_workflow method, which must be implemented separately.
- Returns:
A pair of lists representing successful and unsuccessful executions.
- Return type:
tuple[list[str], list[str]]
- run_dataset_preprocessing(filename, preprocess_config)[source]
Loads datasets, applies preprocessing steps, and saves them to a cache folder.
- Parameters:
filename (str) –
preprocess_config (Preprocessing) –
- Return type:
None
- build_workflow()[source]
Creates a workflow that is divided into the following sub-parts:
loading data,
preprocessing steps,
rasterization of features,
postprocessing steps,
saving results
- Return type:
EOWorkflow
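The five sub-parts form a linear chain in which each stage depends on the previous one. The sketch below models that chain with a hypothetical `StageNode` stand-in rather than real EONode or EOWorkflow objects; the stage names are shorthand for the list above, and the pattern of passing the previous node to build the next one mirrors the signature of get_prerasterization_node(previous_node).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StageNode:
    """Hypothetical stand-in for a workflow node: a stage name and its predecessor."""
    name: str
    previous: Optional["StageNode"] = None


def chain_stages(names: List[str]) -> Optional[StageNode]:
    """Link the stages so that each node points to the one that runs before it."""
    node: Optional[StageNode] = None
    for name in names:
        node = StageNode(name, node)
    return node


def execution_order(end_node: Optional[StageNode]) -> List[str]:
    """Walk back from the end node and return the stage names in execution order."""
    order: List[str] = []
    node = end_node
    while node is not None:
        order.append(node.name)
        node = node.previous
    return order[::-1]


STAGES = ["load", "preprocess", "rasterize", "postprocess", "save"]
end_node = chain_stages(STAGES)
```

Only the end node needs to be kept, because every earlier stage is reachable through the `previous` links.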
- preprocess_dataset(dataframe)[source]
Method for applying custom preprocessing steps to the entire dataset.
- Parameters:
dataframe (GeoDataFrame) –
- Return type:
GeoDataFrame
- get_prerasterization_node(previous_node)[source]
Builds a node with tasks to be applied after loading the vector feature but before rasterization.
- Parameters:
previous_node (EONode) –
- Return type:
EONode