eogrow.pipelines.rasterize

Implements a pipeline for rasterizing vector datasets.

pydantic model eogrow.pipelines.rasterize.Preprocessing

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:

reproject_crs (int | None)

field reproject_crs: int | None = None

The EPSG code of the CRS into which the vector_input data is reprojected once loaded. Mandatory if the input vector contains multiple layers in different CRSs.

class eogrow.pipelines.rasterize.RasterizePipeline(*args, **kwargs)

Bases: Pipeline

A pipeline module for rasterizing vector datasets.

Parameters:

config – A dictionary with configuration parameters
raw_config – The configuration parameters pre-validation, for logging purposes only
args (Any) –
kwargs (Any) –

pydantic model Schema

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:

dataset_layer (str | None)
dtype (numpy.dtype)
input_folder_key (str)
no_data_value (int)
output_feature (Tuple[eolearn.core.constants.FeatureType, str])
output_folder_key (str)
overlap_value (float | None)
polygon_buffer (float)
preprocess_dataset (eogrow.pipelines.rasterize.Preprocessing | None)
raster_shape (Tuple[int, int] | None)
raster_value (float | None)
raster_values_column (str | None)
resolution (float | None)
vector_input (Tuple[eolearn.core.constants.FeatureType, str] | str)

field dataset_layer: str | None = None

In case of multi-layer files, the name of the layer with data to be rasterized.

field dtype: np.dtype = dtype('int32')

Numpy dtype of the output feature.

Validated by:

parse_dtype

field input_folder_key: str [Required]

The storage manager key pointing to the input folder.

Validated by:

validate_storage_key

field no_data_value: int = 0

The no_data_value argument to be passed to VectorToRasterTask.

field output_feature: Feature [Required]

A feature which should contain the newly rasterized data.

Validated by:

_check_temporal_nature_match

field output_folder_key: str [Required]

The storage manager key pointing to the output folder.

Validated by:

validate_storage_key

field overlap_value: float | None = None

Value to write over the areas where polygons overlap.

field polygon_buffer: float = 0

The size of polygon buffering to be applied before rasterization.

field preprocess_dataset: Preprocessing | None = None

Parameters used by the self.preprocess_dataset method. Dataset preprocessing is skipped if set to None.

field raster_shape: Tuple[int, int] | None = None

Shape of resulting raster image. Cannot be used with resolution.

Validated by:

cannot_be_used_with_resolution

field raster_value: float | None = None

Value to be used for all rasterized polygons. Cannot be used with raster_values_column.

field raster_values_column: str | None = None

GeoPandas column for reading per-geometry rasterization values. Cannot be used with raster_value.

Validated by:

cannot_be_used_with_raster_value

field resolution: float | None = None

Rendering resolution in meters. Cannot be used with raster_shape.
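
To illustrate how resolution and raster_shape describe the same thing in two ways, here is a minimal sketch. It assumes that the raster size follows from dividing the patch extent (in meters) by the resolution and rounding; the rounding actually applied by the rasterization task may differ. The function shape_from_resolution is hypothetical and not part of the library.

```python
def shape_from_resolution(width_m, height_m, resolution_m):
    """Return (rows, columns) for an extent in meters at a pixel size in meters.

    Hypothetical helper for illustration only; rounding is an assumption.
    """
    if resolution_m <= 0:
        raise ValueError("resolution must be positive")
    return round(height_m / resolution_m), round(width_m / resolution_m)


# A 5000 m wide and 2500 m high patch at 10 m resolution.
print(shape_from_resolution(5000, 2500, 10))  # (250, 500)
```

Under this assumption, setting resolution to 10 for such a patch corresponds to setting raster_shape to (250, 500), which is why only one of the two may be given.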

field vector_input: Feature | str [Required]

An input filename or a feature containing vector data.

Validated by:

_check_vector_input
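
As an illustration, the rasterization-specific part of a configuration might look like the minimal sketch below. All values (storage keys, file name, feature name, column name) are placeholders, the feature type is assumed to be given by its string value, and the other keys that a complete pipeline configuration needs are omitted. The check_exclusive helper is hypothetical: it only mirrors the "cannot be used with" rules documented above and is not the library's validator.

```python
# Hypothetical fragment: only fields documented above, with placeholder values.
rasterize_config = {
    "input_folder_key": "input_data",      # placeholder storage key
    "output_folder_key": "reference",      # placeholder storage key
    "vector_input": "reference.gpkg",      # placeholder file name
    "output_feature": ["mask_timeless", "REFERENCE_MASK"],  # placeholder feature
    "raster_values_column": "class_id",    # placeholder column name
    "resolution": 10,
    "no_data_value": 0,
}

# Pairs of fields that the documentation says cannot be used together.
EXCLUSIVE_PAIRS = [
    ("raster_shape", "resolution"),
    ("raster_value", "raster_values_column"),
]


def check_exclusive(config):
    """Return every documented exclusive pair for which both fields are set."""
    return [
        (first, second)
        for first, second in EXCLUSIVE_PAIRS
        if config.get(first) is not None and config.get(second) is not None
    ]


conflicts = check_exclusive(rasterize_config)
```

With the fragment above, conflicts is empty. Adding "raster_value": 1 to the same dictionary would make the helper report the pair ("raster_value", "raster_values_column").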

config: Schema

filter_patch_list(patch_list)

Specifies which EOPatches should be skipped when skip_existing is enabled.

Parameters: patch_list (List[Tuple[str, BBox]]) –
Return type: List[Tuple[str, BBox]]
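
The idea of skipping already processed patches can be sketched as follows. This is an assumption-laden illustration, not the pipeline's implementation: how the pipeline decides that a patch already exists is not described here, so the sketch simply takes a set of finished patch names. The function filter_existing and the tuple stand-ins for BBox are hypothetical.

```python
def filter_existing(patch_list, existing_names):
    """Keep only (name, bbox) pairs whose name is not among the finished ones."""
    return [(name, bbox) for name, bbox in patch_list if name not in existing_names]


# Plain tuples stand in for BBox objects in this sketch.
patches = [
    ("eopatch-0", (0, 0, 10, 10)),
    ("eopatch-1", (10, 0, 20, 10)),
    ("eopatch-2", (20, 0, 30, 10)),
]
remaining = filter_existing(patches, {"eopatch-1"})
```

Here remaining holds the entries for eopatch-0 and eopatch-2, in their original order.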

run_procedure()

The execution procedure of the pipeline. Can be overridden if needed.

By default, it builds the workflow using the build_workflow method, which must be implemented separately.

Returns: A pair of lists representing successful and unsuccessful executions.
Return type: tuple[list[str], list[str]]

run_dataset_preprocessing(filename, preprocess_config)

Loads datasets, applies preprocessing steps, and saves them to a cache folder.

Parameters:

filename (str) –
preprocess_config (Preprocessing) –

Return type:

None

build_workflow()

Creates workflow that is divided into the following sub-parts:

loading data,
preprocessing steps,
rasterization of features,
postprocessing steps,
saving results

Return type: EOWorkflow
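
The ordering of the sub-parts listed above can be pictured as a linear chain in which every step is built on top of the previous one, in the same way that the get_*_node methods below take a previous_node argument. The sketch uses only the standard library; Step, build_chain, and execution_order are hypothetical stand-ins and do not reproduce the EONode or EOWorkflow API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """Hypothetical stand-in for a workflow node that remembers its predecessor."""

    name: str
    previous: Optional["Step"] = None


def build_chain():
    """Chain the five documented sub-parts and return the last step."""
    load = Step("loading data")
    pre = Step("preprocessing steps", load)
    raster = Step("rasterization of features", pre)
    post = Step("postprocessing steps", raster)
    return Step("saving results", post)


def execution_order(last):
    """Walk back from the last step and return the names in execution order."""
    order = []
    node = last
    while node is not None:
        order.append(node.name)
        node = node.previous
    return order[::-1]


steps = execution_order(build_chain())
```

Because each step only knows its predecessor, replacing one stage (for example the postprocessing step) leaves the rest of the chain untouched, which is presumably the motivation for exposing the stages as separate overridable methods.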

preprocess_dataset(dataframe)

Method for applying custom preprocessing steps to the entire dataset.

Parameters: dataframe (GeoDataFrame) –
Return type: GeoDataFrame

get_prerasterization_node(previous_node)

Builds a node with tasks to be applied after loading the vector feature but before rasterization.

Parameters: previous_node (EONode) –
Return type: EONode

get_rasterization_node(previous_node)

Builds nodes containing rasterization tasks.

Parameters: previous_node (EONode) –
Return type: EONode

get_postrasterization_node(previous_node)

Builds a node with tasks to be applied after rasterization.

Parameters: previous_node (EONode) –
Return type: EONode