eogrow.pipelines.rasterize

Implements a pipeline for rasterizing vector datasets.

pydantic model eogrow.pipelines.rasterize.Preprocessing[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
  • reproject_crs (int | None)

field reproject_crs: int | None = None

The EPSG code of the CRS into which vector_input data is reprojected once loaded. Mandatory if the input vector file contains multiple layers in different CRSs.

class eogrow.pipelines.rasterize.RasterizePipeline(*args, **kwargs)[source]

Bases: Pipeline

A pipeline module for rasterizing vector datasets.

Parameters:
  • config – A dictionary with configuration parameters

  • raw_config – The configuration parameters pre-validation, for logging purposes only

  • args (Any) –

  • kwargs (Any) –

pydantic model Schema[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
  • dataset_layer (str | None)

  • dtype (numpy.dtype)

  • input_folder_key (str)

  • no_data_value (int)

  • output_feature (Tuple[eolearn.core.constants.FeatureType, str])

  • output_folder_key (str)

  • overlap_value (float | None)

  • polygon_buffer (float)

  • preprocess_dataset (eogrow.pipelines.rasterize.Preprocessing | None)

  • raster_shape (Tuple[int, int] | None)

  • raster_value (float | None)

  • raster_values_column (str | None)

  • resolution (float | None)

  • vector_input (Tuple[eolearn.core.constants.FeatureType, str] | str)

field dataset_layer: str | None = None

In case of multi-layer files, name of the layer with data to be rasterized.

field dtype: np.dtype = dtype('int32')

Numpy dtype of the output feature.

Validated by:

field input_folder_key: str [Required]

The storage manager key pointing to the input folder.

Validated by:
  • validate_storage_key

field no_data_value: int = 0

The no_data_value argument to be passed to VectorToRasterTask.

field output_feature: Feature [Required]

A feature which should contain the newly rasterized data.

Validated by:
  • _check_temporal_nature_match

field output_folder_key: str [Required]

The storage manager key pointing to the output folder.

Validated by:
  • validate_storage_key

field overlap_value: float | None = None

Value to write over the areas where polygons overlap.

field polygon_buffer: float = 0

The size of polygon buffering to be applied before rasterization.

field preprocess_dataset: Preprocessing | None = None

Parameters used by the self.preprocess_dataset method. Preprocessing is skipped if set to None.

field raster_shape: Tuple[int, int] | None = None

Shape of resulting raster image. Cannot be used with resolution.

Validated by:
  • cannot_be_used_with_resolution

field raster_value: float | None = None

Value to be used for all rasterized polygons. Cannot be used with raster_values_column.

field raster_values_column: str | None = None

GeoPandas column for reading per-geometry rasterization values. Cannot be used with raster_value.

Validated by:
  • cannot_be_used_with_raster_value
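
The two "cannot be used with" constraints (raster_shape versus resolution, raster_value versus raster_values_column) can be illustrated with a small stand-alone check. This is only a sketch: find_conflicts is a hypothetical helper that is not part of eo-grow, and the real checks are performed by the schema validators listed for these fields.

```python
from typing import Any, Dict, List, Tuple

# Pairs of options that the field descriptions mark as mutually exclusive.
EXCLUSIVE_PAIRS: List[Tuple[str, str]] = [
    ("raster_shape", "resolution"),
    ("raster_value", "raster_values_column"),
]


def find_conflicts(config: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Hypothetical helper: return every exclusive pair where both options are set."""
    return [
        (first, second)
        for first, second in EXCLUSIVE_PAIRS
        if config.get(first) is not None and config.get(second) is not None
    ]


# Illustrative values only; "class_id" is a made-up column name.
valid = {"resolution": 10.0, "raster_values_column": "class_id"}
invalid = {"resolution": 10.0, "raster_shape": (1000, 1000), "raster_value": 1}

print(find_conflicts(valid))    # no pair has both options set
print(find_conflicts(invalid))  # raster_shape and resolution are both set
```

The sketch only reports pairs where both options are set. Whether at least one option of each pair must be given is not stated here.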

field resolution: float | None = None

Rendering resolution in meters. Cannot be used with raster_shape.
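
resolution and raster_shape are two ways of fixing the same thing, the pixel grid of the output. As a rough illustration, the hypothetical helper below derives a shape from a resolution. It assumes a bounding box given in a metric CRS, simple rounding, and a (height, width) ordering of the shape. The actual computation inside VectorToRasterTask may differ.

```python
from typing import Tuple


def shape_from_resolution(
    bounds: Tuple[float, float, float, float], resolution: float
) -> Tuple[int, int]:
    """Hypothetical helper: (height, width) in pixels.

    bounds is (min_x, min_y, max_x, max_y) in meters, resolution is meters per pixel.
    """
    min_x, min_y, max_x, max_y = bounds
    width = round((max_x - min_x) / resolution)
    height = round((max_y - min_y) / resolution)
    return height, width


# A 10 km wide and 5 km high area at 10 m per pixel.
print(shape_from_resolution((500000.0, 5000000.0, 510000.0, 5005000.0), 10.0))  # (500, 1000)
```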

field vector_input: Feature | str [Required]

An input filename or a feature containing vector data.

Validated by:
  • _check_vector_input
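
As an illustration of how the fields above fit together, the following sketch collects a plausible set of values in a Python dictionary. All storage keys, file names, layer names, column names and feature names are made up, and the "mask_timeless" feature type string is an assumption. Keys required by the base Pipeline schema are omitted, so this is a fragment rather than a complete configuration.

```python
import json

# Only fields documented in this schema are shown; all values are illustrative.
rasterize_fields = {
    "input_folder_key": "input_data",         # hypothetical storage key
    "output_folder_key": "reference",         # hypothetical storage key
    "vector_input": "land_cover.gpkg",        # hypothetical file name
    "dataset_layer": "polygons",              # hypothetical layer name
    "output_feature": ["mask_timeless", "LAND_COVER"],  # assumed feature type and name
    "raster_values_column": "class_id",       # hypothetical column; raster_value is left unset
    "resolution": 10,                         # meters; raster_shape is left unset
    "polygon_buffer": 0,
    "no_data_value": 0,
    "preprocess_dataset": {"reproject_crs": 32633},  # EPSG code used as an example
}

print(json.dumps(rasterize_fields, indent=2))
```

The fragment sets resolution and raster_values_column and leaves their exclusive counterparts, raster_shape and raster_value, at the default of None.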

config: Schema

filter_patch_list(patch_list)[source]

Specifies which EOPatches should be skipped when skip_existing is enabled.

Parameters:

patch_list (List[Tuple[str, BBox]]) –

Return type:

List[Tuple[str, BBox]]
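
The effect of such a filter can be sketched without eo-grow. The example assumes that the names of EOPatches with existing output are already known as a set. How the pipeline actually determines this is not described here, and drop_existing is a hypothetical stand-in, not the library method.

```python
from typing import Any, List, Set, Tuple

# (EOPatch name, bounding box); the bounding box type is left open in this sketch.
Patch = Tuple[str, Any]


def drop_existing(patch_list: List[Patch], existing_names: Set[str]) -> List[Patch]:
    """Hypothetical stand-in: keep only patches whose output does not exist yet."""
    return [(name, bbox) for name, bbox in patch_list if name not in existing_names]


patches = [("eopatch-0", (0, 0, 10, 10)), ("eopatch-1", (10, 0, 20, 10))]
print(drop_existing(patches, existing_names={"eopatch-0"}))  # [('eopatch-1', (10, 0, 20, 10))]
```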

run_procedure()[source]

Execution procedure of the pipeline. Can be overridden if needed.

By default, it builds the workflow with the build_workflow method, which must be implemented separately.

Returns:

A pair of lists representing successful and unsuccessful executions.

Return type:

tuple[list[str], list[str]]
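
A caller might consume the returned pair as sketched below. The example assumes that the strings identify the processed items (for instance EOPatch names). The sample values are invented, and summarize is a hypothetical helper.

```python
from typing import List, Tuple


def summarize(result: Tuple[List[str], List[str]]) -> str:
    """Hypothetical helper: one-line report for a (successful, failed) pair."""
    successful, failed = result
    total = len(successful) + len(failed)
    return f"{len(successful)}/{total} succeeded, failed: {', '.join(failed) or 'none'}"


# Invented sample in the shape returned by run_procedure.
result = (["eopatch-0", "eopatch-2"], ["eopatch-1"])
print(summarize(result))  # 2/3 succeeded, failed: eopatch-1
```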

run_dataset_preprocessing(filename, preprocess_config)[source]

Loads datasets, applies preprocessing steps and saves them to a cache folder.

Parameters:
  • filename –

  • preprocess_config –

Return type:

None

build_workflow()[source]

Creates workflow that is divided into the following sub-parts:

  1. loading data,

  2. preprocessing steps,

  3. rasterization of features,

  4. postprocessing steps,

  5. saving results.

Return type:

EOWorkflow
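
The five sub-parts form a linear chain in which each stage receives the node of the previous stage, which matches the previous_node parameter of the get_*_node methods. The sketch below shows that chaining pattern with a plain stand-in class. Node, build_chain and execution_order are hypothetical and are not the eo-learn EONode or EOWorkflow API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Stand-in for a workflow node; NOT the eo-learn EONode class."""

    name: str
    previous: Optional["Node"] = None


def build_chain(stage_names: List[str]) -> Node:
    """Link stages linearly, each one taking the previous node as input."""
    node: Optional[Node] = None
    for name in stage_names:
        node = Node(name, previous=node)
    if node is None:
        raise ValueError("at least one stage is required")
    return node


def execution_order(last: Node) -> List[str]:
    """Walk back from the last node and return stage names in execution order."""
    names: List[str] = []
    node: Optional[Node] = last
    while node is not None:
        names.append(node.name)
        node = node.previous
    return names[::-1]


stages = ["load", "preprocess", "rasterize", "postprocess", "save"]
print(execution_order(build_chain(stages)))
```

Because every stage depends only on its predecessor, a subclass can swap one stage without touching the others.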

preprocess_dataset(dataframe)[source]

Method for applying custom preprocessing steps to the entire dataset.

Parameters:

dataframe (GeoDataFrame) –

Return type:

GeoDataFrame

get_prerasterization_node(previous_node)[source]

Builds a node with tasks to be applied after loading the vector feature but before rasterization.

Parameters:

previous_node (EONode) –

Return type:

EONode

get_rasterization_node(previous_node)[source]

Builds nodes containing rasterization tasks.

Parameters:

previous_node (EONode) –

Return type:

EONode

get_postrasterization_node(previous_node)[source]

Builds a node with tasks to be applied after rasterization.

Parameters:

previous_node (EONode) –

Return type:

EONode