eogrow.pipelines.download_batch

Download pipeline that works with Sentinel Hub batch service.

pydantic model eogrow.pipelines.download_batch.InputDataSchema[source]

Bases: Schema

Parameter structure for a single data collection used in a batch request.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
field data_collection: DataCollection [Required]

Data collection from which data will be downloaded. See utils.validators.parse_data_collection for more info on input options.

field maxcc: float | None = None

Maximal cloud coverage filter.

Constraints:
  • minimum = 0

  • maximum = 1

field mosaicking_order: MosaickingOrder | None = None

The mosaicking order used by the Sentinel Hub service.

field other_params: dict [Optional]

Additional parameters passed to the SentinelHubRequest.input_data method as its other_args parameter.

field resampling_type: ResamplingType = ResamplingType.NEAREST

The resampling type used by the Sentinel Hub service for both downsampling and upsampling.

field time_period: Tuple[date, date] | None = None
Validated by:
  • optional_parse_time_period
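
The constraints listed above can be illustrated with a minimal, standard-library sketch. It does not use eo-grow itself: check_input_entry is a hypothetical stand-in for the schema validation, and both the string form of data_collection and the ISO-date strings for time_period are assumptions about accepted input formats.

```python
from datetime import date


def check_input_entry(entry: dict) -> dict:
    """Hypothetical stand-in for InputDataSchema validation (not eo-grow code)."""
    maxcc = entry.get("maxcc")
    # maxcc is optional, but when present it must lie in [0, 1].
    if maxcc is not None and not 0 <= maxcc <= 1:
        raise ValueError("maxcc must be between 0 and 1")

    checked = dict(entry)
    period = entry.get("time_period")
    if period is not None:
        # Assumption: the time period is given as a pair of ISO date strings.
        start, end = (date.fromisoformat(day) for day in period)
        checked["time_period"] = (start, end)
    return checked


example_input = {
    "data_collection": "SENTINEL2_L1C",  # assumed string form of a DataCollection
    "maxcc": 0.3,
    "time_period": ["2020-01-01", "2020-12-31"],
}
checked_input = check_input_entry(example_input)
print(checked_input["time_period"])
```

Fields left out of the entry (mosaicking_order, resampling_type, other_params) would fall back to the defaults listed above.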

pydantic model eogrow.pipelines.download_batch.BatchGridSchema[source]

Bases: Schema

Configuration for the batch grid.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
field bbox_buffer: tuple[float, float] = (0, 0)

Buffer of the bounding box in meters.

field bbox_offset: tuple[float, float] = (0, 0)

Offset of the bounding box in meters.

field bbox_size: tuple[int, int] [Required]

Size of the bounding box in meters.

field geometry_filename: str [Required]

Name of the file that defines the AoI geometry, located in the input data folder.

field image_size: tuple[int, int] | None = None

Size of the image in pixels.

field resolution: int | None = None

Resolution of the image in meters.

Validated by:
  • cannot_be_used_with_image_size
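
As a rough illustration of the grid fields, the sketch below builds a grid configuration and rejects one that sets both resolution and image_size. The validator name suggests this exclusivity, but the check itself is a hand-written assumption, and check_grid and the file name aoi.geojson are hypothetical.

```python
GRID_DEFAULTS = {
    "bbox_buffer": (0, 0),
    "bbox_offset": (0, 0),
    "image_size": None,
    "resolution": None,
}


def check_grid(grid: dict) -> dict:
    """Hypothetical stand-in for BatchGridSchema validation (not eo-grow code)."""
    for required in ("bbox_size", "geometry_filename"):
        if required not in grid:
            raise ValueError(f"missing required field: {required}")
    merged = {**GRID_DEFAULTS, **grid}
    # Assumed behavior of cannot_be_used_with_image_size: the two are exclusive.
    if merged["resolution"] is not None and merged["image_size"] is not None:
        raise ValueError("resolution cannot be used together with image_size")
    return merged


grid_config = check_grid(
    {
        "geometry_filename": "aoi.geojson",  # hypothetical file in the input data folder
        "bbox_size": (10000, 10000),  # meters
        "resolution": 10,  # meters
    }
)
print(grid_config)
```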

class eogrow.pipelines.download_batch.BatchDownloadPipeline(*args, **kwargs)[source]

Bases: Pipeline

Pipeline to start and monitor a Sentinel Hub Batch Process API job.

The pipeline creates a custom grid using the UtmZoneSplitter under the hood and saves it to the grid location provided via the CustomGridAreaManager.

Parameters:
  • config – A dictionary with configuration parameters

  • raw_config – The configuration parameters pre-validation, for logging purposes only

  • args (Any) –

  • kwargs (Any) –

NAME_COLUMN = 'identifier'
pydantic model Schema[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
  • analysis_only (bool)

  • area (eogrow.core.area.custom_grid.CustomGridAreaManager.Schema)

  • batch_id (str)

  • batch_output_kwargs (dict)

  • evalscript_folder_key (str)

  • evalscript_path (str)

  • grid (eogrow.pipelines.download_batch.BatchGridSchema)

  • iam_role_arn (str)

  • input_patch_file (None)

  • inputs (List[eogrow.pipelines.download_batch.InputDataSchema])

  • monitoring_analysis_sleep_time (int)

  • monitoring_sleep_time (int)

  • output_folder_key (str)

  • patch_list (None)

  • save_userdata (bool)

  • skip_existing (Literal[False])

  • tiff_outputs (List[str])

field analysis_only: bool = False

If set to True, the pipeline only creates a batch request and waits for the analysis phase to finish. It does not start the actual batch job.

field area: CustomGridAreaManager.Schema [Required]
field batch_id: str = ''

The ID of a batch job for this pipeline. If it is given, the pipeline only monitors the existing batch job. If it is not given, the pipeline creates a new batch job.

field batch_output_kwargs: dict [Optional]

Any other arguments to add to the dictionary of output parameters. They are passed as **kwargs to the output method of BatchProcessClient when the batch request is created.

field evalscript_folder_key: str = 'input_data'

Storage manager key pointing to the path where the evalscript is loaded from.

Validated by:
  • validate_storage_key

field evalscript_path: str [Required]
field grid: BatchGridSchema [Required]

Configuration for the batch grid.

field iam_role_arn: str [Required]

IAM role ARN for the batch job.

field input_patch_file: None = None
field inputs: List[InputDataSchema] [Required]
field monitoring_analysis_sleep_time: int = 10

Number of seconds to sleep between two consecutive status queries during the analysis phase of a batch job. Must be at least 5 seconds.

Constraints:
  • minimum = 5

field monitoring_sleep_time: int = 120

Number of seconds to sleep between two consecutive queries about the status of tiles in a batch job. Must be at least 60 seconds.

Constraints:
  • minimum = 60
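
The role of the two sleep-time fields can be pictured with a generic polling loop. This is only a sketch of the idea, not the pipeline's monitoring code: poll_until_done, the fetch_status callable, and the status names are all hypothetical, and the sleep function is injectable so the example runs instantly.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    sleep_time: int,
    minimum: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Query a status repeatedly, sleeping between queries; return the query count."""
    if sleep_time < minimum:
        raise ValueError(f"sleep time must be at least {minimum} seconds")
    queries = 0
    while True:
        queries += 1
        if fetch_status() == "DONE":  # hypothetical status name
            return queries
        sleep(sleep_time)


# Simulated job that finishes on the third query; waits are recorded, not slept.
statuses = iter(["PROCESSING", "PROCESSING", "DONE"])
recorded_waits = []
query_count = poll_until_done(
    lambda: next(statuses), sleep_time=120, minimum=60, sleep=recorded_waits.append
)
print(query_count, recorded_waits)
```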

field output_folder_key: str [Required]

Storage manager key pointing to the path where batch results will be saved.

Validated by:
  • validate_storage_key

field patch_list: None = None
field save_userdata: bool = False

A flag indicating whether userdata.json should also be one of the results of the batch job.

field skip_existing: Literal[False] = False
field tiff_outputs: List[str] [Optional]

Names of TIFF outputs of a batch job.
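
To show how the schema fields fit together, here is a sketch of a configuration dictionary covering every field marked [Required]. All values are illustrative placeholders (file names, storage key, role ARN, output name), the contents of area are omitted because CustomGridAreaManager.Schema is not described here, and a real eo-grow configuration may need further keys that this section does not list.

```python
REQUIRED_FIELDS = {
    "area",
    "evalscript_path",
    "grid",
    "iam_role_arn",
    "inputs",
    "output_folder_key",
}

pipeline_config = {
    "area": {},  # placeholder: fields of CustomGridAreaManager.Schema go here
    "evalscript_path": "evalscript.js",  # hypothetical file name
    "evalscript_folder_key": "input_data",  # documented default
    "output_folder_key": "data",  # hypothetical storage manager key
    "iam_role_arn": "arn:aws:iam::123456789012:role/example-role",  # placeholder
    "grid": {
        "geometry_filename": "aoi.geojson",  # hypothetical file name
        "bbox_size": [10000, 10000],
        "resolution": 10,
    },
    "inputs": [
        {
            "data_collection": "SENTINEL2_L1C",  # assumed string form
            "maxcc": 0.3,
            "time_period": ["2020-01-01", "2020-12-31"],  # assumed ISO date strings
        }
    ],
    "tiff_outputs": ["bands"],  # hypothetical output name
    "monitoring_sleep_time": 120,  # documented default, minimum 60
    "monitoring_analysis_sleep_time": 10,  # documented default, minimum 5
    "analysis_only": True,  # stop after the analysis phase
}

# Sanity check: every field documented as [Required] is present.
missing_fields = sorted(REQUIRED_FIELDS - set(pipeline_config))
print(missing_fields)
```

Setting analysis_only to True is a cautious first run: per the field description, the batch request is created and analysed, but the actual batch job is not started.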

config: Schema
area_manager: CustomGridAreaManager
run_procedure()[source]

Procedure that uses Sentinel Hub batch service to download data to an S3 bucket.

Return type:

tuple[list[str], list[str]]