eogrow.pipelines.download_batch

Download pipeline that works with Sentinel Hub batch service.

pydantic model eogrow.pipelines.download_batch.InputDataSchema[source]

Bases: Schema

Parameter structure for a single data collection used in a batch request.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
field data_collection: DataCollection [Required]

Data collection from which data will be downloaded. See utils.validators.parse_data_collection for more info on input options.

Validated by:
field maxcc: float | None = None

Maximum cloud coverage filter, given as a fraction between 0 and 1.

Constraints:
  • minimum = 0

  • maximum = 1

field mosaicking_order: MosaickingOrder | None = None

The mosaicking order used by the Sentinel Hub service.

field other_params: dict [Optional]

Additional parameters passed to the SentinelHubRequest.input_data method as the other_args parameter.

field resampling_type: ResamplingType = ResamplingType.NEAREST

The resampling type used by the Sentinel Hub service for downsampling and upsampling.

field time_period: Tuple[date, date] | None = None
Validated by:
  • optional_parse_time_period
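
As a rough illustration of the fields above, a single entry of `inputs` could be written as a plain dictionary. This is only a sketch: the string values used for `data_collection`, `mosaicking_order`, and `resampling_type`, and the date-string form of `time_period`, are assumptions about what the validators accept, and `check_input_entry` is a hypothetical helper (not part of eo-grow) that mirrors only the documented required field and `maxcc` bounds.

```python
# Hypothetical example of one `inputs` entry; all values are illustrative assumptions.
input_entry = {
    "data_collection": "SENTINEL2_L1C",  # assumed to be accepted as a collection name
    "time_period": ["2021-01-01", "2021-03-31"],  # assumed date-string form
    "maxcc": 0.3,  # fraction, documented range is 0 to 1
    "mosaicking_order": "mostRecent",  # assumed string form
    "resampling_type": "NEAREST",  # the documented default
    "other_params": {},
}


def check_input_entry(entry):
    """Hypothetical helper mirroring the documented constraints; not part of eo-grow."""
    if "data_collection" not in entry:
        raise ValueError("data_collection is required")
    maxcc = entry.get("maxcc")
    if maxcc is not None and not 0 <= maxcc <= 1:
        raise ValueError("maxcc must be between 0 and 1")
    return entry


check_input_entry(input_entry)
```

The real validation is done by the pydantic model itself; the helper only makes the documented bounds concrete.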

class eogrow.pipelines.download_batch.BatchDownloadPipeline(*args, **kwargs)[source]

Bases: Pipeline

Pipeline to start and monitor a Sentinel Hub batch job.

Parameters:
  • config – A dictionary with configuration parameters

  • raw_config – The configuration parameters pre-validation, for logging purposes only

  • args (Any) –

  • kwargs (Any) –

pydantic model Schema[source]

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:
  • analysis_only (bool)

  • area (eogrow.core.area.batch.BatchAreaManager.Schema)

  • batch_id (str)

  • batch_output_kwargs (dict)

  • evalscript_folder_key (str)

  • evalscript_path (str)

  • input_patch_file (None)

  • inputs (List[eogrow.pipelines.download_batch.InputDataSchema])

  • monitoring_analysis_sleep_time (int)

  • monitoring_sleep_time (int)

  • num_retries (int)

  • output_folder_key (str)

  • patch_list (None)

  • save_userdata (bool)

  • skip_existing (Literal[False])

  • tiff_outputs (List[str])

field analysis_only: bool = False

If set to True, the pipeline only creates a batch request and waits for the analysis phase to finish. It does not start the actual batch job.

field area: BatchAreaManager.Schema [Required]
Validated by:
field batch_id: str = ''

The ID of a batch job for this pipeline. If it is given, the pipeline only monitors the existing batch job. If it is not given, the pipeline creates a new batch job.

field batch_output_kwargs: dict [Optional]

Any other arguments to be added to a dictionary of parameters. Passed as **kwargs to the output method of SentinelHubBatch during the creation process.

field evalscript_folder_key: str = 'input_data'

Storage manager key pointing to the path where the evalscript is loaded from.

Validated by:
  • validate_storage_key

field evalscript_path: str [Required]
field input_patch_file: None = None
field inputs: List[InputDataSchema] [Required]
field monitoring_analysis_sleep_time: int = 10

Number of seconds to sleep between two consecutive queries about the status of the batch job analysis phase. It must be at least 5 seconds.

Constraints:
  • minimum = 5

field monitoring_sleep_time: int = 120

Number of seconds to sleep between two consecutive queries about the status of tiles in a batch job. It must be at least 60 seconds.

Constraints:
  • minimum = 60
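
The role of a monitoring sleep time can be sketched as a simple polling loop. This is not the pipeline's actual implementation: `poll_until_finished`, the `get_status` callable, and the set of terminal status names are hypothetical, and the `sleep` argument exists only so the loop can be exercised without waiting.

```python
import time

# Assumed terminal status names; used here for illustration only.
TERMINAL_STATUSES = {"DONE", "PARTIAL", "FAILED", "CANCELED"}


def poll_until_finished(get_status, sleep_time=120, sleep=time.sleep):
    """Hypothetical polling loop: query the status, sleep, and repeat until finished."""
    if sleep_time < 60:
        # Mirrors the documented minimum of monitoring_sleep_time.
        raise ValueError("monitoring_sleep_time must be at least 60 seconds")
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(sleep_time)
```

A caller would pass a function that queries the batch service for the current status; a longer sleep time means fewer queries at the cost of noticing completion later.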

field num_retries: int = 0

How many times to retry the batch job if the resulting status is PARTIAL.

Constraints:
  • minimum = 0
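
One way to read `num_retries` is as a bound on how often a job is run again after ending with status PARTIAL. The sketch below shows that reading under stated assumptions: `run_with_retries` and the `run_job` callable are hypothetical names, and the real pipeline may restart jobs differently.

```python
def run_with_retries(run_job, num_retries=0):
    """Hypothetical retry loop: rerun the job while it ends as PARTIAL and retries remain."""
    if num_retries < 0:
        # Mirrors the documented minimum of num_retries.
        raise ValueError("num_retries must not be negative")
    status = run_job()
    retries_left = num_retries
    while status == "PARTIAL" and retries_left > 0:
        status = run_job()
        retries_left -= 1
    return status
```

With the default of 0, a PARTIAL result is returned as-is and no retry is attempted.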

field output_folder_key: str [Required]

Storage manager key pointing to the path where batch results will be saved.

Validated by:
  • validate_storage_key

field patch_list: None = None
field save_userdata: bool = False

A flag indicating if userdata.json should also be one of the results of the batch job.

field skip_existing: Literal[False] = False
field tiff_outputs: List[str] [Optional]

Names of the TIFF outputs of a batch job.
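
Putting the fields together, a configuration for this pipeline might look roughly like the dictionary below. It is a sketch under assumptions: the file name, storage key, output name, and collection name are invented for illustration, the contents of `area` are left empty because BatchAreaManager.Schema is not described here, and `check_pipeline_config` is a hypothetical helper that mirrors only the required fields and minimum constraints listed above.

```python
# Hypothetical configuration; every concrete value is an illustrative assumption.
pipeline_config = {
    "area": {},  # BatchAreaManager.Schema parameters, not described in this section
    "inputs": [{"data_collection": "SENTINEL2_L1C"}],  # assumed collection name
    "evalscript_path": "evalscript.js",  # hypothetical file name
    "output_folder_key": "data",  # hypothetical storage manager key
    "tiff_outputs": ["B04"],  # hypothetical output name
    "monitoring_analysis_sleep_time": 10,
    "monitoring_sleep_time": 120,
    "num_retries": 1,
    "save_userdata": False,
    "analysis_only": False,
}

REQUIRED_FIELDS = ("area", "evalscript_path", "inputs", "output_folder_key")
MINIMUMS = {
    "monitoring_analysis_sleep_time": 5,
    "monitoring_sleep_time": 60,
    "num_retries": 0,
}


def check_pipeline_config(config):
    """Hypothetical helper mirroring the documented constraints; not part of eo-grow."""
    missing = [name for name in REQUIRED_FIELDS if name not in config]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    for name, minimum in MINIMUMS.items():
        if name in config and config[name] < minimum:
            raise ValueError(f"{name} must be at least {minimum}")
    return config


check_pipeline_config(pipeline_config)
```

Fields left out of the dictionary, such as `batch_id` and `batch_output_kwargs`, fall back to the defaults listed above.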

config: Schema
area_manager: BatchAreaManager
run_procedure()[source]

Procedure that uses Sentinel Hub batch service to download data to an S3 bucket.

Return type:

tuple[list[str], list[str]]

cache_batch_area_manager_grid(request_id)[source]

Ensures that the area manager caches the batch grid in the storage.

Parameters:

request_id (str) –

Return type:

None