eogrow.pipelines.download_batch

Download pipeline that works with Sentinel Hub batch service.

pydantic model eogrow.pipelines.download_batch.InputDataSchema

Bases: Schema

Parameter structure for a single data collection used in a batch request.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:

data_collection (sentinelhub.data_collections.DataCollection)
maxcc (float | None)
mosaicking_order (sentinelhub.constants.MosaickingOrder | None)
other_params (dict)
resampling_type (sentinelhub.constants.ResamplingType)
time_period (Tuple[datetime.date, datetime.date] | None)

field data_collection: DataCollection [Required]

Data collection from which data will be downloaded. See utils.validators.parse_data_collection for more info on input options.

Validated by:

parse_data_collection

field maxcc: float | None = None

Maximum cloud coverage filter, given as a fraction between 0 and 1.

Constraints:

minimum = 0
maximum = 1

field mosaicking_order: MosaickingOrder | None = None: The mosaicking order used by the Sentinel Hub service.

field other_params: dict [Optional]: Additional parameters to be passed to the SentinelHubRequest.input_data method as the other_args parameter.

field resampling_type: ResamplingType = ResamplingType.NEAREST: The type of downsampling and upsampling used by the Sentinel Hub service.

field time_period: Tuple[date, date] | None = None

Validated by:

optional_parse_time_period
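As an illustration of the fields above, the following is a minimal sketch of one input entry written as a plain dictionary, together with a standard-library check of the documented constraints. It does not call eogrow itself. The collection name given as a string, the ISO date strings for time_period, and the helper check_input_entry are assumptions made for this example; the accepted input formats are defined by the validators named above.

```python
from datetime import date

# Hypothetical input entry; the keys mirror the documented InputDataSchema fields.
input_entry = {
    "data_collection": "SENTINEL2_L1C",  # assumed: collection referenced by name
    "maxcc": 0.3,  # documented constraint: 0 <= maxcc <= 1
    "time_period": ["2021-06-01", "2021-08-31"],  # assumed: ISO date strings
    "other_params": {},
}


def check_input_entry(entry: dict) -> list[str]:
    """Return the problems found when applying the documented constraints."""
    problems = []
    if "data_collection" not in entry:
        problems.append("data_collection is required")
    maxcc = entry.get("maxcc")
    if maxcc is not None and not 0 <= maxcc <= 1:
        problems.append("maxcc must be between 0 and 1")
    period = entry.get("time_period")
    if period is not None:
        if len(period) != 2:
            problems.append("time_period must contain exactly two dates")
        else:
            # Raises ValueError if a value is not a valid ISO date.
            for value in period:
                date.fromisoformat(value)
    return problems


problems = check_input_entry(input_entry)
```

In real use the schema performs this validation itself and raises ValidationError; the helper only makes the documented rules explicit.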

class eogrow.pipelines.download_batch.BatchDownloadPipeline(*args, **kwargs)

Bases: Pipeline

Pipeline to start and monitor a Sentinel Hub batch job.

Parameters:

config – A dictionary with configuration parameters
raw_config – The configuration parameters pre-validation, for logging purposes only
args (Any)
kwargs (Any)

pydantic model Schema

Bases: Schema

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

Fields:

analysis_only (bool)
area (eogrow.core.area.batch.BatchAreaManager.Schema)
batch_id (str)
batch_output_kwargs (dict)
evalscript_folder_key (str)
evalscript_path (str)
input_patch_file (None)
inputs (List[eogrow.pipelines.download_batch.InputDataSchema])
monitoring_analysis_sleep_time (int)
monitoring_sleep_time (int)
num_retries (int)
output_folder_key (str)
patch_list (None)
save_userdata (bool)
skip_existing (Literal[False])
tiff_outputs (List[str])

field analysis_only: bool = False: If set to True, the pipeline only creates a batch request and waits for the analysis phase to finish. It does not start the actual batch job.

field area: BatchAreaManager.Schema [Required]

Validated by:

validate_manager

field batch_id: str = '': The ID of a batch job for this pipeline. If given, the pipeline only monitors the existing batch job. If not given, the pipeline creates a new batch job.

field batch_output_kwargs: dict [Optional]: Any other arguments to be added to the dictionary of output parameters. They are passed as **kwargs to the output method of SentinelHubBatch when the batch request is created.

field evalscript_folder_key: str = 'input_data'

Storage manager key pointing to the path where the evalscript is loaded from.

Validated by:

validate_storage_key

field evalscript_path: str [Required]

field input_patch_file: None = None

field inputs: List[InputDataSchema] [Required]

field monitoring_analysis_sleep_time: int = 10

Number of seconds to sleep between two consecutive queries about the status of a batch job's analysis phase. Must be at least 5 seconds.

Constraints:

minimum = 5

field monitoring_sleep_time: int = 120

Number of seconds to sleep between two consecutive queries about the status of tiles in a batch job. Must be at least 60 seconds.

Constraints:

minimum = 60
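The role of monitoring_sleep_time can be pictured as a polling loop. The sketch below is not the pipeline's implementation: the function wait_until_finished, the status names "PROCESSING" and "DONE", and the injectable sleep argument are assumptions chosen so the example runs without a Sentinel Hub account.

```python
import time
from typing import Callable


def wait_until_finished(
    get_status: Callable[[], str],
    sleep_time: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, int]:
    """Query a status until it stops being PROCESSING; return it with the query count."""
    if sleep_time < 60:
        # Mirrors the documented minimum for monitoring_sleep_time.
        raise ValueError("monitoring_sleep_time must be at least 60 seconds")
    status = get_status()
    queries = 1
    while status == "PROCESSING":  # assumed status name
        sleep(sleep_time)
        status = get_status()
        queries += 1
    return status, queries


# Stand-in status source and a no-op sleep so the example finishes immediately.
statuses = iter(["PROCESSING", "PROCESSING", "DONE"])
result = wait_until_finished(lambda: next(statuses), sleep=lambda seconds: None)
```

Passing the sleep function as an argument keeps the loop testable; with the default time.sleep it would wait sleep_time seconds between queries.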

field num_retries: int = 0

Number of times to retry the batch job if its resulting status is PARTIAL.

Constraints:

minimum = 0
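A minimal sketch of the retry behaviour described for num_retries, assuming a job is represented by a callable that returns its final status. The helper run_with_retries and the "DONE" status name are hypothetical; only the PARTIAL status and the minimum of 0 come from the field description.

```python
from typing import Callable


def run_with_retries(run_job: Callable[[], str], num_retries: int) -> str:
    """Run a job and rerun it up to num_retries times while it ends as PARTIAL."""
    if num_retries < 0:
        # Mirrors the documented minimum for num_retries.
        raise ValueError("num_retries must be at least 0")
    status = run_job()
    attempts_left = num_retries
    while status == "PARTIAL" and attempts_left > 0:
        status = run_job()
        attempts_left -= 1
    return status


# Stand-in job that ends as PARTIAL twice before it is DONE (assumed status name).
outcomes = iter(["PARTIAL", "PARTIAL", "DONE"])
final_status = run_with_retries(lambda: next(outcomes), num_retries=2)
```

With the default num_retries of 0 the job runs exactly once, so a PARTIAL result is returned as is.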

field output_folder_key: str [Required]

Storage manager key pointing to the path where batch results will be saved.

Validated by:

validate_storage_key

field patch_list: None = None

field save_userdata: bool = False: A flag indicating whether userdata.json should also be one of the results of the batch job.

field skip_existing: Literal[False] = False

field tiff_outputs: List[str] [Optional]: Names of the TIFF outputs of a batch job.
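Putting the fields together, a configuration for this pipeline might look like the dictionary below. This is a sketch under stated assumptions: every value is hypothetical, the contents of area are left as a placeholder because BatchAreaManager.Schema is not described here, and check_pipeline_config is an illustrative helper that restates only the required fields and minimums listed above.

```python
# Hypothetical configuration; the keys are the documented schema fields.
pipeline_config = {
    "area": {},  # placeholder: BatchAreaManager.Schema fields are not covered here
    "evalscript_path": "evalscript.js",  # hypothetical file name
    "inputs": [{"data_collection": "SENTINEL2_L1C", "maxcc": 0.3}],  # assumed format
    "output_folder_key": "data",  # hypothetical storage manager key
    "tiff_outputs": ["B04", "B08"],  # hypothetical output names
    "monitoring_sleep_time": 120,
    "num_retries": 1,
}

REQUIRED = ("area", "evalscript_path", "inputs", "output_folder_key")
MINIMUMS = {
    "monitoring_analysis_sleep_time": 5,
    "monitoring_sleep_time": 60,
    "num_retries": 0,
}


def check_pipeline_config(config: dict) -> list[str]:
    """Return the problems found when applying the documented constraints."""
    problems = [f"{name} is required" for name in REQUIRED if name not in config]
    for name, minimum in MINIMUMS.items():
        if name in config and config[name] < minimum:
            problems.append(f"{name} must be at least {minimum}")
    # skip_existing is typed as Literal[False], so no other value is accepted.
    if config.get("skip_existing", False) is not False:
        problems.append("skip_existing must be False")
    return problems


config_problems = check_pipeline_config(pipeline_config)
```

The schema itself enforces these rules when the pipeline is constructed; the helper is only a readable summary of them.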

config: Schema

area_manager: BatchAreaManager

run_procedure()

Procedure that uses Sentinel Hub batch service to download data to an S3 bucket.

Return type: tuple[list[str], list[str]]

cache_batch_area_manager_grid(request_id)

Ensures that the area manager caches the batch grid in storage.

Parameters: request_id (str)
Return type: None