eo-grow Tutorial

The main objects in the eo-grow package are structured as follows:

[Figure: eo-grow flowchart]

Let’s take a closer look at each of them.

Config

Most objects in eo-grow are configured with a pydantic model. The class of the configuration model is attached to the eo-grow object as a Schema class (e.g. Pipeline.Schema).

These configuration objects are great for parsing and validating, but the package also provides some utility methods for working with dictionaries:

- from_raw_config creates the object from a dictionary, e.g. Pipeline.from_raw_config(params_dict).
- from_path reads the configuration from a .json file and creates the appropriate object.

[1]:
from eogrow.core.base import EOGrowObject


class TestObject(EOGrowObject):
    class Schema(EOGrowObject.Schema):
        test_param: int
        test_string: str = "default"


test_config = {"test_param": 3}
test_object = TestObject.from_raw_config(test_config)
test_object.config
[1]:
Schema(test_param=3, test_string='default')
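
The pattern of a nested Schema class plus the two constructors can be imitated with the standard library alone. The sketch below is not eo-grow's implementation: SketchObject, its dataclass-based Schema, and the temporary JSON file are hypothetical stand-ins, and unlike the real from_path it performs no config-language interpretation.

```python
import json
import os
import tempfile
from dataclasses import dataclass


class SketchObject:
    """Hypothetical stand-in for an eo-grow object; not part of eo-grow."""

    @dataclass
    class Schema:
        test_param: int
        test_string: str = "default"

    def __init__(self, config):
        self.config = config

    @classmethod
    def from_raw_config(cls, raw_config):
        # Build the schema from a plain dictionary.
        # A missing required field raises TypeError here.
        return cls(cls.Schema(**raw_config))

    @classmethod
    def from_path(cls, path):
        # Read a JSON file and reuse the dictionary-based constructor.
        with open(path, encoding="utf-8") as file:
            return cls.from_raw_config(json.load(file))


# Write a tiny config file to a temporary folder and load it back.
with tempfile.TemporaryDirectory() as folder:
    config_path = os.path.join(folder, "sketch_config.json")
    with open(config_path, "w", encoding="utf-8") as file:
        json.dump({"test_param": 3}, file)
    sketch_object = SketchObject.from_path(config_path)

print(sketch_object.config)
```

Note that a dataclass only checks that the required fields are present; it does not validate or coerce types the way a pydantic model does, which is one reason the package relies on pydantic.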

The package also provides a small config language for its JSON config files. It supports variables, combining configs, relative paths, and more.

from_path interprets this language automatically, whereas from_raw_config does not. An interpreted config dictionary can also be loaded with the function interpret_config_from_path. This dictionary can then be adjusted and passed to the appropriate object.

[2]:
import os

from eogrow.core.config import interpret_config_from_path

CONFIG_FOLDER = os.path.join("..", "tests", "test_config_files", "other")
CONFIG_FILE = os.path.join(CONFIG_FOLDER, "simple_config.json")

config = interpret_config_from_path(CONFIG_FILE)
config
[2]:
{'pipeline': 'SimplePipeline',
 'test_param': 10,
 'test_subset': [0, 'eopatch-id-1-col-0-row-1'],
 'workers': 3,
 'logging': {'save_logs': True,
  'show_logs': True,
  'capture_warnings': True,
  'manager': 'eogrow.core.logging.LoggingManager'},
 'area': {'manager': 'eogrow.core.area.UtmZoneAreaManager',
  'area': {'filename': 'test_area.geojson', 'buffer': 0.001},
  'patch': {'size_x': 2400, 'size_y': 1100, 'buffer_x': 120, 'buffer_y': 55},
  'offset_x': 0,
  'offset_y': 0},
 'storage': {'manager': 'eogrow.core.storage.StorageManager',
  'project_folder': '/home/zluksic/Documents/Projects/eo-grow/tests/test_project',
  'structure': {'data': 'data',
   'batch_data': 'batch-data',
   'data_2019': 'data-2019',
   'data_custom_range': 'data-custom-range',
   'data_sampled': 'data-sampled',
   'features': 'features',
   'features_sampled': 'features-sampled',
   'training_data': 'training_data',
   'reference': 'reference',
   'models': 'models',
   'predictions': 'predictions',
   'predictions_to_map': 'predictions_to_map',
   'maps': 'maps',
   'temp': 'temp'}}}
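
Adjusting an interpreted config before passing it on is ordinary dictionary manipulation. The sketch below uses a trimmed, hand-written copy of the dictionary above (base_config and adjusted_config are illustrative names, not eo-grow API) and makes a deep copy so that nested sections such as logging are not shared with the original.

```python
import copy

# Trimmed copy of the interpreted config shown above.
base_config = {
    "pipeline": "SimplePipeline",
    "test_param": 10,
    "workers": 3,
    "logging": {"save_logs": True, "show_logs": True},
}

# Deep-copy so that nested dictionaries are not shared with the original.
adjusted_config = copy.deepcopy(base_config)
adjusted_config["test_param"] = 20
adjusted_config["logging"]["show_logs"] = False

# The original stays untouched, the copy carries the overrides.
print(base_config["test_param"], adjusted_config["test_param"])
print(base_config["logging"]["show_logs"], adjusted_config["logging"]["show_logs"])
```

The adjusted dictionary would then be handed to from_raw_config of the appropriate class, in the same way the full config is passed to the pipeline later in this tutorial.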

StorageManager

The object in charge of the folder structure of the data. It contains a definition of the entire folder structure.

[3]:
from eogrow.core.storage import StorageManager

storage = StorageManager.from_raw_config(config["storage"])

storage
[3]:
<eogrow.core.storage.StorageManager at 0x7f2be29e50d0>

The following folders are always defined in the folder structure:

[4]:
print(storage.get_input_data_folder())
print(storage.get_cache_folder())
print(storage.get_logs_folder())
input-data
cache
logs

All other folders are user-defined under the structure key of the storage config and are retrieved by their key:

[5]:
storage.get_folder("data", full_path=True)
[5]:
'/home/zluksic/Documents/Projects/eo-grow/tests/test_project/data'
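
The output above is consistent with a simple lookup: the folder key is mapped to a relative path through the structure dictionary, and the project folder is prepended when a full path is requested. The sketch below illustrates that idea only; resolve_folder, its default for full_path, and the project folder value are hypothetical and do not reflect how StorageManager is implemented.

```python
import os

# Hypothetical values; the keys mirror the 'storage' part of the config above.
project_folder = os.path.join("path", "to", "test_project")
structure = {"data": "data", "batch_data": "batch-data", "data_2019": "data-2019"}


def resolve_folder(key, full_path=False):
    """Look up a folder by its key and optionally prepend the project folder."""
    relative = structure[key]
    return os.path.join(project_folder, relative) if full_path else relative


print(resolve_folder("batch_data"))
print(resolve_folder("batch_data", full_path=True))
```

Keeping keys separate from folder names means a pipeline can refer to "batch_data" while the actual folder on disk is named batch-data.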

AreaManager

The object in charge of splitting and managing the area of interest (AOI).

[6]:
from eogrow.core.area import UtmZoneAreaManager

area_manager = UtmZoneAreaManager.from_raw_config(config["area"], storage)
[7]:
geometry = area_manager.get_area_geometry()

geometry.geometry
[7]:
[Figure: rendered geometry of the area of interest]
[9]:
from sentinelhub import CRS

grid = area_manager.get_grid()
grid[CRS(32638)]
[9]:
               eopatch_name                                           geometry
0  eopatch-id-0-col-0-row-0  POLYGON ((729480.000 4390045.000, 729480.000 4...
1  eopatch-id-1-col-0-row-1  POLYGON ((729480.000 4391145.000, 729480.000 4...
[10]:
area_manager.get_patch_list()
[10]:
[('eopatch-id-0-col-0-row-0',
  BBox(((729480.0, 4390045.0), (732120.0, 4391255.0)), crs=CRS('32638'))),
 ('eopatch-id-1-col-0-row-1',
  BBox(((729480.0, 4391145.0), (732120.0, 4392355.0)), crs=CRS('32638')))]
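
The bounding boxes above can be reproduced from the patch settings in the config (size_x, size_y, buffer_x, buffer_y), assuming the buffer is added on every side of each patch. The sketch below is plain arithmetic rather than eo-grow code, and the unbuffered grid origin is an assumption inferred from the printed coordinates.

```python
# Values from the 'patch' part of the config above.
SIZE_X, SIZE_Y = 2400, 1100
BUFFER_X, BUFFER_Y = 120, 55

# Assumption: unbuffered grid origin inferred from the printed bounding boxes.
ORIGIN_X, ORIGIN_Y = 729600, 4390100


def buffered_bbox(col, row):
    """Return (min_x, min_y, max_x, max_y) of a patch, buffer included."""
    min_x = ORIGIN_X + col * SIZE_X - BUFFER_X
    min_y = ORIGIN_Y + row * SIZE_Y - BUFFER_Y
    max_x = ORIGIN_X + (col + 1) * SIZE_X + BUFFER_X
    max_y = ORIGIN_Y + (row + 1) * SIZE_Y + BUFFER_Y
    return (min_x, min_y, max_x, max_y)


first = buffered_bbox(col=0, row=0)
second = buffered_bbox(col=0, row=1)

# Vertical overlap between the two neighbouring patches.
overlap_y = first[3] - second[1]
print(first)
print(second)
print(overlap_y)
```

Each buffered patch is therefore 2640 by 1210 units in CRS 32638, and vertically adjacent patches overlap by twice buffer_y.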

Pipeline

The main object in the package is Pipeline. It contains a schema for config parameters and a data-processing procedure.

[11]:
from pydantic import Field

from eogrow.core.pipeline import Pipeline


class SimplePipeline(Pipeline):
    class Schema(Pipeline.Schema):
        test_param: int = Field(..., description="Some integer")

    def run_procedure(self):
        # Implement the processing here, then return two lists of EOPatch names:
        # those processed successfully and those that failed.

        return [], []


pipeline = SimplePipeline.from_raw_config(config)

pipeline.run()
INFO eogrow.core.pipeline:225: Running SimplePipeline
INFO eogrow.core.pipeline:237: Pipeline finished successfully!
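
The return value of run_procedure, one list of successfully processed EOPatches and one of failed ones, can be illustrated without eo-grow. The sketch below is a plain-Python stand-in: run_procedure_sketch and fake_processing are hypothetical names, and a real pipeline would do actual processing instead of the simulated failure.

```python
def run_procedure_sketch(patch_names, process_patch):
    """Process each patch and sort the names into finished and failed lists."""
    finished, failed = [], []
    for name in patch_names:
        try:
            process_patch(name)
        except Exception:
            # Record the failure and continue with the remaining patches.
            failed.append(name)
        else:
            finished.append(name)
    return finished, failed


def fake_processing(name):
    # Hypothetical processing step that fails for one patch.
    if name.endswith("row-1"):
        raise ValueError(f"cannot process {name}")


finished, failed = run_procedure_sketch(
    ["eopatch-id-0-col-0-row-0", "eopatch-id-1-col-0-row-1"], fake_processing
)
print(finished)
print(failed)
```

Returning both lists, rather than raising on the first error, lets a run complete for the patches that work while still reporting which ones need attention.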