Config language
An important part of the eo-grow framework is its configuration parameters, which are kept separate from the code in the form of JSON files. In addition to the normal JSON syntax, the framework implements a set of language rules that define how configuration parameters should be constructed and joined together.
Language rule | Signature | Description | When it is evaluated | Use cases |
---|---|---|---|---|
Config joins | A dictionary key that starts with | Evaluation replaces the key with keys and values from the referenced config file. The replacement happens recursively. In case of clashes, parameters that already exist in a config have priority. The reason behind | When config is read from a file. | For referencing config files with parameters that are shared between pipelines. This rule aims to reduce config and parameter duplication. |
Path to the config file | A dictionary value containing | The signature is replaced with a path to the current config file. The path is relative to a filesystem and doesn’t end with | When config is read from a file. | Can be used to reference another config file with a path that is relative to the current config location. |
Reference a variable | A dictionary value containing | The signature is replaced with values written in | At a pipeline initialization phase. | This aims to reduce the number of duplicated or correlated config parameters and simplifies config parametrization. |
Comments | | The comments are ignored and removed when config is loaded. | When config is read from a file. | To explain why a parameter is set to a certain value. |
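
The clash rule for config joins can be illustrated with a small sketch. The helper name `join_configs` is hypothetical and is not taken from eo-grow; it only shows that parameters already present in a config take priority over those coming from a referenced config. Merging nested dictionaries key by key is an assumption of this sketch.

```python
from typing import Any, Dict


def join_configs(existing: Dict[str, Any], referenced: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: add parameters from `referenced` to `existing`.

    On a clash, the value that already exists in `existing` is kept.
    """
    joined = dict(referenced)
    for key, value in existing.items():
        if isinstance(value, dict) and isinstance(joined.get(key), dict):
            # Assumption of this sketch: nested dictionaries are merged key by key.
            joined[key] = join_configs(value, joined[key])
        else:
            # The parameter already present in the config wins.
            joined[key] = value
    return joined


# Parameters shared between pipelines (would live in a referenced config file).
shared = {"workers": 1, "logging": {"level": "INFO", "save_logs": True}}
# Parameters of a single pipeline that references the shared config.
pipeline = {"workers": 4, "logging": {"level": "DEBUG"}}

result = join_configs(pipeline, shared)
```

Here `result` keeps `workers` and `logging.level` from the pipeline config and only gains `logging.save_logs` from the shared one.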
According to these rules, there are 2 stages at which rules are applied:
- When config is read from a file. This step is skipped if configuration parameters are passed to a pipeline object as a dictionary in Python.
- At the pipeline initialization phase. If the configuration is passed to a remote instance, this happens on the remote instance.
Additional notes:
- Dictionary keys must always be strings.
- Config language interpretation supports any nested combination of dictionaries and lists.
- Names of variables can only contain letters, numbers and `_`. Don’t use `-`, `.` or any other characters.
- So far, config language is not completely OS-agnostic and it might not support Windows file paths.
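
The naming rule for variables can be checked up front with a short sketch. The function `is_valid_variable_name` is a hypothetical helper, not an eo-grow API, and it assumes that "letters" means ASCII letters.

```python
import re

# Assumption: only ASCII letters, digits and underscores are allowed.
VALID_VARIABLE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_variable_name(name: str) -> bool:
    """Return True if `name` contains only letters, numbers and underscores."""
    return VALID_VARIABLE_NAME.fullmatch(name) is not None


for candidate in ["bucket_name", "bucket-name", "bucket.name"]:
    print(candidate, is_valid_variable_name(candidate))
```

Running such a check before a pipeline is initialized makes it easier to spot a name containing `-` or `.` early.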
Pipeline chains
A typical configuration is a dictionary with pipeline parameters. However, it can also be a list of pipeline-execution dictionaries that specify:
- `pipeline_config`: a configuration for a single pipeline,
- `pipeline_resources` (optional): a dictionary that is passed to `ray.remote` to configure which resources the main pipeline process will request from the cluster (see the Ray documentation for the available options). The pipeline requests 1 CPU by default (and nothing else).
The order of dictionaries defines the consecutive order in which pipelines will be run. Example:
[
  {
    "pipeline_config": {
      "pipeline": "FirstPipeline",
      "param1": "value1",
      ...
    }
  },
  {
    "pipeline_config": {
      "pipeline": "SecondPipeline",
      "param2": "value2",
      ...
    },
    "pipeline_resources": {"num_cpus": 2}
  },
  ...
]
There is currently no functionality to merge multiple pipeline chains, except by manually concatenating their contents into a single file.
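
How a chain might be interpreted can be sketched as follows. This is not eo-grow's implementation: `normalize_chain` and `DEFAULT_RESOURCES` are hypothetical names, and treating a plain dictionary as a one-step chain is a simplification of this sketch. The default of 1 CPU follows the description above.

```python
from typing import Any, Dict, List, Tuple, Union

# Assumption: the documented default of 1 CPU, expressed as a ray.remote-style option.
DEFAULT_RESOURCES: Dict[str, Any] = {"num_cpus": 1}

Config = Union[Dict[str, Any], List[Dict[str, Any]]]


def normalize_chain(config: Config) -> List[Tuple[Dict[str, Any], Dict[str, Any]]]:
    """Hypothetical helper: return (pipeline_config, pipeline_resources) pairs in run order."""
    if isinstance(config, dict):
        # Simplification of this sketch: a single pipeline config is a one-step chain.
        return [(config, dict(DEFAULT_RESOURCES))]

    steps = []
    for index, item in enumerate(config):
        if "pipeline_config" not in item:
            raise ValueError(f"Chain element {index} is missing 'pipeline_config'")
        resources = item.get("pipeline_resources", dict(DEFAULT_RESOURCES))
        steps.append((item["pipeline_config"], resources))
    return steps


chain = [
    {"pipeline_config": {"pipeline": "FirstPipeline", "param1": "value1"}},
    {
        "pipeline_config": {"pipeline": "SecondPipeline", "param2": "value2"},
        "pipeline_resources": {"num_cpus": 2},
    },
]
steps = normalize_chain(chain)
```

The list order is preserved, so `steps[0]` describes the pipeline that would run first.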