
Data Spaces and Validation of port connections

The main goals when defining the input and output data of Bricks are reusability and simplicity. To optimize both and to derive guidelines for future Bricks, we need to understand how data is defined.

Data Space of in-ports and out-ports

Using UJO Schema definitions to define the valid data for input and output ports leads to a significant problem regarding the reusability of Bricks.

Let's define a filter Brick with input data defined as

input = < variant -> variant >;

Any object or map is accepted and can be processed. The filter rule is configurable: one parameter selects the field and another defines the check to apply. This makes it a general-purpose filter Brick.
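As a rough illustration of such a configurable filter, here is a minimal Python sketch. The names `make_filter` and `CHECKS`, the set of available checks, and the extra `reference` value are assumptions made for this example; they are not part of UJO or of any existing Brick API. A Flow Package is modeled as a plain dictionary.

```python
import operator
from typing import Any, Callable, Optional

# Hypothetical set of configurable checks (an assumption for this sketch).
CHECKS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}


def make_filter(field: str, check: str, reference: Any) -> Callable[[dict], Optional[dict]]:
    """Build a filter from a field name, a check name and a reference value."""
    compare = CHECKS[check]

    def filter_brick(package: dict) -> Optional[dict]:
        # The filter only looks at the selected field and never modifies the package.
        if field in package and compare(package[field], reference):
            return package  # passed on unchanged
        return None  # filtered out

    return filter_brick


speed_filter = make_filter("speed", "gt", 10.0)
package = {"id": "m1", "speed": 12.5}
result = speed_filter(package)  # the very same dictionary object, untouched
```

The filter works on any map, which is why its ports can only be described by the most general type.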

The Brick neither knows the content and structure of a Flow Package nor does it change anything. The original Flow Package is simply passed to the next Brick. The output is consequently defined as

output = < variant -> variant >;

Herein lies the problem.

Let's suppose the next Brick requires data in the form of

input = {
    "timestamp" -> timestamp,
    "id"        -> string,
    "speed"     -> float32,
    "power"     -> float32
};

Let's call this type B, and let type A be the input and output type of our filter.

With our directed match approach, the result is negative and the ports cannot be connected, even though the runtime data might match the input of the next Brick.

A matches B = False

The match function depends on the order of its two operands. In our case, the result changes if we exchange the operands.

B matches A = True

The terminology does not make clear why we get this result; the term matches is somewhat misleading. The term is contained in is more precise.

A is contained in B = False
B is contained in A = True
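The direction dependence can be illustrated with a small Python sketch. The classes `VariantMap` and `Record` and the function `is_contained_in` are hypothetical stand-ins invented for this example, not the real UJO Schema implementation, and record-to-record containment is simplified to equality of the field definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantMap:
    """Models < variant -> variant >: any map is valid."""


@dataclass(frozen=True)
class Record:
    """Models a map with fixed field names and type names."""
    fields: tuple  # tuple of (name, type_name) pairs


def is_contained_in(inner, outer) -> bool:
    """True if every value valid for `inner` is also valid for `outer`."""
    if isinstance(outer, VariantMap):
        return True  # every map fits into the variant map
    if isinstance(inner, VariantMap):
        return False  # an arbitrary map is not guaranteed to fit a fixed record
    # Simplification: two records are only compared for identical fields.
    return dict(inner.fields) == dict(outer.fields)


type_a = VariantMap()
type_b = Record((
    ("timestamp", "timestamp"),
    ("id", "string"),
    ("speed", "float32"),
    ("power", "float32"),
))

print(is_contained_in(type_a, type_b))  # False
print(is_contained_in(type_b, type_a))  # True
```

Reading the check as a subset relation between data spaces makes the asymmetry obvious: every B value is an A value, but not the other way round.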

Because B is contained in A, some of the data the filter passes on is also valid for B. We can conclude that in this case the runtime data may match the input port definition of the next Brick.

I propose informing users about this situation and asking them to decide whether to connect the ports. The system can then check incoming data at runtime to prevent malformed Flow Packages from passing.
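A runtime check of this kind could look roughly like the following Python sketch. The names `TYPE_CHECKS`, `INPUT_B` and `validate_package` are hypothetical, the mapping from schema type names to Python types is a simplification, and the sketch assumes that the record is closed (extra fields are rejected), which the schema text above does not state.

```python
from datetime import datetime
from typing import List

# Hypothetical mapping from schema type names to Python checks (simplified).
TYPE_CHECKS = {
    "timestamp": lambda v: isinstance(v, datetime),
    "string": lambda v: isinstance(v, str),
    "float32": lambda v: isinstance(v, float),
}

# Input definition of the next Brick (type B), written as name -> type name.
INPUT_B = {
    "timestamp": "timestamp",
    "id": "string",
    "speed": "float32",
    "power": "float32",
}


def validate_package(package: dict, schema: dict) -> List[str]:
    """Return a list of problems; an empty list means the package is valid."""
    problems = []
    for name, type_name in schema.items():
        if name not in package:
            problems.append(f"missing field: {name}")
        elif not TYPE_CHECKS[type_name](package[name]):
            problems.append(f"wrong type for field: {name}")
    # Assumption: fields that are not in the schema make the package invalid.
    for name in package:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems


good = {"timestamp": datetime(2024, 1, 1), "id": "m1", "speed": 12.5, "power": 3.0}
bad = {"id": 42, "speed": 12.5}

good_problems = validate_package(good, INPUT_B)  # empty list
bad_problems = validate_package(bad, INPUT_B)    # missing fields and a wrong type
```

Returning a list of problems instead of a single boolean lets the system report why a Flow Package was rejected.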

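This increases reusability and reduces the headaches of defining ports. The False/False case is still excluded, and runtime checks are only required for the False/True case, where the output type is not contained in the input type but the input type is contained in the output type.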
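The resulting connection rule can be summarized as a small decision function. This is only a sketch of the proposal; the names `Decision` and `decide_connection` are made up for this example, and it assumes that a connection is always safe when the output type is contained in the input type.

```python
from enum import Enum


class Decision(Enum):
    CONNECT = "connect without runtime check"
    ASK_USER = "ask the user, then check at runtime"
    REJECT = "do not connect"


def decide_connection(out_in_in: bool, in_in_out: bool) -> Decision:
    """Decide how to treat a connection between an out-port and an in-port.

    out_in_in: the output type is contained in the input type.
    in_in_out: the input type is contained in the output type.
    """
    if out_in_in:
        return Decision.CONNECT  # every possible output is valid input
    if in_in_out:
        return Decision.ASK_USER  # the False/True case: data may match at runtime
    return Decision.REJECT  # the False/False case stays excluded


# Filter output (type A) connected to the next Brick's input (type B):
decision = decide_connection(out_in_in=False, in_in_out=True)
```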