Data Spaces and Validation of port connections
The main goals when defining the input and output data of Bricks are reusability and simplicity. We need to understand how data is defined in order to optimize both and to derive guidelines for future Bricks.
Data Space of in-ports and out-ports
Using UJO Schema definitions to define valid data for input and output ports leads to a significant problem regarding the reusability of Bricks.
Let's define a filter Brick with input data defined as
input = < variant -> variant >;
Any possible object or map is accepted and can be processed. The filter rule is configurable: one parameter selects the field and another defines the check to be applied. This makes a general-purpose filter Brick.
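A minimal sketch of such a configurable filter, written in Python purely for illustration. The names make_filter and CHECKS, the set of available checks, and the extra reference value parameter are assumptions and are not taken from the Brick API.

```python
import operator

# Hypothetical set of configurable checks; the names are illustrative only.
CHECKS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}

def make_filter(field, check, reference):
    """Build a filter from a field selector, a check name and a reference value."""
    compare = CHECKS[check]

    def brick(package):
        # The package is treated as an opaque map: only the selected field is read.
        if field in package and compare(package[field], reference):
            return package  # passed on unchanged
        return None         # filtered out

    return brick

# Example configuration: keep packages whose "speed" exceeds 100.0.
fast = make_filter("speed", "gt", 100.0)
```

Because the filter only reads one field and never modifies the package, it works for any map, which is why its ports carry the most general type.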
The Brick neither knows the content and structure of a Flow Package nor does it change anything. The original Flow Package is simply passed to the next Brick. The output is consequently defined as
output = < variant -> variant >;
Herein lies the problem.
Let's suppose the next Brick requires data in the form of
input = {
"timestamp" -> timestamp,
"id" -> string,
"speed" -> float32,
"power" -> float32
};
Let's call this type B.
Type A is the input and output type of our filter.
Using our approach of a directed match, we get a negative result and cannot connect the ports, even though the runtime data might match the input of the next Brick.
A matches B = False
The match function depends on the order of its two operands. In our case, the result changes if we swap the operands.
B matches A = True
The terminology does not make clear why we get this result. The term "matches" is somewhat misleading. We may use "is contained in" to be more precise: X is contained in Y if every value that is valid for X is also valid for Y.
A is contained in B = False
B is contained in A = True
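The directed nature of this check can be sketched in Python. The type model below is a deliberate simplification and an assumption: VARIANT_MAP stands for < variant -> variant >, records are plain dicts of field name to type name, and is_contained_in is a hypothetical helper, not the UJO Schema implementation.

```python
# Hypothetical, simplified type model (not the UJO Schema implementation).
VARIANT_MAP = "variant_map"  # stands for < variant -> variant >

def is_contained_in(inner, outer):
    """Return True if every value valid for `inner` is also valid for `outer`."""
    if outer == VARIANT_MAP:
        # Any map or record fits into a map of variants.
        return True
    if inner == VARIANT_MAP:
        # An unconstrained map may hold values that a record would reject.
        return False
    # Both are records: require identical fields and types (assumed, simplified rule).
    return inner == outer

A = VARIANT_MAP
B = {"timestamp": "timestamp", "id": "string", "speed": "float32", "power": "float32"}

print(is_contained_in(A, B))  # False
print(is_contained_in(B, A))  # True
```

Swapping the operands changes the result, which mirrors the two lines above.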
Because B is contained in A, we can conclude that in this case the runtime data may match the input port definition of the next Brick, although this is not guaranteed.
I propose to inform users about this situation and ask them to decide whether to connect the ports. The system can then check incoming data at runtime to prevent malformed Flow Packages from passing.
This increases reusability and reduces headaches when defining ports.
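A runtime check of this kind could look roughly like the following Python sketch. TYPE_B mirrors the record definition of type B; the dict representation, the TYPE_CHECKS mapping and the conforms helper are assumptions for illustration, and the float32 check does not verify range or precision.

```python
from datetime import datetime

# Type B from the text, written as a plain dict (hypothetical representation).
TYPE_B = {"timestamp": "timestamp", "id": "string", "speed": "float32", "power": "float32"}

# Assumed mapping from schema type names to Python-level checks.
# float32 range and precision are not verified in this sketch.
TYPE_CHECKS = {
    "timestamp": lambda v: isinstance(v, datetime),
    "string": lambda v: isinstance(v, str),
    "float32": lambda v: isinstance(v, float),
}

def conforms(package, record_type):
    """Return True if the package has exactly the record's fields with matching types."""
    if not isinstance(package, dict) or set(package) != set(record_type):
        return False
    return all(TYPE_CHECKS[t](package[name]) for name, t in record_type.items())

good = {"timestamp": datetime(2024, 1, 1), "id": "m1", "speed": 12.5, "power": 3.0}
bad = {"id": "m1", "speed": "fast"}

print(conforms(good, TYPE_B))  # True
print(conforms(bad, TYPE_B))   # False
```

A package that fails the check would be rejected at the input port instead of being handed to the next Brick.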
The False/False case is still excluded, and runtime checks are only required for the False/True case.
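The resulting connection rule can be summarized in a short Python sketch. The Decision enum and the decide function are hypothetical names. The first argument is the result of "output type is contained in input type", the second is the result with the operands swapped. Connecting directly whenever the first result is True is an assumption inferred from the statement that runtime checks are only needed for False/True.

```python
from enum import Enum

class Decision(Enum):
    CONNECT = "connect without runtime check"
    ASK_USER = "ask the user, then check at runtime"
    REJECT = "do not connect"

def decide(out_in_in: bool, in_in_out: bool) -> Decision:
    """Map the two directed containment results to a connection decision."""
    if out_in_in:
        # Every output value is valid input (assumed to need no runtime check).
        return Decision.CONNECT
    if in_in_out:
        # False/True: runtime data may match, so the user decides.
        return Decision.ASK_USER
    # False/False: the types are unrelated.
    return Decision.REJECT

# Filter output (type A) connected to the next Brick's input (type B):
print(decide(False, True))  # Decision.ASK_USER
```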