Hello,
I just started to use CloverETL and am having trouble with something conceptually. It doesn’t make a lot of sense to me, so I feel like I must be misunderstanding something about CloverETL philosophically that could impact things downstream.
I understand why an edge immediately after a file reader has instructions on how each record should be interpreted: is it a record with fixed length fields? Delimited? Mixed?
Once that is done, my mental model says the record has been parsed into some internal data structure. If so, why does the metadata for any downstream edge need the concept of “fixed”, “delimited” or “mixed”? Indeed, why do the parameters for those matter anymore? Why are size, delimiter, etc. necessary downstream?
Is my mental model broken? Are the records kept in source form as they pass through? What am I missing?
Thanks,
Brad
Can anyone confirm one way or the other if I’m missing something or I just need to adjust my mental model?
Hello Brad,
Basically you’re right. All this information is needed only when parsing and formatting data, and usually it doesn’t matter what metadata type (fixed/delimited/mixed) and delimiters are defined on edges that are not connected to a Reader or Writer. On such edges only the number of fields and their types are USUALLY important. BUT there are some other components that can use this information, e.g. SystemExecute or the database bulk loaders, which format the incoming data according to the metadata on the input edge.
But, as you wrote, on many edges this information is indeed redundant and is not used. Clover has no abstract metadata type, so it requires this information even where it is not needed, because it doesn’t know what you intend to use the given metadata for. A good example of a case where the metadata type really matters is the following graph:
READER – delimited metadata → SIMPLE_COPY – fixed metadata → WRITER
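In the above graph you convert a delimited file to a fixed-length one without any transformation component: the Reader parses according to the delimited metadata on its output edge, and the Writer formats according to the fixed metadata on its input edge.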
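To make the idea concrete, here is a minimal sketch in plain Python of what that graph does conceptually. This is not CloverETL code or its API; the function names, field names, the `;` delimiter and the field widths are all made up for illustration. The point is only that the record in the middle carries no format, while the metadata on each side decides how it is parsed and how it is written.

```python
def parse_delimited(line, field_names, delimiter=";"):
    """Reader side: split a delimited line into a format-independent record.

    field_names and delimiter stand in for the delimited metadata (hypothetical).
    """
    values = line.rstrip("\n").split(delimiter)
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} fields, got {len(values)}"
        )
    return dict(zip(field_names, values))


def format_fixed(record, field_widths):
    """Writer side: render a record as one fixed-length line.

    field_widths stands in for the fixed-length metadata (hypothetical).
    Values are truncated or right-padded with spaces to the field width.
    """
    parts = []
    for name, width in field_widths.items():
        value = str(record[name])
        parts.append(value[:width].ljust(width))
    return "".join(parts)


# "READER -> SIMPLE_COPY -> WRITER": the record is passed through unchanged,
# only the description of its external form differs on the two sides.
record = parse_delimited("Brad;42;NY", ["name", "age", "state"])
line = format_fixed(record, {"name": 8, "age": 3, "state": 2})
print(repr(line))
```

Nothing between the two calls needs to know about delimiters or widths, which matches the intuition in the question; the format details only come into play at the points where data is parsed or formatted.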