ASCII file with 2 levels and optional delimiters

I am trying to parse an ASCII file (example below) that has 2 levels of grouping. At the first level are headers like ‘HEADING’, ‘TITLE’, ‘PARA’ and ‘FOOTNOTE’, and within each of these groups there are sub-headers like ‘author’, ‘date’ etc.
Each record always starts with ‘HEADING’, and while the sequence of these headers is fixed, any of them may or may not be present {so after HEADING, TITLE could be missing and the next heading would then be PARA}.


HEADING
data

TITLE
anchor data
file data

PARA
author data
date data
file data

FOOTNOTE
citation data

HEADING
data

TITLE
anchor data
file data
...
...

Here is how I am processing this
Step1: <> Reads the file and creates records with each first level header’s content as separate fields. Since each record always starts with ‘HEADING’ I am using that as record level delimiter (and other headings like FOOTNOTE as field level delimiter)

Step2:<> feeds to multiple <> to extract level2 data from each input record {So here one of the readers will break down, say, field ‘TITLE’ into a record containing fields ‘anchor’ and ‘file’; another reader will break down field ‘PARA’ into a record containing ‘author’, ‘date’ and ‘file’}

Step3: Each <> writes data to appropriate table using <>

Now I have 2 questions
Some of the headings like ‘FOOTNOTE’ may or may not be present. When they are not present, the processing fails with the error ‘too few records’. In the metadata, I tried setting ‘nullable’ to true for such fields, but the error persisted. How do I tell Clover metadata that a field is optional?

My 2nd level Universal Readers have input set to something like this

port:$0.field2:discrete 

but in each of these readers I also want a portion of

port:$0.field1

(which is the unique identifier). This is to ensure that each second level record will have a unique ID identifying the parent record.

Maybe my whole approach is incorrect; would you be able to suggest a better one?
CloverETL Designer Community Version: 3.5.0.058

Hi,

We have a special component, ComplexDataReader, for such non-homogeneous data in the commercial Designer. It serves exactly this purpose.

http://doc.cloveretl.com/documentation/ … eader.html

It would be quite difficult and inconvenient to emulate functionality of this component with UniversalDataReaders and Reformats. I would read the input file using metadata with one string field and new line character as a record delimiter and then parse the lines with some set of rules in Reformat. You will probably also need a few global variables to store headers you already read.
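
To illustrate the line-by-line idea, here is a rough sketch of the parsing rules. It is written in Python rather than CTL, purely to show the logic; the function name parse_lines, the record_id counter and the "value" key are made up for this example, and it assumes each second-level line has the form "subheader data" as in the sample above. The two variables record_id and current play the role of the global variables mentioned above.

```python
from collections import defaultdict

# Assumed first-level headers, taken from the sample in the question.
LEVEL1_HEADERS = {"HEADING", "TITLE", "PARA", "FOOTNOTE"}


def parse_lines(lines):
    """Return {header: [row, ...]}; every row carries the parent record id."""
    tables = defaultdict(list)
    record_id = 0   # state: id of the current HEADING record
    current = None  # state: the last first-level header seen
    row = None      # child row being filled for the current header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines only separate groups
        if line in LEVEL1_HEADERS:
            if line == "HEADING":
                record_id += 1  # a new top-level record starts here
            current = line
            row = {"record_id": record_id}
            tables[current].append(row)
        elif current is not None:
            # "author data" -> key "author", value "data".
            # A bare line (as under HEADING) is stored under "value".
            key, sep, value = line.partition(" ")
            if sep:
                row[key] = value
            else:
                row["value"] = key
    return dict(tables)
```

With this kind of rule set, a missing header such as FOOTNOTE simply produces no row for that record, and each second-level row already holds the id of its parent record, so the rows can be routed to separate outputs per header.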

Regards,