Metadata Question

Heya,

We have a series of files that we are running through Clover. We have the metadata set up so that the recordDelimiter=“\n” and the DataReader has dataPolicy=“Controlled”. Every so often, one of the source files does not have a “\n” for the final line, so Clover is dropping this row. Is there a way for me to indicate that the recordDelimiter is either “\n” or end-of-file? I can see from the wiki you can have multiple delimiters (e.g. recordDelimiter=“\n\\|\r\n”), but I do not see this case handled.

Thanks,
Anna

Hello Anna,
Set eofAsDelimiter=“true” on the last field.
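
To illustrate the behaviour being asked for, here is a minimal Python sketch. It is not CloverETL code; `split_records` and its `eof_as_delimiter` parameter are hypothetical names used only to model the idea that end-of-file can terminate the last record when the final “\n” is missing.

```python
from typing import List


def split_records(data: str, delimiter: str = "\n",
                  eof_as_delimiter: bool = False) -> List[str]:
    """Split data into records (hypothetical helper, not the Clover parser)."""
    records = []
    start = 0
    while True:
        end = data.find(delimiter, start)
        if end == -1:
            break
        records.append(data[start:end])
        start = end + len(delimiter)
    tail = data[start:]
    # A strict parser drops an unterminated tail; with EOF treated as a
    # delimiter, a non-empty tail becomes the final record.
    if eof_as_delimiter and tail:
        records.append(tail)
    return records


print(split_records("a\nb\nc"))                         # last row is lost
print(split_records("a\nb\nc", eof_as_delimiter=True))  # last row is kept
```

Note that the sketch skips an empty tail, so a file that already ends with “\n” does not gain an extra empty record.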

Heya,

This almost works the way I want it to. We are parsing the source file(s) with each row as a single field, then breaking the “field” into separate fields. When I have the FMT as:

<?xml version="1.0" encoding="UTF-8"?>

any source file that does not have a “\n” on its final row of data parses to the correct number of rows. BUT a source file that does have a “\n” on its final data row (so the last row is just the end-of-file character) now parses an extra row with one empty field.

Is there any way to configure the FMT so that both cases will parse the correct number of rows?

Thanks,
Anna

Hello Anna,
When I added the missing delimiter for the FIELD_INPUT_ROW_NUM field, CloverETL 2.9.2 read the data properly.

Heya,

It turns out the file I was using as a test was messed up (there was some DOS-to-Linux end-of-line conversion going on). It now works the way I want it to. :)

I did not have to alter the FMT, though. What do you mean by “I added missing delimiter for FIELD_INPUT_ROW_NUM”? We have defined the delimiter in the tag as ‘recordDelimiter=“\n”’. Am I missing something? I am just asking in case something is working that shouldn’t be and could cause an issue later on.

Thanks,
Anna
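
Since the test file turned out to have mixed DOS and Linux line endings, a quick check like the following can help spot that before blaming the metadata. This is a rough sketch; `count_line_endings` is a hypothetical helper and is not part of CloverETL.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR sequences in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {
        "crlf": crlf,
        "lf": lf,
        "cr": cr,
        # False means the final row has no trailing "\n".
        "ends_with_newline": data.endswith(b"\n"),
    }


# In-memory sample with one DOS ending, one Linux ending, no final newline.
print(count_line_endings(b"row1\r\nrow2\nrow3"))
```

For a real file, read it in binary mode (for example `open(path, "rb").read()`, where `path` is your own file) so that Python does not translate the line endings first.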

Hello Anna,
The field has no delimiter, and no default field delimiter is specified either. The checkConfig method reports “Graph configuration is invalid (Field delimiter for the field ‘FIELD_INPUT_ROW_NUM’ in the record element ‘RECORD_ONE_FIELD_RECORD_’ not found!).” I can’t guarantee that a graph with such metadata will always work.

Heya,

Interesting.

I am now testing an upgrade to 2.9.2 with this test case. We run with “-skipcheckconfig” because we are auto-generating our graph and we get a small performance gain by skipping that check. The FMT I provided runs just fine with “-skipcheckconfig” on.

Omit the “-skipcheckconfig” and I get the error you report.

ERROR - Field delimiter for the field ‘FIELD_INPUT_ROW_NUM’ in the record element ‘RECORD_ONE_FIELD_RECORD_’ not found!

If I add delimiter=“%” to the tag or fieldDelimiter=“%” to the tag, it runs with no warnings.

This delimiter is bogus: the first field (FIELD_INPUT_ROW_NUM) is an auto-generated field, so there is only one field in the file. If I choose a character that might be in my file, it may try to parse the row as two fields. Is this correct behaviour?

Thanks,
Anna

Hello Anna,
If you add a delimiter to the tag, it is not taken into account when reading the file, so it can’t cause any problem. But if you use the metadata somewhere else for formatting data, the graph may fail.

Heya,

OK, I will try adding “delimiter” to the tag in case we decide to run graphs without “-skipcheckconfig”. I was concerned because I found that adding fieldDelimiter=“,” to the tag (and there are commas in the file) seemed to cause parsing errors.

Thanks,
Anna
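
The concern about a delimiter character that also occurs in the data can be checked up front. The sketch below is a generic Python illustration, not CloverETL behaviour; `rows_containing` is a hypothetical helper that lists the rows a candidate delimiter would split.

```python
from typing import Iterable, List


def rows_containing(lines: Iterable[str], delimiter: str) -> List[int]:
    """Return 1-based numbers of rows in which the delimiter occurs."""
    return [i for i, line in enumerate(lines, start=1) if delimiter in line]


sample = ["abc", "Smith, John", "x%y"]

# A comma occurs in row 2, so a comma-delimited parse would see two fields.
print(rows_containing(sample, ","))
print("Smith, John".split(","))

# A character that never occurs in the data is safe as a placeholder.
print(rows_containing(sample, "|"))
```

An empty result means the candidate character never appears in the sample, so using it as a placeholder delimiter would not split any row.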