Parsing error: Unexpected end of file in record

Hi,

I found that if the number of columns in my data does not match the number of columns in the FMT file, I get this warning:

WARN [INPUT_1] - Parsing error: Unexpected end of file in record # 3 in field # 40

But Clover does not stop. It keeps parsing my data and gives me erroneous results. Is there a way to throw an exception in this case and stop Clover from running?

Thanks,
albert

If you set dataPolicy=strict, execution of the graph should stop. For other values it continues.
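
If the goal is to fail fast before the data ever reaches Clover, a small pre-check outside the graph is one option. The sketch below is plain Java and does not use any Clover API. The class name FieldCountCheck and the method requireFieldCount are hypothetical, and it assumes the field delimiter never appears inside a quoted value:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class FieldCountCheck {

    /**
     * Throws IllegalArgumentException for the first line whose number of
     * fields differs from the expected count. Hypothetical helper; assumes
     * the delimiter does not occur inside quoted values.
     */
    static void requireFieldCount(List<String> lines, int expected, String fieldDelimiter) {
        String regex = Pattern.quote(fieldDelimiter);
        for (int i = 0; i < lines.size(); i++) {
            // limit -1 keeps trailing empty fields, so "a,b," counts as 3 fields
            int actual = lines.get(i).split(regex, -1).length;
            if (actual != expected) {
                throw new IllegalArgumentException("Record #" + (i + 1) + " has "
                        + actual + " fields, expected " + expected);
            }
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1,albert,usa",
                "2,charles,usa,1234567890");
        // Throws, because record #1 has 3 fields instead of 4
        requireFieldCount(lines, 4, ",");
    }
}
```

This only counts fields per line; it does not replace dataPolicy=strict inside the graph.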

Hi Agata,

Sorry, my first message was not very clear. This is the situation:

Data1.FMT:

<?xml version="1.0" encoding="UTF-8"?>

Data2.FMT:

<?xml version="1.0" encoding="UTF-8"?>

Data1.txt:

1, “albert”, “usa”
2, “charles”, “usa”, “1234567890”

Data2.txt:

1, “ABC”, “M”
2, “XYZ”,“M”

When I try to join the 2 inputs using joinKey=ID, and set dataPolicy=Controlled, I get this join result:

1, “albert”, “usa”
2, “charles”, “usa”, “1234567890”,“ABC”, “M”

Because the Phone field is missing from the first row of Data1.txt, the “carriage-return line-feed” (\r\n) at the end of the first row and the whole second row (2, “charles”, “usa”, “1234567890”) are read together as the missing Phone field of the first row. This first row (1, “albert”, “usa”\r\n2, “charles”, “usa”, “1234567890”) is then joined with the first row (1, “ABC”, “M”) of Data2.txt, which gives the join result above.

Is this a bug? Maybe in this situation the correct behavior would be to ignore the first row of Data1.txt, since it is missing a field, and join only the second row. The correct join result would then be:

2, “charles”, “usa”, “1234567890”, “XYZ”, “M”

Any help/suggestion is greatly appreciated.

Thanks,
albert

Hi,
it only looks the way you wrote above :wink: . In fact there is only one record in the output, but it is “broken” in the middle. If you put a Reformat after the joiner with a transform class like this:

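import org.jetel.component.DataRecordTransform;
import org.jetel.data.DataRecord;
import org.jetel.exception.TransformException;
import org.jetel.util.string.StringUtils;

// Copies every field to the output, rendering special characters
// (such as \r and \n) as visible escape sequences.
public class Transform extends DataRecordTransform {

	@Override
	public boolean transform(DataRecord[] inputRecords, DataRecord[] outputRecords)
			throws TransformException {
		DataRecord in = inputRecords[0];
		DataRecord out = outputRecords[0];
		for (int i = 0; i < out.getNumFields(); i++) {
			// specCharToString is used so control characters inside a field become readable
			out.getField(i).setValue(StringUtils.specCharToString(in.getField(i).toString()));
		}
		return true;
	}

}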

you will see that this is only one record:
1 |“albert” |“usa”\r\n2 |“charles”, “usa”, “1234567890” |“ABC” |“M”
joined on key “1”. No records are joined on key “2”, because there is no such record in the master input.

Hi Agata,

My Data1.txt has 2 records:

1, “albert”, “usa”
2, “charles”, “usa”, “1234567890”

and my Data2.txt has 2 records:

1, “ABC”, “M”
2, “XYZ”,“M”

so shouldn’t the output have 2 records also??

The first record (ID=1) in Data1.txt is missing the Phone field at the end, so Clover read the whole second record (ID=2) as the missing Phone field of the first record. That is why you see a single record as output:
1 |“albert” |“usa”\r\n2 |“charles”, “usa”, “1234567890” |“ABC” |“M”

This is not correct. Either we should not join ID=1 because its Phone field is missing, or we should join both records, ID=1 and ID=2.

Thanks,
albert

Hi,
you’re right, but to get the result you expect you should use the following metadata:

<?xml version="1.0" encoding="UTF-8"?>
<Record fieldDelimiter="," name="test" recordDelimiter="\r\n" type="delimited">
<Field name="ID" nullable="true" type="string"/>
<Field name="Name" nullable="true" type="string"/>
<Field name="Country" nullable="true" type="string"/>
<Field name="Phone" nullable="true" type="string"/>
</Record>

Because you did not set recordDelimiter, Clover looked only for the delimiter of each field in turn, so the first “\r\n” became part of a field and the next “\r\n” it found was treated as the end of the first record.
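
The difference between the two behaviours can be illustrated with a small standalone simulation. This is only a sketch in plain Java, not Clover's actual parser. The class and method names are hypothetical, quotes and spaces are left out of the sample data for brevity, and treating missing trailing fields as null is an assumption:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class DelimitedParseSketch {

    /** Reads field after field; only the last field is expected to end with lastDelim. */
    static List<String[]> parseFieldByField(String data, int fieldCount,
                                            String fieldDelim, String lastDelim) {
        List<String[]> records = new ArrayList<>();
        int pos = 0;
        while (pos < data.length()) {
            String[] record = new String[fieldCount];
            for (int i = 0; i < fieldCount; i++) {
                String delim = (i == fieldCount - 1) ? lastDelim : fieldDelim;
                int end = data.indexOf(delim, pos);
                if (end < 0) {
                    // Data ran out in the middle of a record
                    record[i] = data.substring(pos);
                    pos = data.length();
                    break;
                }
                record[i] = data.substring(pos, end);
                pos = end + delim.length();
            }
            records.add(record);
        }
        return records;
    }

    /** Cuts the data into records first, then splits each record into fields. */
    static List<String[]> parseRecordFirst(String data, int fieldCount,
                                           String fieldDelim, String recordDelim) {
        List<String[]> records = new ArrayList<>();
        for (String line : data.split(Pattern.quote(recordDelim))) {
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split(Pattern.quote(fieldDelim), -1);
            String[] record = new String[fieldCount]; // missing fields stay null (assumption)
            for (int i = 0; i < fieldCount && i < parts.length; i++) {
                record[i] = parts[i];
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        String data = "1,albert,usa\r\n2,charles,usa,1234567890\r\n";
        System.out.println(parseFieldByField(data, 4, ",", "\r\n").size()); // prints 1
        System.out.println(parseRecordFirst(data, 4, ",", "\r\n").size());  // prints 2
    }
}
```

In the field-by-field mode the third field comes out as usa\r\n2 and the fourth as charles,usa,1234567890, which is the single “broken” record seen in the join output. In the record-first mode both IDs survive as separate records.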

Hi Agata,

Thanks for the suggestion… It’s behaving correctly now :slight_smile:

albert

Hi Agata,

On a side note, the Data Record XML DTD (http://wiki.clovergui.net/doku.php?id=g … s:metadata) has not been updated to match your suggested metadata.

Your suggested metadata has a fieldDelimiter attribute on the Record element, but the DTD on the web page does not have it.

albert

Hi, the documentation has been updated :slight_smile: