This is a follow-up to my earlier post: http://forum.cloveretl.com/viewtopic.php?f=4&t=4569
This issue continues to plague me. DataParser seems to work OK when the data is comma-delimited, but when I specify other field and record delimiters, I am frequently unable to parse perfectly valid delimited data files.
I noticed that if I revert to 2.8.1, I can parse the file fine; it fails with 3.1.0 and 3.1.2. The situation is the same as I described in my earlier post: the parser seems to end up eating the last field on the row along with the delimiter, resulting in an exception. The exception usually comes from DataParser L437, at the call to parsingErrorNotFound.
Replacing that call with the following logic seems to resolve the issue for me. The end-of-record delimiter handling was not taking into account data still sitting in the delimiter searcher/buffer:
if (fieldBuffer.length() - delimiterSearcher.getMatchLength() > 0) {
    // field has data: strip the matched delimiter from the end of the buffer
    fieldBuffer.delete(fieldBuffer.length() - delimiterSearcher.getMatchLength(), fieldBuffer.length());
    break;
} else if (fieldBuffer.length() - delimiterSearcher.getMatchLength() == 0 && fieldCounter == numFields-1) {
    // last field, but empty
    break;
} else {
    return parsingErrorFound("Unexpected record delimiter, probably record has too few fields.", record, fieldCounter);
}
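In case it helps anyone reproduce the behaviour outside of Clover, here is a minimal standalone sketch of the same three-way decision. It is not the actual DataParser code: the class name, the Outcome enum and the onRecordDelimiter method are made up for illustration, and matchLength stands in for whatever delimiterSearcher.getMatchLength() returns. It assumes the matched record delimiter sits at the end of the field buffer when the check runs.

```java
// Hypothetical standalone sketch; none of these names exist in CloverETL.
class RecordDelimiterSketch {

    enum Outcome { FIELD_COMPLETE, EMPTY_LAST_FIELD, TOO_FEW_FIELDS }

    // Decide what to do when a record delimiter of length matchLength
    // has just been matched at the end of fieldBuffer.
    static Outcome onRecordDelimiter(StringBuilder fieldBuffer, int matchLength,
                                     int fieldCounter, int numFields) {
        int dataLength = fieldBuffer.length() - matchLength;
        if (dataLength > 0) {
            // Field has data: drop the trailing delimiter characters.
            fieldBuffer.delete(dataLength, fieldBuffer.length());
            return Outcome.FIELD_COMPLETE;
        } else if (dataLength == 0 && fieldCounter == numFields - 1) {
            // Last field of the record, but empty. As in the patch above,
            // the buffer is left untouched in this branch.
            return Outcome.EMPTY_LAST_FIELD;
        }
        // Delimiter arrived before the last field was reached.
        return Outcome.TOO_FEW_FIELDS;
    }

    public static void main(String[] args) {
        StringBuilder buf = new StringBuilder("abc||");
        Outcome outcome = onRecordDelimiter(buf, 2, 2, 3);
        System.out.println(outcome + " -> '" + buf + "'");
    }
}
```

Here "||" is just an example two-character record delimiter; the third field of a three-field record is "abc", so the first branch applies and the delimiter is stripped from the buffer.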
I’d like to avoid maintaining a forked version of Clover with this fix included. Is there a workaround I can use?
Thanks