UTF-8 Reading Error

I am reading a CSV file and getting the error below. The error goes away if I switch to ISO-8859-1, but I set the charset to UTF-8 because the input file contains Unicode characters and I need to preserve them. I thought UTF-8 could read anything. What did I do wrong?

Thanks,
Perri


  Component [UniversalDataReader:UNIVERSAL_DATA_READER4] finished with status ERROR. (Out0: 2613 recs)
   Error when parsing record #2614 field effdate value "23-N"
    Character decoding error occurred. Set correct charset. Current charset is UTF-8
     Input length = 1
------------------------------------------------------------------------------------------------------
ERROR [main] - Execution of graph failed !

Hi Perri,

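Unfortunately, UTF-8 cannot read everything. Unlike ISO-8859-1, where every byte value maps to a character, UTF-8 allows only certain byte sequences, so the parser can run into a sequence of bytes that is invalid. That is also why the error disappears with ISO-8859-1: the file is read without complaint, but any genuine multi-byte UTF-8 characters in it will come out garbled. See http://en.wikipedia.org/wiki/UTF-8#Inva … _sequences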
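To illustrate the difference outside CloverETL, here is a minimal Python sketch. The sample bytes are made up for the example (they are not taken from your file): a lone 0xE9 byte is "é" in ISO-8859-1, but on its own it is not a valid UTF-8 sequence.

```python
# Hypothetical sample record: "caf" + a lone 0xE9 byte + ",1".
# 0xE9 is "é" in ISO-8859-1, but in UTF-8 it announces a 3-byte sequence
# and must be followed by two continuation bytes, which are missing here.
raw = b"caf\xe9,1"

# ISO-8859-1 never fails: every byte value maps to exactly one character.
latin1_text = raw.decode("iso-8859-1")

# Strict UTF-8 decoding rejects the same bytes.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc  # exc.start is the offset of the offending byte

# The same text, correctly encoded as UTF-8, uses two bytes for "é".
valid_utf8 = "café,1".encode("utf-8")   # b"caf\xc3\xa9,1"
roundtrip = valid_utf8.decode("utf-8")
```

If something similar is happening in your file, record #2614 probably contains a byte written in a single-byte charset in the middle of otherwise valid UTF-8 data.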

In CloverETL it should be possible to use a lenient/controlled policy on readers and skip invalid lines, but this will only be available starting with 4.1M1. See https://bug.javlin.eu/browse/CLO-5043 for details.

So the current workaround is to clean up the data with an external tool before processing it in CloverETL.
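As one possible way to do that cleanup, here is a rough Python sketch. The function names are hypothetical, and the fallback charset is an assumption: it only gives correct results if the stray bytes really were written as ISO-8859-1. The first function reports which lines fail strict UTF-8 decoding; the second writes a copy in which those lines are re-decoded with the fallback charset and everything is saved as UTF-8.

```python
def find_invalid_utf8_lines(path):
    """Return (line_number, byte_offset_in_line, raw_line) for each line
    that cannot be decoded as strict UTF-8."""
    bad = []
    # Read in binary mode so no decoding happens before we check it.
    with open(path, "rb") as f:
        for line_number, raw_line in enumerate(f, start=1):
            try:
                raw_line.decode("utf-8")
            except UnicodeDecodeError as exc:
                bad.append((line_number, exc.start, raw_line))
    return bad


def write_clean_copy(src, dst, fallback="iso-8859-1"):
    """Copy src to dst as UTF-8. Lines that are already valid UTF-8 are
    kept unchanged; invalid lines are decoded with the fallback charset
    (an assumption about how they were originally written)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for raw_line in fin:
            try:
                text = raw_line.decode("utf-8")
            except UnicodeDecodeError:
                text = raw_line.decode(fallback)
            fout.write(text.encode("utf-8"))
```

It is worth running find_invalid_utf8_lines first and looking at the reported lines by hand before trusting the converted copy. Note that the line number may be off by one from the record number in the CloverETL log if the file has a header row.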

I hope this helps.