[ParallelReader] How to skip first source row of .csv (header with fields)?

cassydeb
Posts: 7
Joined: Tue Feb 02, 2016 8:06 am

Postby cassydeb » Mon Feb 15, 2016 11:13 pm

Hi,
With UniversalDataReader there is a "Number of skipped records" property, so we can easily skip the header row (the one with the field names) of a CSV file.
How can we do this with ParallelReader?
Thanks for your help!


(To the moderator:
The posting policy says: "your post may not appear immediately but may take a day (max 2) to be processed."

I already posted this on Friday but I still don't see it,
so this may be a duplicate.)

cholastal
Posts: 135
Joined: Tue Sep 01, 2015 1:22 pm

Re: [ParallelReader] How to skip first source row of .csv (header with fields)?

Postby cholastal » Mon Feb 22, 2016 1:04 pm

Hi

Indeed, the ParallelReader component currently has no property like 'Number of skipped records'. However, there is already an improvement proposal for it in our system.

This property is not available because of how the component works. It uses multiple threads, each reading only one part of the input file. As a result, the file is read much faster, but the order of the source records may not be preserved. Thus it is impossible to determine which row was first in the source file.
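
To illustrate why the order is lost, here is a minimal Python sketch of the general idea (this is not ParallelReader's actual implementation; the function name read_in_chunks and the workers parameter are made up for the example). The input is split into parts, each part is handled by its own worker thread, and the parts are collected in whatever order the workers finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_in_chunks(lines, workers=4):
    """Split `lines` into contiguous parts and collect them as workers finish.

    Hypothetical illustration only: the parts are gathered in completion
    order, so the first source row is not guaranteed to come out first.
    """
    size = max(1, -(-len(lines) // workers))  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    collected = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker "reads" one part; copying the list stands in for parsing.
        futures = [pool.submit(list, chunk) for chunk in chunks]
        # as_completed yields futures in completion order, not submission order.
        for future in as_completed(futures):
            collected.extend(future.result())
    return collected
```

Every record still arrives, but a step such as "skip the first N records" has nothing reliable to count from, because the header may end up anywhere in the output.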

Maybe you can filter the header out with the ExtFilter component, based on some distinguishable difference between the header and the regular records (for example, a header field whose value is its own field name).
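
The filtering idea can be sketched outside of CloverDX like this. It is plain Python, not an ExtFilter expression, and the helper name drop_header_records and the sample data are assumptions made for the example. A record is treated as the header when every value matches the corresponding field name, so its position in the stream does not matter.

```python
import csv
import io


def drop_header_records(records, field_names):
    """Return `records` without any row whose values equal the field names.

    Hypothetical helper: the comparison ignores case and surrounding spaces.
    """
    header = [name.strip().lower() for name in field_names]

    def looks_like_header(record):
        return [value.strip().lower() for value in record] == header

    return [record for record in records if not looks_like_header(record)]


# The header is deliberately not the first row, as after a parallel read.
sample = "2,Bob\nid,name\n1,Alice\n"
rows = list(csv.reader(io.StringIO(sample)))
print(drop_header_records(rows, ["id", "name"]))
# prints [['2', 'Bob'], ['1', 'Alice']]
```

Note that a genuine data row that happens to be identical to the header would be dropped as well, so this only works when the header is clearly distinguishable from the data.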

If the above isn't suitable for you, you'll need to use the UniversalDataReader until the ParallelReader is improved.

Best regards

---
Lukas Cholasta
CloverCARE Support
CloverDX

Visit us online at http://www.cloverdx.com