Split/Partition records into multiple output files

Hi,

I want to split a CSV file into different output files based on a specific column.
The number of output files is determined at runtime, because at design time I don't know how many distinct keys I have.

How can I achieve this?
Any ideas?

Thx,
Frank

Well, Clover supports splitting output into multiple files by number of records or by size, but splitting by key value requires a different approach.

I see two options:

a) This might be done in multiple iterations: read the data, keep only the records needed, and save them to a file. Such a template graph can be modified by using parameters and simply executed as many times as needed.

b) Again a template graph, this time with a partition component and several outputs/writers. This would require the graph to be dynamically generated/modified, which is achievable.

This is not an easy task, since it basically requires a different transformation to be performed based on some external parameter. But it is doable, and option a) is relatively easy to implement.
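
To make the idea behind option a) concrete, here is a rough sketch in plain Java. It does not use any Clover API; the class and method names (MultiPassSplit, distinctKeys, runOnce, split) are made up for illustration. Each call to runOnce stands in for one parameterized run of the template graph. It assumes a simple comma-separated file without a header or quoted fields, and that the key values are safe to use as file names.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of option a); not part of Clover.
public class MultiPassSplit {

    // Preliminary pass: collect the distinct values found in the key column.
    static Set<String> distinctKeys(List<String> lines, int keyColumn) {
        Set<String> keys = new LinkedHashSet<>();
        for (String line : lines) {
            // Naive split: assumes no quoted fields containing commas.
            keys.add(line.split(",", -1)[keyColumn]);
        }
        return keys;
    }

    // One "run of the template graph": keep only records matching the key
    // passed in as a parameter and write them to the given file.
    static void runOnce(List<String> lines, int keyColumn, String key, Path out)
            throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                if (line.split(",", -1)[keyColumn].equals(key)) {
                    w.write(line);
                    w.newLine();
                }
            }
        }
    }

    // Driver: execute the filtered run once per distinct key.
    static void split(Path input, int keyColumn, Path outDir) throws IOException {
        List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
        Files.createDirectories(outDir);
        for (String key : distinctKeys(lines, keyColumn)) {
            // Assumption: the key is usable as a file name as-is.
            runOnce(lines, keyColumn, key, outDir.resolve(key + ".csv"));
        }
    }
}
```

The cost is one full scan of the data per distinct key, which is fine for a handful of keys but gets slow when there are many.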

One other thought: this could be solved by creating a specialized component, a variant of UniversalWriter (actually an implementation of the Formatter interface), which could do this directly.
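
The core of such a writer might look roughly like the sketch below. This is standalone Java and deliberately does not implement Clover's Formatter interface, whose exact methods are not shown here; KeyedFileWriter and its methods are hypothetical names. The point is only the technique: keep a map from key value to an open writer and create a new output file the first time a key is seen, so the data is read just once. Same assumptions as before: plain comma-separated lines, no quoted fields, keys usable as file names.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-pass writer; not a Clover component.
public class KeyedFileWriter implements AutoCloseable {
    private final Path outDir;
    private final int keyColumn;
    // One open writer per distinct key seen so far.
    private final Map<String, BufferedWriter> writers = new HashMap<>();

    public KeyedFileWriter(Path outDir, int keyColumn) throws IOException {
        this.outDir = Files.createDirectories(outDir);
        this.keyColumn = keyColumn;
    }

    // Route one record to the file that belongs to its key,
    // opening that file on first use.
    public void write(String line) throws IOException {
        String key = line.split(",", -1)[keyColumn]; // naive CSV split
        BufferedWriter w = writers.get(key);
        if (w == null) {
            w = Files.newBufferedWriter(outDir.resolve(key + ".csv"),
                    StandardCharsets.UTF_8);
            writers.put(key, w);
        }
        w.write(line);
        w.newLine();
    }

    // Flush and close every file that was opened.
    @Override
    public void close() throws IOException {
        for (BufferedWriter w : writers.values()) {
            w.close();
        }
    }
}
```

Used inside a try-with-resources block, every record is passed to write() and all files are closed at the end. Note that this keeps one file handle open per distinct key, so with a very large number of keys it could run into the operating system's open-file limit.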

If you are interested in this solution, contact me at david.pavlis centrum.cz for further discussion; I could give you some hints.