Dynamic number of partitions

Greetings,

I am looking to split a very large XML file, whose data I’ve already parsed into individual records of three different types.
I want to split each of those three sets of records into a dynamic number of chunks, so that each chunk holds only (for example) 1000 records. I won’t know how many records are in the original files, though, so the number of chunks would have to be calculated dynamically. The Partition component doesn’t appear to let me chunk by number of records, only by a range of values or a particular partition key, neither of which works for this scenario.

Additionally, if I don’t know how many output ports there will be, how do I bind the output to a writer?

Thanks in advance,
Anye

Hi Anye,

The use case you are describing can be solved with the “Records per file” attribute, which is available in some writers. If you would like to read more about this attribute, you can look into the documentation here.

I hope this helps.

Regards,

Hi Anye,

Vladka is right. You can use “Records per file”; just make sure you also include the $ placeholder somewhere in the File URL attribute. Otherwise, every chunk is written to the same file, and you’ll be overwriting it over and over again.
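
To illustrate the idea outside of Clover, here is a minimal Python sketch of what “Records per file” combined with a $ placeholder amounts to: the record stream is cut into fixed-size chunks without knowing the total count up front, and each chunk goes to its own numbered file. This is not the writer’s actual implementation; the function names (`chunked`, `write_partitioned`), the one-record-per-line output, and numbering that starts at 0 are all assumptions made for the example.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List


def chunked(records: Iterable[str], records_per_file: int) -> Iterator[List[str]]:
    """Yield successive lists holding at most records_per_file records."""
    if records_per_file < 1:
        raise ValueError("records_per_file must be at least 1")
    it = iter(records)
    while True:
        chunk = list(islice(it, records_per_file))
        if not chunk:
            return  # input exhausted; total count was never needed
        yield chunk


def write_partitioned(
    records: Iterable[str], file_url: str, records_per_file: int = 1000
) -> List[Path]:
    """Write each chunk to its own file (hypothetical helper).

    The "$" in the file name is replaced by the chunk number. Starting
    the numbering at 0 is an assumption made for this sketch.
    """
    template = Path(file_url)
    if "$" not in template.name:
        # Without a placeholder every chunk would overwrite the same file.
        raise ValueError("file name must contain a '$' placeholder")
    written = []
    for index, chunk in enumerate(chunked(records, records_per_file)):
        path = template.with_name(template.name.replace("$", str(index)))
        path.write_text("\n".join(chunk) + "\n", encoding="utf-8")
        written.append(path)
    return written
```

With 2500 input records and a limit of 1000, this would produce three files, the last one holding the remaining 500 records. The number of files falls out of the data, so no output ports need to be bound in advance.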

Bits and pieces from the online documentation on the subject:
http://doc.cloveretl.com/documentation/UserGuide/topic/com.cloveretl.gui.docs/docs/partitioning-output-into-different-output-files.html

Perfect! Thanks, gentlemen!