New to Clover. Using CloverGUI 2.1.1.
I have a graph where I read all files in a given input directory (UniversalDataReader, fileURL=${DATAIN_DIR}/*). One of my writers is a UniversalDataWriter, and I want its output filename to be the same as the input filename, only with a different extension.
Any ideas are much appreciated.
Thanks,
Additionally, I seem to be having difficulties with the wildcard recognition for the Reader. According to the manual:
http://www.cloveretl.org/_upload/clover-gui/docs/html/manual_html_chunk/ch17.html#d0e8151
The syntax I have for fileURL should grab all files in the directory. However, this doesn’t seem to be the case. It will only grab the first file.
My graph reads all the files from the designated directory. Could you show your graph?
To your first question:
When reading, you can add a field with auto_filling="source_name" to your metadata; it is filled with the name of the source file. When writing, you can use the attributes partition, partitionFileTag, partitionKey and partitionOutFields.
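A rough sketch of how that suggestion might look in the graph XML. Only auto_filling="source_name" and the attribute names partitionKey and partitionFileTag come from this thread. The field name sourceFile, the node id DATA_WRITER0, the ${DATAOUT_DIR} parameter, the value keyNameFileTag and the # placeholder in fileURL are assumptions that should be checked against the manual for your version.

```xml
<!-- Metadata: sourceFile is a hypothetical extra field, auto-filled with the input file name -->
<Record fieldDelimiter=":" name="recordName1" recordDelimiter="\n" type="delimited">
    <Field name="field1" type="string"/>
    <Field name="field2" type="string"/>
    <Field auto_filling="source_name" name="sourceFile" type="string"/>
</Record>

<!-- Writer: one output file per distinct sourceFile value.
     Assumption: "#" in fileURL is replaced by the partition tag,
     and keyNameFileTag makes that tag the key value itself. -->
<Node enabled="enabled" fileURL="${DATAOUT_DIR}/#.out" id="DATA_WRITER0"
      partitionFileTag="keyNameFileTag" partitionKey="sourceFile" type="DATA_WRITER"/>
```

The auto-filled value may well include the directory and the original extension. If so, a Reformat step would be needed to trim it before it is used as the file tag.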
Here’s a simple one. UniversalDataReader with fileURL=${DATAIN_DIR}/*.txt, sending output to Trash.
<?xml version="1.0" encoding="UTF-8"?>
<Graph author="scndmouse" created="Wed Feb 18 19:55:05 GMT-05:00 2009" guiVersion="2.1" id="1235006073559" licenseType="Evaluation license." modified="Thu Feb 19 11:43:40 GMT-05:00 2009" modifiedBy="scndmouse" name="test2" revision="1.77">
    <Global>
        <Metadata id="Metadata0" previewAttachmentCharset="ISO-8859-1">
            <Record fieldDelimiter=":" name="recordName1" previewAttachmentCharset="ISO-8859-1" recordDelimiter="\n" type="delimited">
                <Field name="field1" type="string"/>
                <Field name="field2" type="string"/>
            </Record>
        </Metadata>
        <Property fileURL="workspace.prm" id="GraphParameter0"/>
    </Global>
    <Phase number="0">
        <Node enabled="enabled" fileURL="${DATAIN_DIR}/*.txt" guiHeight="0" guiName="UniversalDataReader" guiWidth="0" guiX="128" guiY="160" id="DATA_READER0" numRecords="1" skipRows="3" type="DATA_READER"/>
        <Edge debugMode="true" fromNode="DATA_READER0:0" guiBendpoints="" id="Edge1" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (output)" toNode="TRASH0:0"/>
    </Phase>
    <Phase number="1">
        <Node enabled="enabled" guiHeight="0" guiName="Trash" guiWidth="0" guiX="352" guiY="156" id="TRASH0" type="TRASH"/>
    </Phase>
</Graph>
Allow me to play dumb and ask: where do I set the auto_filling attribute using CloverGUI?
Your graph reads only one record because you have set Max number of records (numRecords="1" on the reader). If you want to read all records, leave this parameter unset.
You can set auto filling in the metadata editor (see Metadata Editor).
OK. I changed the parameter and verified that it does read all files. But that’s not exactly what I was hoping for. I was expecting the “Skip rows” and “Max number of rows” parameters to be applied per file. Instead, it seems the Reader just concatenates both files into a single stream and skips the defined number of rows only in the first file.
I guess I’m looking for a way to iterate over a set of files and apply the same transform per file.
Your request has been reported (http://bug.cloveretl.org/view.php?id=1502) and will be implemented in the next version. In the meantime, you have to use a Reformat node whose transformation skips records from each input file.
You may also use the Dedup component and set the dedup key to the field where the file name is stored. Then you can specify whether you want the first one, two, three, ... records from each file.
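A rough sketch of the Dedup idea, assuming the metadata carries a hypothetical field sourceFile that is auto-filled with source_name. The attribute names dedupKey, keep and noDupRecord are how I recall the Dedup component being configured. They are not confirmed in this thread, so verify them for your version.

```xml
<!-- Keeps the first 3 records of every group that shares the same sourceFile value.
     Records from one file arrive consecutively, so each file forms one group. -->
<Node dedupKey="sourceFile" enabled="enabled" id="DEDUP0" keep="first" noDupRecord="3" type="DEDUP"/>
```

This limits the number of records kept per file. It does not skip leading rows, so per-file skipping still needs the Reformat approach.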