Read each line of text into its own record

This seems a simple question. Sorry if this has been answered elsewhere.

I want to load lines of text from a flat file. The data is delimited but not clean, so I want to load each line as a single record and then use some Reformat components to clean it up and parse it into fields.

Can I do this with the UniversalDataReader component? It looks like I need to set up my metadata with the right combination of field and record delimiters, but I can’t work out what that combination is.

I keep getting one of these errors:
Component [UniversalDataReader:UNIVERSAL_DATA_READER] finished with status ERROR. (Out0: 0 recs)
Parsing error: Field delimiter was not found in record 1, field 1 (“TextLine”), metadata “rawText”; value: ‘SimpleDataParser does not provide raw record.’

or

Component [UniversalDataReader:UNIVERSAL_DATA_READER] finished with status ERROR. (Out0: 0 recs)
Parsing error: Unexpected default field delimiter, probably record has too many fields. in record 1, field 1 (“TextLine”), metadata “rawText”; value: ‘<Raw record data is not available, please turn on verbose mode.>’

Hi Paul,

Thank you for reaching out to us. Yes, you can use the Universal Data Reader to accomplish this task. I’ve attached a graph that roughly replicates your use case. In it, I read a simple text file using the FlatFileReader with user-defined metadata that contains a single field (field1). Because I want each whole line to land in field1, I changed the default delimiter property from “|” to a blank value.

I also set the EOF (end of file) as delimiter property to “true”. The reader expects every record, including the last one, to end with a record delimiter, so a file whose final line has no trailing line break would otherwise fail to parse. Our text file is not guaranteed to end that way, so EOF has to be accepted as an alternative record delimiter for the file to parse correctly.

Please change your metadata configuration to match the one in the attached graph and let me know if that fixes your issue.

Best Regards,
George Darvehn

I have been banging away at this for a day. This has got to be something stupid that I am missing.

My goal is to read a file whose lines end with a bare newline (\n) rather than CR/LF (\r\n). This should be simple: every line in the input file should come in as a single record. I will parse these single records into fields later.

I can’t get past the first step. Please see the included nonworking example. Any help would be greatly appreciated.

<?xml version="1.0" encoding="UTF-8"?>
<Graph author="phb05" created="Thu Dec 12 09:39:35 EST 2019" guiVersion="5.2.0.30" id="1576163409713" licenseCode="CLP1DHEALT19932837BY" name="readnewlinedelimited" showComponentDetails="true">
<Global>
<Metadata id="Metadata1">
<Record fieldDelimiter="\r\n" name="inputFormat" previewAttachmentCharset="UTF-8" recordDelimiter="\r\n" type="delimited">
<Field delimiter=" " name="outputStream" type="string"/>
</Record>
</Metadata>
<Metadata id="Metadata0">
<Record fieldDelimiter="\n" name="outputFormat" previewAttachmentCharset="UTF-8" recordDelimiter="\r\n" type="delimited">
<Field delimiter=" " name="outputStream" type="string"/>
</Record>
</Metadata>
<GraphParameters>
<GraphParameterFile fileURL="workspace.prm"/>
</GraphParameters>
<RichTextNote backgroundColor="FAF6D6" folded="false" fontSize="medium" height="275" id="Note0" textColor="444444" width="202" x="433" y="108">
<attr name="text"><![CDATA[h3. Generate a single line of text with multiple new line characters
]]></attr>
</RichTextNote>
<RichTextNote backgroundColor="FAF6D6" folded="false" fontSize="medium" height="275" id="Note1" textColor="444444" width="275" x="712" y="98">
<attr name="text"><![CDATA[h3. Try to read this in as multiple records]]></attr>
</RichTextNote>
<Dictionary/>
</Global>
<Phase number="0">
<Node fileURL="port:$0.outputStream:discrete" guiName="FlatFileReader" guiX="752" guiY="211" id="FLAT_FILE_READER" type="FLAT_FILE_READER"/>
<Node guiName="Single record" guiX="456" guiY="222" id="SINGLE_RECORD" type="GET_JOB_INPUT">
<attr name="mapping"><![CDATA[//#CTL2

// Transforms input record into output record.
function integer transform() {
	 $out.0.outputStream = "Line One"+"\n"+ "Line Two"+ "\n"+ "Line Four"+ "\n";

	return ALL;
}

// Called during component initialization.
// function boolean init() {}

// Called during each graph run before the transform is executed. May be used to allocate and initialize resources
// required by the transform. All resources allocated within this method should be released
// by the postExecute() method.
// function void preExecute() {}

// Called only if transform() throws an exception.
// function integer transformOnError(string errorMessage, string stackTrace) {}

// Called during each graph run after the entire transform was executed. Should be used to free any resources
// allocated within the preExecute() method.
// function void postExecute() {}

// Called to return a user-defined error message when an error occurs.
// function string getMessage() {}
]]></attr>
</Node>
<Node guiName="Trash" guiX="1084" guiY="171" id="TRASH" type="TRASH"/>
<Edge fromNode="FLAT_FILE_READER:0" guiBendpoints="" guiRouter="Manhattan" id="Edge3" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (output)" toNode="TRASH:0"/>
<Edge fromNode="SINGLE_RECORD:0" guiBendpoints="" guiRouter="Manhattan" id="Edge0" inPort="Port 0 (input)" metadata="Metadata1" outPort="Port 0 (out)" toNode="FLAT_FILE_READER:0"/>
</Phase>
</Graph>

Hi Paul,

Yes, the devil is in the details. The issue was that you defined “\n” as the field delimiter, while you really need it as the record delimiter. That’s why the parser complained about too many fields. You need to set the record delimiter to “\n” and the field delimiter to nothing (an empty string), because you don’t want to split the line into fields at this stage. Please find attached your modified graph so you can see what I changed in the metadata, and let me know if it works for you.
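
For readers without the attachment, the metadata change described above might look roughly like the sketch below. It is reconstructed from the description, not copied from the attached graph, so the exact attribute values are assumptions: in particular, whether the empty field delimiter is written as fieldDelimiter="" or the attribute is simply left out, and the removal of the per-field delimiter attribute. The attribute names themselves are the ones already used in the graph posted above.

```xml
<!-- Sketch only (assumed values): one string field per record, "\n" ends each record. -->
<Metadata id="Metadata0">
<Record fieldDelimiter="" name="outputFormat" previewAttachmentCharset="UTF-8" recordDelimiter="\n" type="delimited">
<!-- Assumption: no delimiter on the field itself, so it runs up to the record delimiter. -->
<Field name="outputStream" type="string"/>
</Record>
</Metadata>
```

With metadata along these lines, the test string "Line One\nLine Two\nLine Four\n" would be expected to come out as three records, one per line.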

Also, as a side note, please take a look at https://doc.cloverdx.com/latest/designer/defining-non-default-delimiter-for-field.html#defining-non-default-delimiter-for-field, especially the block marked Important, as this behavior has the potential to confuse some users.

Best regards.

Lukas,
This fixed the problem. Thank you for your help.