File with different Data Records

pep44 · May 22, 2008, 12:00am

Hello,

I have to extract data from a file with fixed-length records. However, the file contains two, three or more precisely defined record types, each with a different field structure and a different length. The type of any record can be recognized by reading a certain attribute at a certain position. The difficulty is that records in the file belong together, and their order (sequence) matters too. For example:

AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE
AAAABBBCCCCCCDDDDD
AAAABBBCCCCCCDDDDD
AAAAAABBBCCCCCCDDDDDD
AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE
AAAABBBCCCCCCDDDDD
AAAAAABBBCCCCCCDDDDDD
AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE
AAAABBBCCCCCCDDDDD

In this example the long records hold customer data and the short records hold product data. The result file must have one fixed-length record for each product record (this record will also carry some customer attributes). So I need program logic such as, after record 1: “consider all records as belonging together until you read the next ‘customer data record’ (or EOF)…”
(For the example above the result file must have 6 records…)

How can I extract this type of data? Is it possible, and easy, without programming?

Thank You,
best regards,
Peter

avackova · May 23, 2008, 11:30am

Hello,
it is not possible to recognize the records during reading. All you can do is read them all (e.g. as delimited records: one field, with record delimiter \n) and then filter them (e.g. by length).

pep44 · May 25, 2008, 11:19pm

Thank you for the answer. I can use that! Everything is very useful for me, because I am a beginner.

If I understand you correctly, after filtering I will have the “long” records (customer data) in one file and the “short” records (product data) in another. But the only information about which product records belong to which customer record is the sequence (order) of the records in the input file, nothing else. After separation I will lose this information… (Even if I just sort the records in the file with some sort key, I will lose that information too.)

I need this logic: after a customer record is recognized, all following “product” records (until a record that is not a “product” record) belong to that customer record.

If the solution to the problem is programming, can you advise me how to proceed: can I write a class as a subclass of FixLenDataReader or of DelimitedDataReader?

Thank you,
Best Regards,
Peter

avackova · May 26, 2008, 7:20am

So, you can do it with the Reformat component:

<?xml version="1.0" encoding="UTF-8"?>
<Graph author="avackova" created="Tue May 06 09:12:58 CEST 2008" guiVersion="1.10" id="1210058511322" licenseType="Evaluation license." modified="Mon May 26 09:12:45 CEST 2008" modifiedBy="avackova" name="Transactions" revision="1.15">
<Global>
<Metadata id="Metadata1">
<Record fieldDelimiter="|" name="ref" recordDelimiter="\n" type="delimited">
<Field name="customer" type="string"/>
<Field name="text" type="string"/>
</Record>
</Metadata>
<Metadata id="Metadata0">
<Record fieldDelimiter="|" name="text" recordDelimiter="\n" type="delimited">
<Field name="text" type="string"/>
</Record>
</Metadata>
<Property fileURL="/home/avackova/runtime-New_configuration/test/workspace.prm" id="GraphParameter0"/>
</Global>
<Phase number="0">
<Node enabled="enabled" fileURL="data.txt" guiHeight="0" guiName="Universal Data Reader" guiWidth="0" guiX="73" guiY="66" id="DATA_READER0" type="DATA_READER"/>
<Node enabled="enabled" fileURL="out.txt" guiHeight="0" guiName="Universal Data Writer" guiWidth="0" guiX="557" guiY="64" id="DATA_WRITER0" type="DATA_WRITER"/>
<Node enabled="enabled" guiHeight="0" guiName="Reformat" guiWidth="0" guiX="306" guiY="67" id="REFORMAT0" type="REFORMAT">
<attr name="transform"><![CDATA[string customer;
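function transform(){
  // A 36-character line is a customer record: remember its first two characters.
  if (length($0.text)==36){
     customer = substring($0.text,0,2);
  }
  // Tag every record (customer and product alike) with the last customer seen.
  $0.customer := customer;
  $0.text := $0.text;
}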
]]></attr>
</Node>
<Edge fromNode="DATA_READER0:0" guiBendpoints="" id="Edge0" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (output)" toNode="REFORMAT0:0"/>
<Edge fromNode="REFORMAT0:0" guiBendpoints="" id="Edge1" inPort="Port 0 (in)" metadata="Metadata1" outPort="Port 0 (out)" toNode="DATA_WRITER0:0"/>
</Phase>
</Graph>

The reader reads each record from the flat file as one string, then Reformat adds the customer info to each record (to product records as well as to customer records); after that you can process all records as you need.
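
For readers who want to check the carry-forward logic outside Clover, here is a minimal Python sketch of the same idea. It is not Clover code: the function name `tag_with_customer` and the constants are made up for illustration, and it assumes (as the transform above does) that a customer record is exactly 36 characters long and that its first two characters identify the customer. It also starts with an empty customer value, which is an assumption about what happens before the first customer record is seen.

```python
CUSTOMER_RECORD_LENGTH = 36  # assumption: length of a customer record, as in the sample data
CUSTOMER_KEY_LENGTH = 2      # assumption: the first two characters identify the customer


def tag_with_customer(lines):
    """Yield (customer, line) pairs, carrying the last seen customer forward."""
    customer = ""  # hypothetical initial value before any customer record is read
    for line in lines:
        line = line.rstrip("\n")
        if len(line) == CUSTOMER_RECORD_LENGTH:
            # A new customer record starts a new group.
            customer = line[:CUSTOMER_KEY_LENGTH]
        # Every record, customer or product, gets the current customer attached.
        yield customer, line


# The sample file from the first post.
sample = [
    "AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE",
    "AAAABBBCCCCCCDDDDD",
    "AAAABBBCCCCCCDDDDD",
    "AAAAAABBBCCCCCCDDDDDD",
    "AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE",
    "AAAABBBCCCCCCDDDDD",
    "AAAAAABBBCCCCCCDDDDDD",
    "AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE",
    "AAAABBBCCCCCCDDDDD",
]

tagged = list(tag_with_customer(sample))

# Keep only the product records; each still knows which customer it belongs to.
products = [(c, text) for c, text in tagged if len(text) != CUSTOMER_RECORD_LENGTH]
print(len(products))
```

With the sample file this leaves six product records, which matches the count expected in the first post. Because the customer value is attached to every record before any filtering or sorting, the grouping no longer depends on the order of the records.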

pep44 · May 26, 2008, 7:11pm

Thank You!
It works!

PS
I defined a new field “if_filter” (in ),
and my “if” is something like this:
if_filter = "";
if (substring($0.Input_origin_Line, 0, 2) != "B2") {
  if_filter = "to filter";
}
So, in the next step I can filter records with if_filter = "to filter".

avackova · May 27, 2008, 6:42am

You can do it the way you have written, or you can put it directly in the filter:

<Node  id="EXT_FILTER0" type="EXT_FILTER" filterExpression="substring($0.Input_origin_Line, 0,2)!='B2'"/>
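
As a rough illustration of what that filter expression tests, here is the same predicate as a small Python sketch. It is not Clover code: the function name `passes_filter` and the sample lines are hypothetical, and only the "first two characters are not B2" condition comes from the expression above.

```python
def passes_filter(line: str) -> bool:
    """True when the first two characters of the line are not 'B2'."""
    return line[:2] != "B2"


# Hypothetical input lines, made up for this example.
lines = ["B2 first", "A1 second", "B2 third", "C3 fourth"]

# Records for which the expression is true.
matching = [line for line in lines if passes_filter(line)]
print(matching)
```

Evaluating the condition directly in the filter avoids carrying an extra flag field such as if_filter through the graph.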

pep44 · June 25, 2008, 12:11pm

Hello,

What is the term for the ability to embed code within the XML graph description? Is that something Clover-specific?
Is the syntax Java or JavaScript, or only similar?
To embed: <![CDATA …?

I mean e.g.:
…

…

Thank you,

Kind regards,
P.

pep44 · June 25, 2008, 12:17pm

(again with “Disable HTML in this post”)

Hello,

What is the term for the ability to embed code within the XML graph description? Is that something Clover-specific?
Is the syntax Java or JavaScript, or only similar?
To embed: <![CDATA …?

I mean e.g.:
…

<![CDATA[string customer;
function transform(){
if (length($0.text)==36){
customer = substring($0.text,0,2);
}
$0.customer := customer;
$0.text := $0.text;
}
]]>

…

Thank you,

Kind regards,
P.

avackova · June 26, 2008, 7:22am

Please see my reply in the new topic (CDATA in graph).
