File with different Data Records

Hello,

I have to extract data from a file with fixed-length records. However, the file contains two, three or more precisely defined record types, each with a different field structure and a different length. For every record, the type can be recognized by reading a certain attribute at a certain position. The difficulty is that records in the file belong together, and their order (sequence) matters too. For example:

  1. AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE
  2. AAAABBBCCCCCCDDDDD
  3. AAAABBBCCCCCCDDDDD
  4. AAAAAABBBCCCCCCDDDDDD
  5. AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE
  6. AAAABBBCCCCCCDDDDD
  7. AAAAAABBBCCCCCCDDDDDD
  8. AABBBBBBBBBBBBBBBCCCCCDDDDEEEEEEEEEE
  9. AAAABBBCCCCCCDDDDD

In this example the long records are customer data and the short records are product data. The result file must contain one fixed-length record for each product record (this record will also carry some customer attributes). So I need program logic such as, after record 1: "treat all following records as belonging to this customer until you read the next customer data record (or EOF)…"
(For the example above, the result file must have 6 records…)

How can I extract this type of data? Is it possible, and easy, without programming?

Thank You,
best regards,
Peter

Hello,
it is not possible to recognize the record types during reading. All you can do is read them all (e.g. as delimited records: one field, with record delimiter \n) and then filter them (e.g. by length).
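
The approach above can be sketched outside Clover in Python, as an illustration only: every line is read as a single text field and the lines are then grouped by their length. The helper name `split_by_length` and the sample lines are hypothetical and not part of CloverETL.

```python
from collections import defaultdict


def split_by_length(text):
    """Read every line as one field, then group the lines by their length."""
    groups = defaultdict(list)
    for line in text.split("\n"):
        if line:  # skip empty lines, e.g. the one after a trailing newline
            groups[len(line)].append(line)
    return dict(groups)


# Hypothetical sample: one long record followed by two short ones.
sample = "C1" + "x" * 8 + "\n" + "P1yyy" + "\n" + "P2yyy" + "\n"
groups = split_by_length(sample)
```

Grouping this way keeps the order inside each group, but not the order between groups.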

Thank you for the answer. I can use that! Everything is very useful for me, because I am a beginner.

If I understand you correctly, after filtering I will have the "long" records (customer data) in one file and the "short" records (product data) in another. But the only information about which product records belong to which customer record is the sequence (order) of the records in the input file, nothing else. After the separation I will lose this information… (Even if I just sort the records in the file with some sort key, I will lose it too.)

I need this logic: once a customer record is recognized, all following "product" records (until a record that is not a "product" record) belong to that customer record.

If the solution requires programming, can you advise me how to proceed: can I write a class as a subclass of FixLenDataReader or of DelimitedDataReader?

Thank you,
Best Regards,
Peter

So, you can do it with the Reformat component:

<?xml version="1.0" encoding="UTF-8"?>
<Graph author="avackova" created="Tue May 06 09:12:58 CEST 2008" guiVersion="1.10" id="1210058511322" licenseType="Evaluation license." modified="Mon May 26 09:12:45 CEST 2008" modifiedBy="avackova" name="Transactions" revision="1.15">
<Global>
<Metadata id="Metadata1">
<Record fieldDelimiter="|" name="ref" recordDelimiter="\n" type="delimited">
<Field name="customer" type="string"/>
<Field name="text" type="string"/>
</Record>
</Metadata>
<Metadata id="Metadata0">
<Record fieldDelimiter="|" name="text" recordDelimiter="\n" type="delimited">
<Field name="text" type="string"/>
</Record>
</Metadata>
<Property fileURL="/home/avackova/runtime-New_configuration/test/workspace.prm" id="GraphParameter0"/>
</Global>
<Phase number="0">
<Node enabled="enabled" fileURL="data.txt" guiHeight="0" guiName="Universal Data Reader" guiWidth="0" guiX="73" guiY="66" id="DATA_READER0" type="DATA_READER"/>
<Node enabled="enabled" fileURL="out.txt" guiHeight="0" guiName="Universal Data Writer" guiWidth="0" guiX="557" guiY="64" id="DATA_WRITER0" type="DATA_WRITER"/>
<Node enabled="enabled" guiHeight="0" guiName="Reformat" guiWidth="0" guiX="306" guiY="67" id="REFORMAT0" type="REFORMAT">
<attr name="transform"><![CDATA[string customer;
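function transform(){
  // A 36-character line is a customer record: remember its first two characters.
  if (length($0.text)==36){
     customer = substring($0.text,0,2);
  }
  // Attach the last remembered customer value to every record and pass the text through.
  $0.customer := customer;
  $0.text := $0.text;
}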
]]></attr>
</Node>
<Edge fromNode="DATA_READER0:0" guiBendpoints="" id="Edge0" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (output)" toNode="REFORMAT0:0"/>
<Edge fromNode="REFORMAT0:0" guiBendpoints="" id="Edge1" inPort="Port 0 (in)" metadata="Metadata1" outPort="Port 0 (out)" toNode="DATA_WRITER0:0"/>
</Phase>
</Graph>

The reader reads each record from the flat file as a single string; Reformat then adds the customer information to every record (to product records as well as customer records), and after that you can process the records as you need.
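
The same carry-forward idea can also be sketched outside Clover. The following Python is only an illustration, not Clover code. It assumes, as in the graph, that a customer record is recognized by its length and that the customer key is its first two characters; the function name `attach_customer` and the sample records are hypothetical. Unlike the Reformat, which passes customer records through as well, this sketch emits one row per product record only.

```python
def attach_customer(lines, customer_len, key_len=2):
    """Pair each product record with the key of the most recent customer record.

    A line of exactly `customer_len` characters is treated as a customer
    record (an assumption); every other line is treated as a product record.
    """
    current = None  # key of the last customer record seen so far
    result = []
    for line in lines:
        if len(line) == customer_len:
            current = line[:key_len]  # remember the customer, emit nothing
        else:
            result.append((current, line))  # one output row per product
    return result


def make_customer(key):
    """Build a hypothetical 12-character customer record starting with `key`."""
    return key + "B" * 10


# Hypothetical records following the pattern of the example in the question:
# customer records at positions 1, 5 and 8, product records in between.
records = [
    make_customer("K1"), "prodA", "prodB", "prodCC",
    make_customer("K2"), "prodD", "prodEE",
    make_customer("K3"), "prodF",
]
rows = attach_customer(records, customer_len=12)
```

Because the customer key travels with every product row, the rows can later be sorted or split without losing the link to the customer.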

Thank You!
It works!

PS
I defined a new field "if_filter" (in ),
and my "if" is something like this:
if_filter = "";
if (substring($0.Input_origin_Line, 0, 2) != "B2") {
  if_filter = "to filter";
}
So in the next step I can filter the records with if_filter = "to filter".

You can do it the way you have written, or you can put the condition directly into the filter:

<Node  id="EXT_FILTER0" type="EXT_FILTER" filterExpression="substring($0.Input_origin_Line, 0,2)!='B2'"/>
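
For comparison, here is the same condition as a standalone Python sketch (an illustration, not Clover code): it selects the lines whose first two characters are not "B2", mirroring the expression above. The function name `not_b2` and the sample lines are hypothetical.

```python
def not_b2(lines):
    """Return the lines whose first two characters differ from 'B2'."""
    return [line for line in lines if line[:2] != "B2"]


# Hypothetical sample lines; only the two-character prefix matters here.
sample = ["B2 product", "A1 customer", "B2 other", "C3 misc"]
selected = not_b2(sample)
```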

Hello,

What is the term for the ability to embed code within the XML graph description? Is that something Clover-specific?
Is the syntax Java, JavaScript, or only similar?
Is it embedded with <![CDATA …?

I mean e.g.:


<![CDATA[string customer;
function transform(){
if (length($0.text)==36){
customer = substring($0.text,0,2);
}
$0.customer := customer;
$0.text := $0.text;
}
]]>

Thank you,

Kind regards,
P.

Please see my reply in the new topic (CDATA in graph).