How do I extract values from list container of variable len?

Hi, I have a field that is of type string, basically a field with substrings separated by commas. It is a full address field so it can be of variable length. I want to separate this into separate fields for each value in the list.

I’m having huge difficulty doing this and would greatly appreciate some help. I thought about defining the field in the UniversalDataReader as a list container and then mapping the individual list values to separate fields (.e.g Add1,Add2,Add3 etc) however when I do this I get the following error with the graph failing at reading the data file:

ERROR caused by: Field [name:Address, type:string, containerType:list, position:27] cannot be deserialized from string. List and map container types are not supported. when parsing record #1 field Address

Previously I had thought about using regular expressions to extract the substrings however I wasn’t sure how to appropriately write this to the output fields either, however this is roughly what I came up with for a generic solution to my issue but wasn’t sure how to apply it correctly in Clover:

Here is a regular expression that will identify each of these substrings:
(?<=“)(([^”,]+?)(,{1} {1}))([^“,]+?)(,{1} {1})(([^”,]+?)(,{1} {1}))?(([^",]+?)(,{1} {1}))?

If I could then just output the following:
$2=Add1
$4=Add2
$7=Add3
$10=Add4
$13=Add5
$16=Add6
$18=Add7

Of course one of the other issues that needs dealing with is some of the records will have 3 values in the list while others may have up to 7 as it varies according to the address.

Hoping for a little light to be thrown on the solution because I’ve been wracking my brain over this and not been able to come up with a suitable solution that works. I’m hoping you’re a brighter soul than I and feel sympathetic enough of my plight to help me solve this.

Many thanks,

Sam

Hi Sam,

Basically, in order to get your data in better structured way, it would be good to load these list fields as ordinary strings. If you do it so, you can use Normalizer to split the data into more records (ID1, add1; ID1, add2,…ID1, addN). If you wish to gather these data back together into one record, you may use Pivot component which groups data sharing a particular key item into one record. Using list fields when you do not know the size of the list is not suitable, since you never know how many records of list you are supposed to handle.

More information about Pivot (http://doc.cloveretl.com/documentation/UserGuide/index.jsp?topic=/com.cloveretl.gui.docs/docs/pivot.html) and Normalizer (http://doc.cloveretl.com/documentation/UserGuide/index.jsp?topic=/com.cloveretl.gui.docs/docs/normalizer.html), please see our User’s Guide.