Normalize denormalized CSV

Hello,

i try to extract some data from a denormalized CSV-file.

Example:

name_first|name_last|data1,data2,data3,|\n

I want to normalize the data to this form:

name_first  name_last  data1
name_first  name_last  data2
name_first  name_last  data3

i don’t know exactly how to use the NORMALIZER tool. :cry:

Can you help me …?!

Greetings from germany

You have to “implement” these two functions inside Normalizer component (assuming you will use CTL):

function count() {
   return 3;
}

Function count() is called once for each input row/record. Its task is to tell component how many times it should call “transform()” function on input row. Each time, counter (counting from 0…) is passed into “transform()” and it has to produce one output record.

function transform(idx) {
                 $0.name_first:=$0.name_first;
                 $0.name_last:=$0.name_last;
                 switch (idx) {
                    case 0: $0.data=$0.data1;
                    case 1: $0.data=$0.data2;
                    case 2: $0.data=$0.data3;
                  }
 }

I assume that output metadata of Normalizer contains 3 fields (first_name,last_name,data).

Thanks a lot for the fast answer.

I tried it, but the there is an other problem. The number of data in the denormalized value varies between 1 and 10.

In the count function i try to get the number of data values and in the transform function i try to cut out the related data.

function count() {
	int i = 0;
	int pos = 0;
	string myString = $0.M_ACTORS;
	while( pos = index_of( myString,",",pos ) ){
		i++;
		pos++;
	}
	return i;
}

function transform(idx) {
	int posStart = 0;
	int posEnd = 0;
	int i = 0;
	for( i; i < idx; ++i) {
		posStart = posEnd;
		posEnd = index_of( $0.M_ACTORS,",",posStart+1 );
	}
	$0.M_NAME := $0.M_NAME;
	$0.M_STUDIO := $0.M_STUDIO;
	$0.M_ACTOR := substring( $0.M_ACTORS,posStart,posEnd-posStart );
	$0.M_CLASS := $0.M_CLASS;
}

Do you have any experience in this problem ?

Andreas Mack

You may do something like this:


list values;

function count() {
   trunc(values);
   values=split($0.M_ACTORS,",");
   return length(values);
}

function transform(idx) {
   $0.M_NAME := $0.M_NAME;
   $0.M_STUDIO := $0.M_STUDIO;
   $0.M_ACTOR := values[idx]; 
   $0.M_CLASS := $0.M_CLASS;
}

[/code]

Thanks a lot … it works ! :lol: