EXT_SORT question

Heya,

Our legacy ETL tool has the ability to do a “stable sort” and I am trying to fgure out if Clover has the same.

For example, if I have a set of data:

PCL,Owner
2,Fred
1, Mary
2,Joe
1,Stan

and I tell it to “stable sort” on pcl, it will always sort as:

PCL,Owner
1, Mary
1,Stan
2,Fred
2,Joe

That is, the sorted set always keeps the raw row order within the set, so you would never end up with a sorted set like:

PCL,Owner
1,Stan
1, Mary
2,Fred
2,Joe

because the “Mary” row came before the “Stan” row in the raw input. We are using the EXT_SORT node - does this perform a “stable sort” or could the order within the sorted rows vary?

Thanks,
Anna

Hello Anna,
to guarantee that order of output records is always the same you have to use compound sort key; then each set, sorted due to primary key is, additionally, sorted due to secondary key.

Heya,

There is no secondary key.

I guess you could say the original raw input “row number” is that secondary key. Occasionally our business folks get a file that they want to sort on a primary key, but then they want to make sure that within a set of the primary key that the rows retain their original relative position within the file to each other. Going back to my example, if I have:

PCL,Name
2,Joe
1,Mary
2,Fred
1,Stan

and I sort on PCL, I would always want to get:

PCL,Name
1,Mary
1,Stan
2,Joe
2,Fred

There’s no “secondary key” that would get me that sort, right?

The problem comes because our business folks may add a sequenced field (after the sort) to the input, like:

PCL,Name,Seq
1,Mary,1
1,Stan,2
2,Joe,1
2,Fred,2

and when the file rolls around next year (the source always has the same raw order), they would like the same sequence assigned to the same row as last year.

The only workaround I can see is if I add a reformat node to add the “raw” row number upon input, then use that row in my EXT_SORT key. Does that seem right?

Thanks,
Anna

Yes, you’re right, but the better (faster) way to add such raw number is to use auto filling feature.

Heya,

Thanks! I will try that.

Anna