MERGE_JOIN: slave record out of order?

Hi,

I am back from holidays. Happy new year to all of you…

OK, I am still experimenting and have come across an interesting case:

- I have a graph that queries two tables in a database, then passes the results to a MERGE_JOIN that outputs to a table in a different database.

Running on a given month, it works perfectly well.
On another month (same query, but more data), it fails with this exception :

Node MERGE_JOIN0 finished with fatal error: org.jetel.exception.JetelException : Slave record out of order!

What does this mean?

Franck

PS : should I post graphs on this forum, or part of them, or is it considered noise ?

Hi Martin,

yes, my slave data is ordered by join keys :

Select on slave (in:1) is :
select * from oeccp where chmoeccp = 'STR'
order by etsoeccp, cgroeccp

and on master (in:0) :
select * from obbud where butobbud = 'MB'
and etsobbud = '04TFN' and moiobbud = '20051101'
order by etsobbud, cgrobbud

Join node is :

Hi Franck!
See the documentation for the mergeJoin component: http://cloveretl.berlios.de/docs/Clover … tml#0_0_75

Are you sure that the slave input data is sorted by the join key?

Martin

Hi David,

I have found the issue: it is indeed a PostgreSQL problem, not a Clover one.

In PostgreSQL, the ORDER BY clause is sensitive to the locale the database was created with, and the collation order of the fr_FR locale ignores spaces. So my data is not returned in ASCII order but rather in dictionary order…
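
To illustrate the effect, here is a minimal Python sketch (not PostgreSQL or Clover code). It emulates a space-ignoring collation by stripping spaces before comparing, which is only a rough simplification of what fr_FR really does; the sample values are made up.

```python
# Made-up key values; one of them contains a space.
values = ["AC", "A B", "AA"]

# Plain code-point (ASCII) order: the space (0x20) sorts before any letter.
byte_order = sorted(values)

# Rough emulation of a "dictionary" collation that ignores spaces
# (an assumption for illustration, not the real fr_FR rules).
dictionary_order = sorted(values, key=lambda s: s.replace(" ", ""))

# A consumer that compares keys in plain code-point order sees the
# dictionary-ordered list as NOT ascending.
dictionary_is_ascending = all(
    a <= b for a, b in zip(dictionary_order, dictionary_order[1:])
)

print(byte_order)
print(dictionary_order)
print(dictionary_is_ascending)
```

With these values the two orderings differ, so a list that the database considers sorted can still look unsorted to a plain string comparison.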

I must find a PostgreSQL workaround for that…

Thanks,
Franck

Hi !

Are you sure that those two different sets of fields contain the same data?
Even a small difference, such as char versus varchar, may cause this. The same goes for char versus number/integer: comparing “09” and “9” gives different results as strings and as numbers.
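
A small Python illustration of the “09” versus “9” point (just a sketch of the general comparison behaviour, not of how Clover compares fields internally):

```python
# The same two key values compared as strings and as numbers.
a, b = "09", "9"

as_strings = a < b            # character comparison: "0" sorts before "9"
as_numbers = int(a) < int(b)  # numeric comparison: 9 is not less than 9

# The same keys sorted both ways end up in a different order.
keys = ["9", "10", "09"]
sorted_as_strings = sorted(keys)
sorted_as_numbers = sorted(keys, key=int)

print(as_strings, as_numbers)
print(sorted_as_strings)
print(sorted_as_numbers)
```

If one side of the join is sorted as numbers and the other is compared as strings, the keys can appear out of order even though each side was sorted.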

The error basically means that, while processing data, Clover encountered a record (on the slave port) whose key compares as less than the previous record's key, i.e. it should precede the previous one: the data is not in ascending order.
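
As a rough sketch of that check (in Python, with a hypothetical helper name and made-up keys; this is not Clover's actual implementation), each key is compared with its predecessor and the first one that is smaller gets reported:

```python
def first_out_of_order(keys):
    """Return the index of the first key smaller than its predecessor, or None."""
    for i in range(1, len(keys)):
        if keys[i] < keys[i - 1]:
            return i
    return None


# Made-up two-field join keys; the third one should precede the second.
slave_keys = [("04TFN", "A1"), ("04TFN", "A3"), ("04TFN", "A2")]
sorted_keys = sorted(slave_keys)

print(first_out_of_order(slave_keys))
print(first_out_of_order(sorted_keys))
```

Running something like this over the key columns of the slave data is one way to find the exact record that triggers the exception.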

David.

Hello Franck !

In this case, you may try to sort the data inside Clover: use the EXT_SORT component for this. It may be slower than sorting on the DB side, but it would probably solve your locale issue.
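
The idea can be sketched in Python as follows. This is only an illustration of "sort with the same comparison the join uses, then merge"; the function name and sample rows are hypothetical, and it does not reflect how EXT_SORT or MERGE_JOIN are implemented.

```python
def merge_join(master, slave, key):
    """Inner-join two lists that are both sorted ascending by `key`."""
    out = []
    i = j = 0
    while i < len(master) and j < len(slave):
        km, ks = key(master[i]), key(slave[j])
        if km < ks:
            i += 1
        elif km > ks:
            j += 1
        else:
            # Find the run of slave records sharing this key.
            j_end = j
            while j_end < len(slave) and key(slave[j_end]) == km:
                j_end += 1
            # Pair every master record with that key against the run.
            while i < len(master) and key(master[i]) == km:
                for s in slave[j:j_end]:
                    out.append((master[i], s))
                i += 1
            j = j_end
    return out


def key(row):
    return row[0]


# Made-up rows arriving in a space-ignoring "dictionary" order.
master_rows = [("AA", "m1"), ("A B", "m2")]
slave_rows = [("AA", "s1"), ("A B", "s2"), ("AC", "s3")]

# Re-sort both inputs with the same plain string comparison the join uses.
master_sorted = sorted(master_rows, key=key)
slave_sorted = sorted(slave_rows, key=key)

joined = merge_join(master_sorted, slave_sorted, key)
print(joined)
```

The point is that the sort and the join then agree on what "ascending" means, whatever order the database returned.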

Support for collators defined on fields in Clover is in our backlog. I hope it will be implemented within the next two releases of Clover.

David.

Adding the SORT component does the trick.

It might be a little slower, but not by much: it took 42 sec to join 34000 and 97000 tuples and output them to a flat table, versus 39 sec to get the failure without the sort component…

Thanks for the tip.