Question On HASH JOIN

Hi ,

I am new to clover etl and I am making changes to an existing clover ETL application and here is where I was getting stuck :

I have two table A and B and I am extracting columns from the database A to the Datamart table A_DM using HASH JOIN :



${out.0.EMP_PHONE_NUM} = ${in.1.EMP_PHONE};

How does the HASH JOIN know how to join the 2 tables , what I mean by that is does the type HASH JOIN know the primary keys , foreign keys ? If so how does it know ?

Hello,
you have to define keys in your Joiner: attribute joinKey should look like $EmployeeID=$EmployeeID;#, where the left side defines primary key and the right side - foreign key. When the key name is in both cases the same, you can use only joinKey=EmployeeID like in your example. For complete documentation see HashJoin component; checking HashJoin example could be useful too.

Thanks Agata , that did help me !!

Further more the issue I am dealing with is is that I want to Join three tables and all the three tables do not have any one key in common , for example :

Tables :

  1. Employee : Key : Employee_ID
  2. Customer : Key : Employee_ID, ORDER_ID
  3. Order : Key : Order_ID

Can i do some thing like this which would join these 3 tables :


Does this provider a proper join since only one table has common keys, in this case that is the “Customer” table.

If not I was planning on Joining the Employee & Customer and then take the result of this join and join it with Order ?

Second part of the question :
I want to use the following type DB_INPUT_TABLE based on the previous join :

<Node id=“UPD_FACT_TABLE”
type=“DB_INPUT_TABLE”
commit=“12345”
batchMode=“Y”
dbConnection=“ORACLE_DATAMART”
dbColumns=“ORDER_ID:EMPLOYEE_ID”
sourceTable=“MY_DB.TRACK_ORDER”
sqlQuery="select TRACKING_TYPE from TRACK_ORDER where ORDER_ID = ? and EMPLOYEED_ID = ? />

How does the ‘?’ know that they have to be replaced buy ORDER_ID and EMPLOYEE_ID from the previous HASH_JOIN ??

Please let me know, I am trying various options and unable to get any thing working.

Hello,
your key for three tables is proper or you can write it more explicitly: joinKey=$Employee_ID=$Employee_ID;#$ORDER_ID=$Order_ID;#". For joining data from database you can’t use DBInputTable - this only for reading data; for joining data use DBJoin component. See also examples: DBJoin example and DBUnloadParametrized example.[/i]

Hi Agata,

I am using version 2.1 and it only offers HASH_JOIN and it does not support 3 way join. so I tried doing some thing like this.

THE EDGE looks like this

where the superset involves the metadata of all the three tables : employee,customer,order.

The issue is JOIN_TABLE_1_&_2-WITH_3 is refusing to accept the join key: ORDER_ID=Order_ID and throws a message that

java.lang.RuntimeException: Field name specified as a key doesn’t exist: ORDER_ID=Order_ID
at org.jetel.data.RecordKey.init(RecordKey.java:117)
at org.jetel.component.HashJoin.init(HashJoin.java:299)
at org.jetel.graph.Phase.init(Phase.java:150)

In such old version use joinKey attribute for primary key and slaveOverrideKey for foreign key. By the way why do use so old unsupported version?

I am working on an existing application, and I have a very specific change and the time line doe not allow me to change the version and retest the entire application. That is the reason I am stuck to the older version.

Hi Agata,

I did a work around the joins, what I did was:

  1. created a View in the data base which select all the columns.

  2. Call the recordTransformDefaults guy and the I ma trying to insert.

But along the path when I reach a particular point it throws he this error :

component.DBOutputTable (run:395) - RESULT_ERROR:
java.sql.SQLException: Incompatible Clover & JDBC field types Clover type: string

I checked the metadata for the particular column its of the type string in both the places i.e from and to coloumn.

Any Ideas ?

Try to extract metadata from db (in GUI or by AnalyzeDB) and then check fields’ types.