Hi,
I want to load some data in my database from a text file. The data base already have some data loaded. This new file has some incremental data out of which some are new and some are just update of previous data. For example:
Current database
------------------------
Currency_Code|Description
USD|US Dollar
INR|Indian Rupee
Incremental File
-------------------------
Currency_Code|Description
USD|United State Dollar
JPY|Japanese Yen
EUR| Euro
Here description of USD is changed which I need to update in database and the new rows needs to be added. How can I achieve this using clover graph.
Any help will be appreciated.
Thanks!
Amit
Hello,
following graph resolves the issue:
<?xml version="1.0" encoding="UTF-8"?>
<Graph author="avackova" created="Thu Jun 25 11:52:59 CEST 2009" guiVersion="0.0.0.devel" id="1245923673107" licenseType="Evaluation license." modified="Thu Jun 25 12:01:06 CEST 2009" modifiedBy="avackova" name="incrementalData" revision="1.10">
<Global>
<Metadata id="Metadata0">
<Record fieldDelimiter="|" name="data" recordDelimiter="\n" type="delimited">
<Field name="Currency_Code" type="string"/>
<Field name="Description" type="string"/>
</Record>
</Metadata>
<Connection dbConfig="conn/postgre.cfg" id="Connection0" type="JDBC"/>
<Property fileURL="workspace.prm" id="GraphParameter0"/>
</Global>
<Phase number="0">
<Node enabled="enabled" fileURL="${DATAIN_DIR}/test_data.txt" guiHeight="0" guiName="UniversalDataReader" guiWidth="0" guiX="66" guiY="89" id="DATA_READER0" type="DATA_READER"/>
<Node dbConnection="Connection0" enabled="enabled" guiHeight="0" guiName="DBJoin" guiWidth="0" guiX="278" guiY="88" id="DBJOIN0" joinKey="Currency_Code" metadata="Metadata0" sqlQuery="select * from mytable where Currency_Code=?" type="DBJOIN">
<attr name="transform"><![CDATA[//#TL
// Transforms input record into output record.
function transform() {
$0.Currency_Code := $1.Currency_Code;
$0.Description := $1.Description;
}
// Called during component initialization.
// function init() {}
// Called after the component finishes.
// function finished() {}
]]></attr>
</Node>
<Node dbConnection="Connection0" enabled="enabled" guiHeight="0" guiName="DBOutputTable" guiWidth="0" guiX="507" guiY="42" id="DB_OUTPUT_TABLE0" sqlQuery="update mytable set Description=$Description where Currency_Code=$Currency_Code" type="DB_OUTPUT_TABLE"/>
<Node dbConnection="Connection0" dbTable="mytable" enabled="enabled" guiHeight="0" guiName="DBOutputTable" guiWidth="0" guiX="509" guiY="142" id="DB_OUTPUT_TABLE1" type="DB_OUTPUT_TABLE"/>
<Edge fromNode="DATA_READER0:0" guiBendpoints="" id="Edge0" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (output)" toNode="DBJOIN0:0"/>
<Edge fromNode="DBJOIN0:0" guiBendpoints="" id="Edge1" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (joined records)" toNode="DB_OUTPUT_TABLE0:0"/>
<Edge fromNode="DBJOIN0:1" guiBendpoints="" id="Edge2" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 1 (skipped records)" toNode="DB_OUTPUT_TABLE1:0"/>
</Phase>
</Graph>
Hi,
Thanks for your reply. I am using DB2 data base. Also I am not using clover GUI.
I have modified my graph accordingly and tried running it.
<?xml version="1.0" encoding="UTF-8"?>
<![CDATA[//#TL
function transform() {
$0.CODE := $1.CODE;
$0.DESCRIPTION := $1.DESCRIPTION;
}
]]>
It is giving me an error message shown below:
DB_OUTPUT_TABLE1 …FAILED !
at org.jetel.graph.Phase.init(Phase.java:161)
at org.jetel.graph.runtime.WatchDog.executePhase(WatchDog.java:565)
at org.jetel.graph.runtime.WatchDog.call(WatchDog.java:155)
at org.jetel.graph.runtime.WatchDog.call(WatchDog.java:70)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
at java.util.concurrent.FutureTask.run(FutureTask.java:123)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExec
utor.java:650)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor
.java:675)
at java.lang.Thread.run(Thread.java:595)
Caused by: Element [DB_OUTPUT_TABLE1:]-The load module name for the stored proce
dure on the server is not found. Contact your DBA.
at org.jetel.component.DBOutputTable.init(DBOutputTable.java:525)
at org.jetel.graph.Phase.init(Phase.java:158)
… 8 more
ERROR [WatchDog] - !!! Phase finished with error - stopping graph run !!!
INFO [WatchDog] - -----------------------** Summary of Phases execution **-----
----------------
INFO [WatchDog] - Phase# Finished Status RunTime(sec) Mem
oryAllocation(KB)
INFO [WatchDog] - 0 N/A
INFO [WatchDog] - ------------------------------** End of Summary **-----------
----------------
INFO [WatchDog] - WatchDog thread finished - total execution time: 14 (sec)
Freeing graph resources.
Execution of graph failed !
Can you please explain me the reason for this. Appreciate your help.
Regards,
Amit
Hello, it seems that there is problem with SAO_EXPERIMENT table in your database. Try to set query instead of dbTable in your DBOutputTable component.
What CloverETL version do you use?
Hi
After changing the DB_OUTPUT_TABLE1 to SQLQuery my new records are getting uploaded successfully but old records are still not getting updated.
<?xml version="1.0" encoding="UTF-8"?>
<Graph name="SAO_EXPERIMENT">
<Global>
<Connection dbConfig="../conn/db2.cfg" id="conn" type="JDBC"/>
<Metadata id="SAO_EXPERIMENT_META">
<Record fieldDelimiter="|" name="SAO_EXPERIMENT_META" recordDelimiter="\r\n" type="delimited">
<Field name="CODE" size="3" type="string" />
<Field name="DESCRIPTION" size="25" type="string" />
</Record>
</Metadata>
<Property fileURL="../files/upload/EXPERIMENT.dat" id="GraphParameter0"/>
</Global>
<Phase number="0">
<Node enabled="enabled" fileURL="../files/upload/EXPERIMENT.dat" id="DATA_READER0" type="DATA_READER"/>
<Node id="DBJOIN0" dbConnection="conn" enabled="enabled" joinKey="CODE" metadata="SAO_EXPERIMENT_META" sqlQuery="select * from SAO_EXPERIMENT where CODE=?" type="DBJOIN">
<attr name="transform"><![CDATA[//#TL
function transform() {
$0.CODE := $1.CODE;
$0.DESCRIPTION := $1.DESCRIPTION;
}
]]></attr>
</Node>
<Node id="DB_OUTPUT_TABLE0" dbConnection="conn" enabled="enabled" sqlQuery="update SAO_EXPERIMENT set DESCRIPTION=$DESCRIPTION where CODE=$CODE" type="DB_OUTPUT_TABLE"/>
<Node id="DB_OUTPUT_TABLE1" dbConnection="conn" dbTable="SAO_EXPERIMENT" sqlQuery = "insert into SAO_EXPERIMENT(CODE,DESCRIPTION) values(?,?)" cloverFields="CODE;DESCRIPTION" type="DB_OUTPUT_TABLE"/>
<Edge fromNode="DATA_READER0:0" id="Edge0" inPort="Port 0 (in)" metadata="SAO_EXPERIMENT_META" outPort="Port 0 (output)" toNode="DBJOIN0:0"/>
<Edge fromNode="DBJOIN0:0" id="Edge1" inPort="Port 0 (in)" metadata="SAO_EXPERIMENT_META" outPort="Port 0 (joined records)" toNode="DB_OUTPUT_TABLE0:0"/>
<Edge fromNode="DBJOIN0:1" id="Edge2" inPort="Port 0 (in)" metadata="SAO_EXPERIMENT_META" outPort="Port 1 (skipped records)" toNode="DB_OUTPUT_TABLE1:0"/>
</Phase>
</Graph>
For example I have values in Data base as
ABC|Old ABC
CDF|Old CDF
My text file contains:
ABC|Updated ABC
XYZ|New XYZ
After running the graph I am getting below result:
ABC|Old ABC
CDF|Old CDF
XYZ|New XYZ
which means the skipping existing data is working but some problem in update.
I am using CloverETL framework/transformation graph runner ver 2.4
Truly appreciate your help.
Regards,
Amit
2.4 is really old and already unsupported CloverETL version. The best way would be to update CloverETL to newest (2.7.2) version. With this old version you can try change your update query to:
update SAO_EXPERIMENT set DESCRIPTION=? where CODE=?
and use cloverFields attribute (cloverFields=DESCRIPTION;CODE)
Thank you for your help. I have changed my graph accordingly as shown below:
<?xml version="1.0" encoding="UTF-8"?>
<Graph name="SAO_EXPERIMENT">
<Global>
<Connection dbConfig="../conn/db2.cfg" id="conn" type="JDBC"/>
<Metadata id="SAO_EXPERIMENT_META">
<Record fieldDelimiter="|" name="SAO_EXPERIMENT_META" recordDelimiter="\r\n" type="delimited">
<Field name="CODE" size="3" type="string" />
<Field name="DESCRIPTION" size="25" type="string" />
</Record>
</Metadata>
<Property fileURL="../files/upload/EXPERIMENT.dat" id="GraphParameter0"/>
</Global>
<Phase number="0">
<Node enabled="enabled" fileURL="../files/upload/EXPERIMENT.dat" id="DATA_READER0" type="DATA_READER"/>
<Node id="DBJOIN0" dbConnection="conn" enabled="enabled" joinKey="CODE" metadata="SAO_EXPERIMENT_META" sqlQuery="select * from SAO_EXPERIMENT where CODE=?" type="DBJOIN">
<attr name="transform"><![CDATA[//#TL
function transform() {
$0.CODE := $1.CODE;
$0.DESCRIPTION := $1.DESCRIPTION;
}
]]></attr>
</Node>
<Node id="DB_OUTPUT_TABLE0" dbConnection="conn" enabled="enabled" sqlQuery="update SAO_EXPERIMENT set DESCRIPTION=? where CODE=?" cloverFields="DESCRIPTION;CODE" type="DB_OUTPUT_TABLE"/>
<Node id="DB_OUTPUT_TABLE1" dbConnection="conn" dbTable="SAO_EXPERIMENT" sqlQuery = "insert into SAO_EXPERIMENT(CODE,DESCRIPTION) values(?,?)" cloverFields="CODE;DESCRIPTION" type="DB_OUTPUT_TABLE"/>
<Edge fromNode="DATA_READER0:0" id="Edge0" inPort="Port 0 (in)" metadata="SAO_EXPERIMENT_META" outPort="Port 0 (output)" toNode="DBJOIN0:0"/>
<Edge fromNode="DBJOIN0:0" id="Edge1" inPort="Port 0 (in)" metadata="SAO_EXPERIMENT_META" outPort="Port 0 (joined records)" toNode="DB_OUTPUT_TABLE0:0"/>
<Edge fromNode="DBJOIN0:1" id="Edge2" inPort="Port 0 (in)" metadata="SAO_EXPERIMENT_META" outPort="Port 1 (skipped records)" toNode="DB_OUTPUT_TABLE1:0"/>
</Phase>
</Graph>
The graph is running successfully. But the old data is still not getting updated. I am not sure where i am making the mistake.
INFO [WatchDog] - ----------------------** Final tracking Log for phase [0] **-
--------------------
INFO [WatchDog] - Time: 29/06/09 18:14:40
INFO [WatchDog] - Node Status Port #Records
#KB Rec/s KB/s
INFO [WatchDog] - -------------------------------------------------------------
---------------------
INFO [WatchDog] - DATA_READER0 FINISHED_OK
INFO [WatchDog] - %cpu:.. Out:0 4
0 0 0
INFO [WatchDog] - DBJOIN0 FINISHED_OK
INFO [WatchDog] - %cpu:0.01 In:0 4
0 0 0
INFO [WatchDog] - Out:0 3
0 0 0
INFO [WatchDog] - Out:1 1
0 0 0
INFO [WatchDog] - DB_OUTPUT_TABLE0 FINISHED_OK
INFO [WatchDog] - %cpu:.. In:0 3
0 0 0
INFO [WatchDog] - DB_OUTPUT_TABLE1 FINISHED_OK
INFO [WatchDog] - %cpu:.. In:0 1
0 0 0
INFO [WatchDog] - ---------------------------------** End of Log **------------
--------------------
INFO [WatchDog] - -----------------------** Summary of Phases execution **-----
----------------
INFO [WatchDog] - Phase# Finished Status RunTime(sec) Mem
oryAllocation(KB)
INFO [WatchDog] - 0 FINISHED_OK 1
3058
INFO [WatchDog] - ------------------------------** End of Summary **-----------
----------------
INFO [WatchDog] - WatchDog thread finished - total execution time: 14 (sec)
Freeing graph resources.
Execution of graph successful !
I believe the join function is unable to filter out the results. Can you suggest how can I change the function so that it work in my old version.
Regards,
Amit
Hello Amit,
your transformation function is wrong. You assign to the output records values from database instead of input record. So your function should be:
function transform() {
$0.CODE := $0.CODE;
$0.DESCRIPTION := $0.DESCRIPTION;
}
or, as it is default function, you can skip this attribute at all.
Hi,
Thanks a lot for your support. Finally I am able to run the graph successfully. Truly appreciate your help.