Parallel Inserts HELP

Greetings!

I have an issue with load performance in my Oracle (10G) database.

Source table has 200 millions rows. A simple graph with the following load logic:
DB_INPUT (select * from table where product in ‘A’,‘B’)
SIMPLE COPY
DB_OUTPUT (target table1)

This runs well but VERY SLOW: 2 (two) hours with Clover when I can do the same load using SQL*Loader in 1 (one) hour.

I want to perform 2 (two) parallel INSERT statements on the same database table at a time. I run via shell script 2 graphs with the following load strategy:

DB_INPUT (select * from table where product=A)
SIMPLE COPY
DB_OUTPUT (target table 1)

DB_INPUT (select * from table where product=B)
SIMPLE COPY
DB_OUTPUT (target table 1)

At random, one of the scripts constantly fail with a DEADLOCK error from my Oracle database, which makes sense.

What can be done to speed up the load?

I welcome any advice! Thank you.

Alan.

Hi David,

I did some more testing and got a great improvement!!

I used “FetchSize=…” for DBINPUT and removed the SimpleCopy. The execution time went from 2 hours down to 1 hour!!!

Working with my DBA, I can confirm what you stated: the JDBC values in my CONFIG file are not in use when giving parameters to my components.

This makes more confident about not having to tinker with JDBC properties. I like to rely on Clover instead.

Thank you so much for taking the time to reply to our emails.

Thank you for your hard work.

Alan.

PS: FYI, I am using “CloverETL framework - 1.1.7”

Hello Alan !

To speed up the process (a bit), you can remove the SimpleCopy - it should not be necessary. Also, by using latest development release of Clover, the process could improve in speed - there has been implemented new component-component connection.

Giving the fact, that your graph is using JDBC connection to both databases, the double time of Clover versus SQLLoader is not so bad :wink:

Other than that, you can only experiment with several JDBC driver’s parameters.

The solution would be implementing component which would go to Oracle directly instead through JDBC.

David.