I have multiple copy jobs. I want to execute some of them in parallel. I also want to execute some of them sequentially. During the sequential execution phase, if any job fails, I want to stop completely.
How would you do that? Perhaps you would write a Java program or a shell script. Does CloverETL have this kind of feature built in?
If you wanted to build a flow chart, is there an open source project for that?
-Carl
Wrong thread… please ignore my message.
O.K. Now I understand it a bit more. What you can do is put each graph (each copy job) into its own phase. Phases run in sequence, so if the first phase fails, the second won’t be executed. The other possibility is to create two separate graphs and a shell wrapper that checks the return value of each executed application. If the first indicates a problem, don’t execute the second.
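A minimal sketch of the wrapper idea, written in Python rather than shell. The function name `run_in_sequence` is made up for illustration, and the commands are harmless stand-ins: the actual command line for launching a Clover graph is not shown here, so substitute your own.

```python
import subprocess
import sys


def run_in_sequence(commands):
    """Run each command in order and stop at the first non-zero exit code.

    Returns (completed, exit_code): the number of commands that succeeded
    and the exit code of the one that failed (0 if none failed).
    """
    for index, command in enumerate(commands):
        result = subprocess.run(command)
        if result.returncode != 0:
            # A non-zero return value signals a problem: skip the rest.
            return index, result.returncode
    return len(commands), 0


if __name__ == "__main__":
    # Placeholder jobs (assumption): replace each with the real command
    # that runs one of your graphs.
    first_job = [sys.executable, "-c", "print('copying table 1')"]
    second_job = [sys.executable, "-c", "print('copying table 2')"]
    completed, exit_code = run_in_sequence([first_job, second_job])
    print("jobs completed:", completed, "exit code:", exit_code)
```

This only works if each job reports failure through its exit code, which is the assumption the shell-wrapper approach relies on anyway.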
Another possibility is to use environment-like variables in your graph definition. This would give you a kind of template that can easily be filled from external metadata: users could simply assemble a list of tables to be transferred, and your application would take the list and execute the template for each item.
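To illustrate the template idea, here is a rough sketch in Python. It does not use Clover's own variable syntax; the `${table}` placeholder, the template text, and the function names are all assumptions made for the example. The table list is plain text, one name per line, which is about as simple as it gets for a non-programmer to edit.

```python
from string import Template


def parse_table_list(text):
    """Return table names from plain text, one per line.

    Blank lines and lines starting with '#' are ignored.
    """
    tables = []
    for line in text.splitlines():
        name = line.strip()
        if name and not name.startswith("#"):
            tables.append(name)
    return tables


def render_jobs(template_text, tables):
    """Fill the ${table} placeholder once per table name."""
    template = Template(template_text)
    return [template.substitute(table=name) for name in tables]


if __name__ == "__main__":
    # Hypothetical template; a real one would be your graph definition
    # with a placeholder where the table name goes.
    job_template = "copy ${table} from source to target"
    table_list = "customers\n# added by a sales engineer\norders\n"
    for job in render_jobs(job_template, parse_table_list(table_list)):
        print(job)
```

Each rendered item could then be handed to whatever actually runs the graph.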
As for the visual graph definition, there is one commercial application tailored for Clover, developed by a company named NETQI (www.netqi.com). Recently, I received word from somebody trying to develop an open source variant of the GUI.
As far as I know, there are several free Java packages/libraries for building flow-chart-like applications: one is JGraph, another is GEF.
I want to do lots of things so it’s hard to give you one concrete picture that conveys everything. Just suppose I have 2 tables in a source database and I want to move them to a target database. So I create 2 simple graphs. One for each table. I’m calling them copy jobs because there is no transformation, just copying each row from source to target.
Now suppose I want to run the first graph and then “log” the number of records that were transferred. Then I want to run the next graph.
And the tricky part is if the first graph fails, I don’t want to run the second graph.
I could easily build a Java program or something to do this. I’m a software engineer, so that’s not a big deal for me.
But, I want to make it easy for sales engineers or even end customers to add their own graphs in case there are other tables besides the “builtin” tables.
So, it would be nice if the sales engineer or end user could use a nice GUI to edit a “flowchart” or some such thing. Wouldn’t it be nice if they could drag a “graph” node onto a chart and hook it up at the end of my flowchart?
A flowchart for this kind of thing is really overkill. I mean it would be simple enough to just use a text file (maybe XML format). But even XML might be overkill for this kind of thing.
But, really a flowchart would be perfect. The question is where can I find a flowchart designer for free?
And I wanted to know what other Clover users do for this kind of situation.
I have used AbInitio, one of the best high-performance ETL tools out there (very, very expensive); that’s where I used such a component.
Thanks
Akhil
Hi Carl !
If I understand correctly, you want to do a simple “cp file1 file2” command, without looking at the content of the file.
Well, I don’t think Clover is the proper tool for this. You can execute things in parallel and in series (combining parallel flows with sequential stages), but for this you would need to create a special component that executes a shell script or simply runs a program. That is too much hassle for a simple thing; you could probably achieve the same by writing a simple shell script wrapper.
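For what it is worth, a wrapper that combines a parallel stage with a sequential stage can be quite short. The sketch below uses Python instead of shell; the function names are invented for the example and the commands are placeholders for whatever actually performs each copy. It assumes that a failure in the parallel stage should also skip the sequential stage, which the original question does not spell out.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run(command):
    """Run one command and return its exit code."""
    return subprocess.run(command).returncode


def run_parallel(commands):
    """Run all commands at the same time; return exit codes in input order."""
    with ThreadPoolExecutor(max_workers=len(commands) or 1) as pool:
        return list(pool.map(run, commands))


def run_pipeline(parallel_commands, sequential_commands):
    """Parallel stage first, then the sequential stage.

    Returns True only if every command exited with 0. The sequential
    stage stops at the first failure.
    """
    # Assumption: a failed parallel job means the sequential stage is skipped.
    if any(code != 0 for code in run_parallel(parallel_commands)):
        return False
    for command in sequential_commands:
        if run(command) != 0:
            return False  # stop completely, later jobs are not started
    return True


if __name__ == "__main__":
    # Placeholder jobs (assumption): swap in your real copy commands.
    job = [sys.executable, "-c", "print('copy job done')"]
    print("all succeeded:", run_pipeline([job, job], [job, job]))
```

Threads are enough here because each one just waits on an external process.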
As for the flow chart modelling, I am not sure. There are several projects that let you put together a graphical application with nodes connected by edges, but I am not aware of any that supports the notion of data flow (which doesn’t mean there is none).
Maybe if you describe what you are actually trying to accomplish, I could be of more help.
Sincerely,
David.