Pass Metadata and Filename as Parameter in Designer


pfield
Posts: 20
Joined: Thu Mar 19, 2020 3:46 pm

Postby pfield » Wed Mar 25, 2020 6:22 pm

Hi,

I would like to somehow pass in a filename and metadata to a graph via parameters. I did this before using Server but am now restricted to Designer and want to know if this is possible?

What I would like to happen is to have a parent graph feeding the child 2 parameters:
1) File Name (to be passed into the Reader URL)
2) Metadata (Metadata is externalized and would be used by the edge coming out of the Reader)

This way, I can use the same graph no matter how many files I need to read. Hoping there is a solution...

Thanks,
Paul

dpavlis
Posts: 189
Joined: Sat Mar 10, 2007 8:12 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby dpavlis » Thu Mar 26, 2020 6:29 pm

Hi Paul,

This is exactly what Server and JobFlow components are good for. If you need this to work in Designer without Server, you need to use parameters (an external parameter file) and re-create the content of the parameter file with new values each time before you run the child graph. There is also no direct way to execute a child graph from within a parent graph in Designer. Again, this is easily done on Server through JobFlow, but with Designer only it means manually executing the parent graph (which modifies the parameter file) and then manually executing the child graph.
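
The "re-create the parameter file before each run" step could be scripted outside Designer. What follows is only a rough sketch in Python, assuming the child graph reads a plain name=value parameter file. The file layout, the file name child_params.prm and the parameter names INPUT_FILE and METADATA_FILE are all hypothetical, so check them against the parameter file your own Designer version produces.

```python
from pathlib import Path


def write_param_file(path, params):
    """Overwrite the parameter file with one name=value line per parameter."""
    # Assumption: a simple name=value layout; adjust if your file is XML.
    lines = [f"{name}={value}" for name, value in params.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def next_run_params(input_file, metadata_file):
    """Build the values for the next child run (parameter names are hypothetical)."""
    return {"INPUT_FILE": input_file, "METADATA_FILE": metadata_file}
```

Calling write_param_file("child_params.prm", next_run_params("data/customers.csv", "meta/customers.fmt")) before each manual run of the child graph would swap in the new file name and metadata. Forward slashes are used in the paths because backslashes may need escaping in a file like this.
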
David Pavlis
CloverCARE Support
CloverDX | Rapid Data Integration

Visit us online at http://www.cloverdx.com

pfield
Posts: 20
Joined: Thu Mar 19, 2020 3:46 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby pfield » Thu Mar 26, 2020 8:14 pm

Thanks for the reply David.

Would there be any way of passing a parameter via RunGraph using command line arguments? See the related post below. This would allow for some form of synchronous execution.

https://forum.cloverdx.com/viewtopic.php?t=4576

Paul

dpavlis
Posts: 189
Joined: Sat Mar 10, 2007 8:12 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby dpavlis » Thu Apr 02, 2020 1:33 pm

Hello Paul,

If you define a parameter in your parent graph (where your RunGraph component is) and then define a parameter with the same name in your child graph, there is a way to pass the parameter's value from the parent graph to the child.
You can use the property "Graph parameters to pass" (multiple names may be specified, separated by semicolons). This applies when the "The same JVM" parameter is set to true.

If you want to run the two graphs in separate JVMs, the parameters have to be specified as "Command line arguments".
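
For the separate-JVM case, the value fed to "Command line arguments" is a string of -P:name=value pairs (the form mentioned later in this thread). A minimal Python sketch of building that string follows. The helper name is made up for illustration, and because the quoting rules for values containing spaces are not covered in this thread, the sketch rejects such values rather than guessing.

```python
def build_clover_args(params):
    """Return a string of -P:name=value pairs, one per graph parameter."""
    parts = []
    for name, value in params.items():
        text = str(value)
        # Quoting rules for whitespace are unknown here, so refuse instead of guessing.
        if any(ch.isspace() for ch in name) or any(ch.isspace() for ch in text):
            raise ValueError(f"whitespace is not handled by this sketch: {name!r}")
        parts.append(f"-P:{name}={text}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_clover_args({"Batch": 1, "Batch_Size": 10000}))
```

The parameter names passed this way still have to be defined in the child graph.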

I have attached 2 simple examples which illustrate that.
Attachments:
Parent_Child_RunGraph_SeparateJVM.zip (2.06 KiB)
Parent_Child_RunGraph_SameJVM.zip (1.96 KiB)
David Pavlis

pfield
Posts: 20
Joined: Thu Mar 19, 2020 3:46 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby pfield » Mon Jan 18, 2021 1:58 pm

Hi David,

Using a different JVM works well, as I can pass different values through "cloverCmdLineArgs" on the RunGraph input. One drawback I can see is that because it runs in a different JVM, I cannot follow the process in the execution log. Is there a way to navigate to the graph in the execution log or access the alternate JVM used?

Also, I see that "-P:Parameter Name=Parameter Value" is used to pass in the parameter value, so I wondered what else can be passed into the JVM via command line arguments. A link to any documentation would be much appreciated.

Thanks!
Paul

dpavlis
Posts: 189
Joined: Sat Mar 10, 2007 8:12 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby dpavlis » Thu Jan 21, 2021 9:29 am

Hi Paul,

Not sure what you are trying to achieve. If you would like the transformation graph executed in the separate JVM to log into a specific file, you can use the "Log file URL" parameter of the RunGraph component.

The "-P:" command line parameter is not for the JVM but for the second (other) Clover runtime that you end up executing. JVM parameters are listed in the official Java/JVM documentation; look for the "java" command. The most useful is probably -Xmx<size> (e.g. -Xmx1024M), which sets the maximum heap memory the JVM will use. This may be useful if the transformation executed through RunGraph needs a lot of memory to process all the data.

Nonetheless, as stated previously this is not the officially supported way of orchestrating several data transformation jobs, and it has many drawbacks.
David Pavlis

pfield
Posts: 20
Joined: Thu Mar 19, 2020 3:46 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby pfield » Thu Jan 21, 2021 1:46 pm

Hi David,

Thank you for the response - this is helpful.

When I launch a graph via RunGraph using the same JVM, I can track all graphs executed in the Execution Tab, click into them, see what data passed through the edges, etc. When using a different JVM I cannot: the Execution Tab only shows the "parent" graph that uses the RunGraph. The Log File URL attribute on the RunGraph is definitely helpful, so thank you for referencing it.

I appreciate that using Clover Server is the best way to orchestrate synchronous processing and is a far better solution. However, for my use case I need to work within the capabilities of the Designer product.

Thanks!
Paul

dpavlis
Posts: 189
Joined: Sat Mar 10, 2007 8:12 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby dpavlis » Fri Jan 22, 2021 11:21 am

Hi Paul,

I understand your situation. The problem with RunGraph executed from Designer (with "separate JVM" switched on) is that Clover ends up executing a totally independent process at the operating system level. Thus, it has no idea that another CloverDX transformation is running and producing tracking statistics etc. It only sees an OS-level process and treats it as if you had executed any other program.
The small advantage over using the SystemExecute component in this scenario is that, because you are using the RunGraph component, Clover knows that the OS-level process will be another Clover transformation and helps with passing some control data and parameters. But once the separate JVM and transformation are launched, Clover loses much of the control and can treat it only as if you had executed, for example, a shell script. All it has are STDIN/STDOUT/STDERR, the information whether the process has finished and, if so, its status code.
Therefore, the Execution Tab lacks information which is otherwise available (if you run within the same JVM or on CloverDX Server).
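
To illustrate how little a parent can see of an independent OS process, here is a small Python sketch. It is not CloverDX code: it launches a throwaway Python child that stands in for the separate JVM, and the parent learns only the exit status and whatever the child wrote to STDOUT and STDERR. The helper name and the child's messages are made up for the demonstration.

```python
import subprocess
import sys


def run_independent_process(command):
    """Run a command as a separate OS process; return (exit code, stdout, stderr)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr


# Stand-in child process: prints to both streams and exits with status 3.
DEMO_CHILD = [
    sys.executable,
    "-c",
    "import sys; print('rows processed: 42'); "
    "print('demo warning', file=sys.stderr); sys.exit(3)",
]

if __name__ == "__main__":
    code, out, err = run_independent_process(DEMO_CHILD)
    print("exit code:", code)
    print("stdout:", out.strip())
    print("stderr:", err.strip())
```

Anything the child tracks internally, such as per-edge record counts, never reaches the parent unless the child writes it to one of those streams or to a file.
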
David Pavlis

pfield
Posts: 20
Joined: Thu Mar 19, 2020 3:46 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby pfield » Sun Apr 25, 2021 9:07 pm

Hi David,

I have a follow up question relevant to this thread...

When passing parameters into a RunGraph (via Command line arguments) that calls a graph containing a subgraph, I am getting the following error in the log file:

CloverDX license for subgraphs is expired or not available


My question is, when not running in the same JVM, do we need to pass in any additional setup parameters linking to licenses etc?


Thanks,
Paul

pfield
Posts: 20
Joined: Thu Mar 19, 2020 3:46 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby pfield » Mon Apr 26, 2021 10:58 am

Hi,

Shortly after my prior post I observed in the documentation for RunGraph that you cannot run a graph with a subgraph when using a separate JVM. This would explain my issue.

However, when removing the subgraph, I got some more errors:

Secure parameters are supported only in CloverDX Server environment


This appears to be caused by using a secure parameter in the graph being executed. Is this also unsupported? For testing purposes, I removed the secure parameter and tried to execute again, but got the following error:

Cannot find class: jk.StreamFile


The StreamFile class is used by a CustomJavaTransformer component within the graph. Is this also unsupported?


Thanks,
Paul

jandikovae
Posts: 64
Joined: Fri Nov 04, 2016 8:51 am

Re: Pass Metadata and Filename as Parameter in Designer

Postby jandikovae » Thu May 13, 2021 6:56 am

Hi Paul,
The Secure Parameter functionality is available only in the Server environment. It requires setting the Master Password in the Server console, and the secure parameters are then automatically decrypted by the Server at graph runtime.

Regarding your other issue, I would appreciate some details about your use case and why you need to run the graph using the "separate JVM" function in the first place. As was mentioned before, this setup has certain limitations, and I believe there might be a workaround for your use case.

However, if you insist on having the RunGraph "The same JVM" parameter set to "false", you might want to try a different setup of your CustomJavaTransformer component. Unfortunately, I was not able to recreate the exact error message you are seeing, but I also ran into issues when CustomJavaTransformer was called by RunGraph with separate JVMs. This kind of behavior suggests that the custom class might not be on the classpath, for example. In my example, I was able to resolve the situation by setting the "Algorithm" property instead of "Algorithm class" in the CustomJavaTransformer component. Please give it a try and let me know if it works for you as well.

Best Regards,
Eva
---
Eva Jandikova
CloverCARE Support
CloverDX

Visit us online at http://www.cloverdx.com

pfield
Posts: 20
Joined: Thu Mar 19, 2020 3:46 pm

Re: Pass Metadata and Filename as Parameter in Designer

Postby pfield » Thu May 13, 2021 6:43 pm

Hi Eva,

Your recommendation of using the "Algorithm" property worked great - THANK YOU!

Regarding my use case, let me explain as best I can. I'm keen to hear if there is a better approach...

I have a graph which collects files from a large directory (100GB+) and converts them into base64 before passing them to a web service (via HTTPConnector) for processing. The files that I need to convert are only a subset (say 200k of 2m), and if I tried to read all those files into the graph runtime, my machine would definitely hit heap space problems.
To get around the problem, I used a CustomJavaTransformer that accepts a file path, reads the file as bytes and converts it to base64. This way I only ever hold the files I need in memory at runtime. This works well up to around 10-15k files, then I start to hit heap space issues.
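
For readers following along, the path-in, base64-out idea can be sketched roughly as below. This is a Python analogue for illustration only, not the CustomJavaTransformer code from the graph, and the function names are made up. The generator yields one encoded file at a time, so only the current file needs to sit in memory as long as the consumer does not collect the results.

```python
import base64
from pathlib import Path


def file_to_base64(path):
    """Read one file as bytes and return its content as an ASCII base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def encode_paths(paths):
    """Yield (path, base64 text) pairs lazily, one file at a time."""
    for path in paths:
        yield path, file_to_base64(path)
```
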
To solve that problem, I created a parent graph to batch up the files that need processing and then feed a RunGraph to execute the child graph, passing in Batch and Batch_Size parameters (e.g. Batch=1, Batch_Size=10000 means the first 10000 files get processed). The parent graph then just works its way through the batches without me having to intervene. Having "The same JVM" set to false means I can dynamically pass in my Batch and Batch_Size parameters via the "Command line arguments" parameter, so the parent just figures out how many batches are needed and the rest is done for me.
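
The batch arithmetic described above can be sketched as follows, assuming Batch is 1-based as in the example (Batch=1 covers the first Batch_Size files). The function names are hypothetical and the sketch is in Python rather than anything CloverDX-specific.

```python
def batch_count(total_files, batch_size):
    """Number of batches needed to cover total_files (the last one may be partial)."""
    if total_files < 0 or batch_size < 1:
        raise ValueError("total_files must be >= 0 and batch_size >= 1")
    return (total_files + batch_size - 1) // batch_size


def batch_slice(batch, batch_size):
    """Zero-based (start, end) indexes, end exclusive, for a 1-based batch number."""
    if batch < 1 or batch_size < 1:
        raise ValueError("batch and batch_size must be >= 1")
    start = (batch - 1) * batch_size
    return start, start + batch_size


if __name__ == "__main__":
    files = [f"file_{i}.bin" for i in range(25)]
    for batch in range(1, batch_count(len(files), 10) + 1):
        start, end = batch_slice(batch, 10)
        print(batch, len(files[start:end]))
```

With 200k files and a batch size of 10000, batch_count gives 20 child runs.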

Hopefully that makes sense, let me know if you need any more details.

Thanks,
Paul

jandikovae
Posts: 64
Joined: Fri Nov 04, 2016 8:51 am

Re: Pass Metadata and Filename as Parameter in Designer

Postby jandikovae » Wed May 26, 2021 9:15 am

Hi Paul,

Your project is quite challenging, and we don't have many options if we want to rely on Designer only. I think it is safe to say that you should keep the solution as it is right now (as long as it works with the "Algorithm" property). I am aware that the RunGraph component has its limitations, but I am afraid I have been informed that the trend is to keep the option of running consecutive jobs with the Server rather than to adjust and improve it in Designer. In fact, the RunGraph component is going to be deprecated soon and removed entirely in the future (it is assumed that you can use ExecuteGraph in a jobflow instead).

Anyway, I have at least one good piece of news for you. The Secure Parameters functionality is, in fact, available in the Designer without the CloverDX Server environment. I am going to research in more detail why the error message said otherwise. To use it, you should just set the Master Password on the Designer Runtime level. For more information see the following link:
https://doc.cloverdx.com/latest/designe ... sword.html
I am very sorry that I haven't found this one before.

Best Regards,
Eva

