I'm trying to join two data readers into one, but sometimes one or both could be empty. How can I prevent an error when a file is not found?
CloverError.png
Hello Rickymartin06,
based on your attached screenshot, I would suggest making the UniversalDataReader components read an empty file whenever the original file is not found, thus avoiding a potential graph failure on the reader and/or the join component. This can be achieved by placing an ExecuteScript component before the UniversalDataReader component in order to get the desired input file URL. Here is an example script for a Windows OS:
@echo off
if exist POLL04.DOS echo ${DATAOUT_DIR}/POLL04.DOS
if not exist POLL04.DOS echo ${DATAOUT_DIR}/POLLEmpty.DOS
The script checks whether the POLL04.DOS file actually exists (in the working directory). If it does, the script writes the file URL to the standard output ($in.1.stdOut). If it does not, the script writes the URL of a designated empty file instead. Note: you would need to define the working directory property (e.g. ${DATAOUT_DIR}) and create the empty file (POLLEmpty.DOS) beforehand. Then connect the ExecuteScript output port to the UniversalDataReader input port and set up the input file URL to read from the port (similar to what you did with the DataGenerator and the second reader in your screenshot).
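For readers running on Linux or macOS, the same existence check could be sketched in POSIX shell roughly as follows. This is only an illustration: the resolve_input helper and the DATA_DIR variable are hypothetical names (DATA_DIR stands in for the already-resolved ${DATAOUT_DIR} value), and unlike the Windows script it tests the very same path it prints.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the wanted input file if it exists,
# otherwise print the path of a pre-created empty placeholder file.
resolve_input() {
    wanted="$1"       # file the reader should normally consume
    placeholder="$2"  # empty file that must be created beforehand
    if [ -f "$wanted" ]; then
        printf '%s\n' "$wanted"
    else
        printf '%s\n' "$placeholder"
    fi
}

# DATA_DIR is an assumed stand-in for the resolved ${DATAOUT_DIR} value.
DATA_DIR="${DATA_DIR:-./data-out}"
resolve_input "$DATA_DIR/POLL04.DOS" "$DATA_DIR/POLLEmpty.DOS"
```

Whichever path is printed ends up on standard output, so it can be mapped to the reader's input port in the same way as in the Windows variant.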
With this setup, no data will flow into the UniversalDataWriter if one or both of the input files are not found, yet the graph will not fail.
Note: if you were taking advantage of the CloverETL Server, you could achieve the desired result by using a ListFiles component to list input files in the given folder.
Regards,
That worked. I had to move my graph to a jobflow to use the ExecuteScript component, but now I'm facing a different problem: I need to pass the script through an input port to ExecuteScript, and it is not working. I think it needs an [Enter] after every command (@echo off [enter] echo …).
In the DataGenerator I'm doing the following:
$out.0.TargetURL = "@echo off if exist ./data-out/mnt-tmp/POLL04.DOS echo ./data-out/mnt-tmp/POLL04.DOS if not exist ./data-out/mnt-tmp/POLL04.DOS echo ./data-out/POLLEmpty.DOS";
In the ExecuteScript component I have the following in the Input Mapping:
function integer transform() {
    $out.0.script = $in.0.TargetURL;
    return ALL;
}
Thanks,
Ricardo
I managed to fix the issue by separating the commands with “&”:
"@echo off & if exist ./data-out/mnt-tmp/POLL04.DOS echo ./data-out/mnt-tmp/POLL04.DOS"
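On a Unix-like host the same single-line constraint could probably be handled with “;” as the command separator. The sketch below covers both branches in one string and uses sh -c merely as a stand-in for whatever runs the script; how ExecuteScript actually invokes the interpreter there is an assumption, and the paths simply mirror the ones used above.

```shell
#!/bin/sh
# The whole script as one string, as it would arrive in a single record field.
# ';' separates the commands, so no line breaks are needed.
# The ./data-out paths mirror the ones used earlier in this thread.
script='f=./data-out/mnt-tmp/POLL04.DOS; e=./data-out/POLLEmpty.DOS; if [ -f "$f" ]; then echo "$f"; else echo "$e"; fi'

# Run the one-line string (sh -c is an assumed stand-in for the script runner)
# and capture the file URL it prints.
url=$(sh -c "$script")
echo "$url"
```

The single quotes around the string keep $f and $e from being expanded before the string is handed over for execution.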
I ended up using ListFiles with a Trash component to build the graph.
Capture.PNG
Hello Ricardo,
thank you for the feedback. Moving your data transformation to a jobflow is certainly a more elegant and straightforward way to go. Having said that, let me provide two additional recommendations that might be worth implementing, as they are considered best practice and could prevent major issues in the future:
- Generally speaking, it is not recommended to process large volumes of records in jobflows because memory consumption can increase significantly. The suggested approach is to wrap the components that are placed after ListFiles into a graph and replace them with an ExecuteGraph component in your jobflow. Then, the output URL of the ListFiles can be easily mapped to an input parameter of the executed graph. Judging by your screenshot, it will not make much of a difference at the level of hundreds of records, but it would make a lot of sense when you need to process hundreds of thousands or millions of records.
- The ListFiles component is generally used to provide a list of files in a directory (as opposed to referencing one specific file). Typically, the component is used with a wildcard character (such as an asterisk symbol *). For example, the URL ‘${DATAIN_DIR}/*.DOS’ would make ListFiles list all the files with a ‘DOS’ suffix. The advantage of this approach is that even if no file matches the URL pattern (so no file is found), the job will not fail. Instead, the ListFiles component will simply send zero records to the ExecuteGraph component. By contrast, when you reference a specific file (‘${DATAIN_DIR}/POLL04.DOS’) and the file is not found, the job will fail.
Best,