Error handling when reading XML files

Hi CloverETL, I am new to CloverETL (Oracle Endeca Information Discovery 3.1).

There are many XML files in a folder, and I need to read them using the XML reader components (XML Extract, XML Reader, XMLXPathReader).

Some of the XML files contain invalid XML tags:
- the start tag and end tag do not match, such as bongki

In this case, the whole graph stops.

I have to read all the XML files, so I want to skip the invalid files and continue with the next ones.

Could you guide me on how to handle this problem?

I have attached a sample project.

Thank you,

Bongki.

Hi Bongki,

I am not sure about Oracle Endeca Information Discovery, but CloverETL Corporate Server supports jobflows. In a jobflow, you can list the files in a directory and execute a graph for each file individually. The ExecuteGraph component has a property called Stop processing on fail. With this property set to false, the jobflow behaves exactly the way you described: the valid files are processed, and the invalid ones can be thrown away, logged, or handled separately.
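For illustration only, and outside CloverETL: the per-file isolation described above can be sketched in plain Python. This is not CloverETL code and not how ExecuteGraph is implemented; the function name process_folder and the "*.xml" file pattern are assumptions made for the example. Each file is parsed in its own attempt, so a file with mismatched tags is recorded and skipped instead of stopping the whole run.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def process_folder(folder):
    """Parse every *.xml file in `folder`; collect failures instead of stopping.

    Hypothetical helper: it mimics "one graph run per file, do not stop on fail".
    """
    parsed = {}    # file name -> root element of a well-formed file
    rejected = {}  # file name -> parser error message
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            # One parse attempt per file, isolated from the other files.
            parsed[path.name] = ET.parse(str(path)).getroot()
        except ET.ParseError as err:
            # Mismatched start/end tags land here; record the file and move on.
            rejected[path.name] = str(err)
    return parsed, rejected
```

The rejected dictionary plays the role of the "logged or handled separately" branch: the caller decides whether to discard those files or report them.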

Regards,

Hi Lubos,

Thank you for your reply.

Actually, I am not sure exactly which CloverETL version Endeca Information Discovery 3.1 uses.

In my palette, I can see just the 4 JobFlow components below.
[Job Control]
- Fail
- GetJobInput
- SetJobOutput
- Success

Lubos, are these components enough to solve this issue the way you described?

Could you share a sample graph for it?

I have attached a screenshot of the installation details.

Thank you,

Bongki.

The four components you mentioned are unfortunately not enough to solve your issue. As I wrote before, you need ExecuteGraph http://doc.cloveretl.com/documentation/ … graph.html and ListFiles http://doc.cloveretl.com/documentation/ … files.html

I am unfortunately not familiar with Oracle's licensing policy, so I do not know whether you have the appropriate license to use those components or whether they are even present in your version of the Oracle software. The only thing I can do is describe the solution in pure CloverETL, as I did in my previous post. Please try asking on the Oracle forum; they are better placed to help with the missing components.

Thanks for your understanding.

Hi Lubos,

Thank you very much for your reply; I understand what you mean.

Can I ask one more thing, about the data policy (strict, controlled)?

With the universal reader component, when I set the data policy to controlled, it skipped the erroneous records and continued with the next records.

In my case, the XML source data is stored in a database as the CLOB data type.

What is the purpose of the data policy in the XML reader components?

Thank you,

Bongki.

Sure you can. And that is a very good question. The short answer is that data policy in XML readers does not work properly at the moment. I have already raised a new issue in our bug tracker based on your original question, see https://bug.javlin.eu/browse/CLO-3715
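To make the strict versus controlled distinction from the question concrete, here is a minimal sketch in plain Python. It is not CloverETL code and says nothing about how the XML readers behave internally; the function read_records and its data_policy parameter are hypothetical names for this example. It assumes the XML documents have already been fetched as strings (for instance from a CLOB column).

```python
import xml.etree.ElementTree as ET


def read_records(xml_strings, data_policy="strict"):
    """Parse XML strings; the policy decides what happens on a malformed one.

    Hypothetical helper illustrating the two policies discussed in the thread.
    """
    good = []  # parsed root elements
    bad = []   # (index, error message) for skipped records
    for index, text in enumerate(xml_strings):
        try:
            good.append(ET.fromstring(text))
        except ET.ParseError as err:
            if data_policy == "strict":
                raise  # strict: abort on the first malformed record
            bad.append((index, str(err)))  # controlled: note it and continue
    return good, bad
```

With data_policy="controlled" a record with mismatched tags ends up in bad and the remaining records are still parsed; with "strict" the first such record raises and nothing after it is read.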

Regards,

Thank you. I hope this bug will be fixed as soon as possible.

The CloverETL version that Endeca Information Discovery 3.1 uses is 3.4.

Thank you very much,

Best Regards,

Bongki.