Hi there,
We recently upgraded to CloverDX Server 5.2.0 (from CloverETL 4.0.4) and since then have been having a problem with “Too many open files”. The scenario is: a specific graph (Users.Get.grf) fails after about 30 minutes of running, returning this error message. From that point on, the worker process is practically useless, because every graph run after that fails as well. Meanwhile, the server keeps accepting requests from clients, not knowing that the worker is unable to process anything.
Is there a config setting to restart the worker when this occurs? I noticed that when the problem is with memory, the worker is restarted automatically and everything keeps going. Can we do the same thing for open files, or at least for the case where graphs have been failing consistently for the last x runs?
If there’s nothing like that, do you guys have some Groovy code for the Universal Listener to watch for this stuck condition? Or, how do you handle this “always fail” condition?
Thanks so much for your help!
Jus
From the run log:
2019-06-11 03:29:11,006 ERROR 80957 [WatchDog_80957] Component [Get User Data:GET_USER_DATA] finished with status ERROR.
Execution of graph './graph/Users.Get.grf' failed!
Component [GETUserRoles:GETUSER_ROLES] finished with status ERROR. (In0: 3972 recs, Out0: 3971 recs)
Too many open files
2019-06-11 03:29:11,007 ERROR 80957 [WatchDog_80957] Error details:
org.jetel.exception.JetelRuntimeException: Component [Get User Data:GET_USER_DATA] finished with status ERROR.
at org.jetel.graph.Node.createNodeException(Node.java:655)
at org.jetel.graph.Node.run(Node.java:620)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.jetel.exception.JetelRuntimeException: Execution of graph './graph/Users.Get.grf' failed!
at org.jetel.component.RunGraph.logError(RunGraph.java:546)
at org.jetel.component.RunGraph.runGraphThisInstance(RunGraph.java:539)
at org.jetel.component.RunGraph.runSingleGraph(RunGraph.java:407)
at org.jetel.component.RunGraph.execute(RunGraph.java:283)
at org.jetel.graph.Node.run(Node.java:580)
... 3 more
Caused by: org.jetel.exception.JetelRuntimeException: Component [GETUserRoles:GETUSER_ROLES] finished with status ERROR. (In0: 3972 recs, Out0: 3971 recs)
Too many open files
... 7 more
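
To make the "failing for the last x runs" check from the question concrete, here is a minimal sketch in plain Java (which Groovy can call). Everything in it is an assumption for illustration: `StuckWorkerDetector`, `record`, `isOpenFilesError` and `openFileRatio` are hypothetical names, not CloverDX API. How a Universal Listener would feed run results into it, and how the worker would actually be restarted, is deliberately left out because the post does not say. The descriptor check uses the JDK's `com.sun.management.UnixOperatingSystemMXBean` and only reports on the JVM it runs in, so it would have to execute inside the worker process to be meaningful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/**
 * Hypothetical helper, not part of CloverDX: tracks consecutive failed runs
 * and exposes the file descriptor usage of the current JVM.
 */
final class StuckWorkerDetector {
    private final int threshold;
    private int consecutiveFailures = 0;

    /** @param threshold number of consecutive failures treated as "stuck" (the x from the question) */
    StuckWorkerDetector(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /** Records one finished run; returns true once the last `threshold` runs have all failed. */
    synchronized boolean record(boolean succeeded) {
        consecutiveFailures = succeeded ? 0 : consecutiveFailures + 1;
        return consecutiveFailures >= threshold;
    }

    /** True if the error text contains the message seen in the run log. */
    static boolean isOpenFilesError(String errorMessage) {
        return errorMessage != null && errorMessage.contains("Too many open files");
    }

    /** Open descriptors divided by the descriptor limit of this JVM, or -1 if unavailable. */
    static double openFileRatio() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            long max = unix.getMaxFileDescriptorCount();
            return max > 0 ? (double) unix.getOpenFileDescriptorCount() / max : -1.0;
        }
        return -1.0; // non-Unix platform or a JVM without this bean
    }

    public static void main(String[] args) {
        StuckWorkerDetector detector = new StuckWorkerDetector(3);
        boolean[] outcomes = {true, false, false, false}; // made-up run results
        for (boolean ok : outcomes) {
            if (detector.record(ok)) {
                System.out.println("Worker looks stuck; descriptor ratio: " + openFileRatio());
            }
        }
    }
}
```

A single successful run resets the counter, so an occasional failing graph would not trigger anything; only an unbroken streak of failures does. Watching `openFileRatio()` alongside the streak could give a warning before the limit is actually reached.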