Hello!
I read two files with two data readers, merge-join them, and then write the result to a file with a data writer. The driver file contains 10 million records and the slave file contains 60 million. I use an inner join. The problem is that when the joiner finds there are no more records on input port 0, it broadcasts the EOF message to the components further down the graph and then stops. The data writer connected to the joiner stops, as it should. But the data reader that feeds the slave file into the joiner keeps pumping away until the entire input file is consumed. As a result, the execution time increases significantly if all the records from the driver file have their matches somewhere near the beginning of the slave file.
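To make the cost concrete, here is a minimal sketch in plain Python (not actual graph or component code). `CountingReader` and `merge_inner_join` are hypothetical names, and the sketch assumes both inputs are sorted with unique keys. It shows that an inner merge join needs no more slave records once the driver is exhausted, so everything the slave reader does after that point is wasted work.

```python
class CountingReader:
    """Hypothetical stand-in for a data reader; counts records pulled from it."""

    def __init__(self, records):
        self._it = iter(records)
        self.consumed = 0

    def __iter__(self):
        return self

    def __next__(self):
        rec = next(self._it)  # raises StopIteration at end of input
        self.consumed += 1
        return rec


def merge_inner_join(driver, slave):
    """Lazily inner-join two sorted inputs with unique keys (an assumption)."""
    driver_it, slave_it = iter(driver), iter(slave)
    d = next(driver_it, None)
    s = next(slave_it, None)
    # The loop ends as soon as either side is exhausted, so the slave
    # is never read past the last record needed by the driver.
    while d is not None and s is not None:
        if d == s:
            yield d
            d = next(driver_it, None)
        elif d < s:
            d = next(driver_it, None)
        else:
            s = next(slave_it, None)


driver = CountingReader(range(10))  # scaled-down driver file
slave = CountingReader(range(60))   # scaled-down slave file
joined = list(merge_inner_join(driver, slave))
print(len(joined), slave.consumed)  # 10 10
```

In this pull-based sketch the slave reader stops for free because nobody asks it for more records. In a push-based graph, where each reader runs on its own, the same effect needs an explicit signal from the joiner back to the reader.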
So the question is:
Is there a way to stop the reader or graph after the joiner has finished its job in this particular case?
I think we need a more generic solution to this problem: when a Node is about to stop, it must broadcast a message not only to its child nodes, but to its parent nodes as well. The message essentially means “I don’t need your services anymore”. Each parent Node must then decide for itself what to do. If nobody needs its services, it must stop and notify its own parents.
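A rough sketch of the proposal, in plain Python rather than the real Node API. The class and the method names `connect`, `stop`, and `child_released` are made up for illustration. It models only the upstream message; the existing downstream EOF broadcast is left out.

```python
class Node:
    """Hypothetical graph node that supports upstream "not needed" messages."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []
        self.running = True

    def connect(self, child):
        """Add an edge from this node to a child (consumer) node."""
        self.children.append(child)
        child.parents.append(self)

    def stop(self):
        """Stop this node and tell every parent its services are not needed."""
        if not self.running:
            return
        self.running = False
        for parent in self.parents:
            parent.child_released(self)

    def child_released(self, child):
        """Decide locally: stop only when no running child still needs output."""
        if all(not c.running for c in self.children):
            self.stop()


driver_reader = Node("driver_reader")
slave_reader = Node("slave_reader")
joiner = Node("joiner")
writer = Node("writer")
driver_reader.connect(joiner)
slave_reader.connect(joiner)
joiner.connect(writer)

joiner.stop()  # the joiner has seen EOF on input port 0
print(slave_reader.running)  # False
```

A parent with several consumers keeps running until the last of them releases it, which is the "decide for itself" part: one finished branch does not stop a reader that still feeds another branch.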