Clover 2.8.0 vs 2.7.1 Performance

Heya,

I’ve been refactoring our graph file to try and minimize run times. We recently downloaded 2.8.0 and find that our example graph runs about 2-4 minutes slower when we run it under 2.8.0 than when we run it using 2.7.1. These are manual runs using the command line. I started eliminating various nodes and running the graph under both versions to see if particular nodes made a difference. It seems from my tests that removing the FIXLEN_DATA_WRITER node (by switching it to a DELIMITED_DATA_WRITER) made the biggest difference - the run times seemed to even out. I know these nodes are deprecated, but we were under the impression that they were performing better than the Universal Writer.

Was there something changed in the fixed length data writer that makes it slower? If we switch to the Universal Data Writer, will it make a difference?

I can post the graph if it is useful.

Thanks,
Anna

Hi Anna,
that is definitely interesting information for us. Could you send us the graph where the slowdown occurred? What percentage of the total runtime do these 2-4 minutes represent?

I’m a little confused by the situation you describe with the component switching. Which component was originally used to write the data? The delimited data writer can be faster because the resulting file is probably smaller. Please send us the graph where the slowdown was encountered.

Thanks a lot, Martin.

We will probably be able to help you with performance tuning of the given graph :wink:

Martin

Heya Martin,

Thank you for your reply! I’ve attached a zip file with our example graphs plus an excel sheet with the running times we’ve seen when running each case.

We autogenerate the graph file for our users. The component switching comes about because we are comparing against a legacy application that is running the same input data through an ETL tool. There are certain things we do that the legacy application does not, so we take those components out for time comparison. The ADDITIONAL_TRANSFORM node runs additional transforms and narrows the target data down from approximately 2100 fields (if it wrote this file, it would be a delimited file) to approximately 100 fields and creates a fixed width file. We also record all the orphaned rows, which requires extra nodes off of the DEDUP nodes.

When I remove the ADDITIONAL_TRANSFORM node, I have to switch the writer to a delimited writer, but it is writing out 2100 (mostly empty) fields versus 100 fields for a fixed width file. When you look at the times, 2.7.1 and 2.8.0 seem to consistently get the same times except when running the full, original graph.

The forum has been very helpful to me with trying to get our graph optimized. I’ve really appreciated all of the help! If there are any other types of stats I can run that would help expose any issues, let me know.

Thanks,
Anna

Unfortunately there are too many possible causes for the slowdown. I ran a few tests of the FixlenDataWriter and saw no deterioration. Would it be possible to do some more tests where you split the graph into separate ‘phases’ using CloverReader and CloverWriter and compare the elapsed times? With this testing method you should be able to point us to the component or set of components where the problem is.
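To make the phase-by-phase comparison less sensitive to noise on the box, it may help to repeat each run a few times and look at the mean and the spread. Below is a minimal Python sketch of that idea. `time_command` and `summarize` are hypothetical helper names, and the command you pass in is whatever command line you already use to run the graph under each version (no Clover options are assumed here).

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=3):
    """Run `cmd` (a list of arguments) `runs` times.

    Returns the wall-clock duration of each run in seconds.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command exits with a non-zero status,
        # so a failed run is never counted as a timing.
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (mean, sample standard deviation) of a list of timings."""
    mean = statistics.mean(timings)
    # stdev needs at least two samples; report 0.0 for a single run.
    spread = statistics.stdev(timings) if len(timings) > 1 else 0.0
    return mean, spread
```

If the difference between the 2.7.1 and 2.8.0 means is smaller than the spread within each version, the slowdown is probably just variance between runs.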

Regarding overall performance, I guess the main part of the runtime is spent in the sorting components, so I would recommend tuning them. I would try increasing the buffer size, for example to 50000 (you will probably need to allocate more memory via -Xmx1000M; this depends on the width of your records).
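As a rough back-of-the-envelope check of how buffer size and record width interact, the raw data held in the sort buffer can be estimated as records x fields x bytes per field. The Python sketch below only does that arithmetic. The function name and the 4 bytes-per-field figure are made-up illustrations, the 2100 fields come from the field count mentioned earlier in the thread, and real JVM memory use will be higher because of object overhead.

```python
def estimate_sort_buffer_mb(buffer_records, fields_per_record, avg_bytes_per_field):
    """Estimate the raw payload of a sort buffer in megabytes.

    This is plain arithmetic, not a model of how the engine allocates memory.
    """
    total_bytes = buffer_records * fields_per_record * avg_bytes_per_field
    return total_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical inputs: 50000 buffered records, 2100 fields each,
    # 4 bytes per field on average (most fields empty).
    needed = estimate_sort_buffer_mb(50000, 2100, 4)
    print(f"approx. {needed:.0f} MB of raw record data in the buffer")
```

Under those assumed numbers the raw data alone is roughly 400 MB, which is why a larger buffer usually needs a larger -Xmx value.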

You can probably get the biggest performance improvement by using our FastSort (http://wiki.cloveretl.org/doku.php?id=c … s#fastsort) - try it, you will be surprised :slight_smile:

Heya,

I will try and split up the phases to see if I can get a recorded time. I think someone from our group has a trial license for the commercial version of the Clover engine (they are evaluating it), otherwise I cannot try the FastSort, right?

Thanks,
Anna

Heya,

I have to wait until the weekend to try the two phases suggestion (that’s when the test box is free), but I was able to have the group evaluating Clover Server (the version before 2.8.0) run the graph and then change out the ExtSort for FastSort. They left all of the properties at the defaults (They just changed the type from ExtSort to FastSort).

The original graph took about 20 minutes to run. With the FastSort switch out, they killed it after it had been running for 1 hour. Are there some settings we are missing or is there something we should look for to see why it wouldn’t finish?

Thanks,
Anna

Hello Anna,

regarding your FastSort experience: Do you have access to the logs that the unsuccessful run generated? It would be interesting for us to have a look at it so that we know what goes wrong.

My first guess would be an out-of-memory problem or an exceeded disk quota. FastSort is quite greedy for system resources by design - much more than ExtSort. So limited memory often causes FastSort to fail. Another thing to look at is the disk quota: FastSort generates quite a lot of temporary files and can exceed the limit on the maximum number of files. On Windows this is usually not a problem, but running such a graph as a normal Linux user might be. Try checking “ulimit” if it is a Unix system.
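If the limit in question is the per-process open-file limit (the value that `ulimit -n` reports in a shell), it can also be read programmatically. Here is a minimal Python sketch using the standard `resource` module, which is only available on Unix-like systems; the helper names are hypothetical.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, mapping the 'no limit' sentinel to a word."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open files: soft={describe(soft)} hard={describe(hard)}")
```

The soft limit is the one that actually applies to the running process; an ordinary user can usually raise it up to the hard limit.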

FastSort performance also depends on settings, mainly the “Run size” attribute. The default value is good for up to a few million records, but especially with big records (you mention hundreds, even thousands of fields) you may want to tweak it a little. Lower values (1k to 50k) generate more temporary files (quota!), require less memory and are usually good for speed. High values (>100k) are a must for large datasets when lower values fail; they require more memory and improve speed for large datasets.
Also, there is a more human-friendly attribute “Estimated record count” which lets you set the approximate number of records you are going to sort. Internally, “Run size” is a function of “Estimated record count”.
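To get a feel for the trade-off between “Run size” and the number of temporary files, the following Python sketch estimates how many sorted runs a dataset would be split into. It assumes, purely for illustration, that each run is written to one temporary file. That is a simplification and not a description of how FastSort actually lays out its files, and the function name is hypothetical.

```python
def estimate_runs(record_count, run_size):
    """Number of runs needed to cover `record_count` records (ceiling division)."""
    if run_size <= 0:
        raise ValueError("run_size must be positive")
    return (record_count + run_size - 1) // run_size


if __name__ == "__main__":
    records = 5_000_000  # hypothetical dataset size
    for run_size in (1_000, 50_000, 100_000):
        runs = estimate_runs(records, run_size)
        print(f"run size {run_size:>7}: about {runs} runs")
```

Comparing the estimated run count with the open-file limit on the box gives a quick hint whether a small run size is likely to hit that limit.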

Hope this helps,
Pavel

Heya,

I ran tests where I broke the graph into 2 phases - the writer was the only thing in the second phase. The results were pretty mixed, so I think I’m gonna chalk the difference between 2.8.0 and 2.7.1 down to variances on the box unless I can find something concrete to bring back to you guys. Thanks for the testing suggestions!

I will ask about the log on the FastSort issue. The test box is running Solaris, so I suspect the limit on files is probably what happened because we didn’t get an out-of-memory…

Thanks!
Anna