I just wrapped up the first part of a tool selection project for an old client of mine, who wants to introduce open source data integration on a pilot project.
Working with the client, we have shortlisted the two open source solutions we deem to be the most mature: Pentaho’s Kettle and Talend’s Open Studio. Their respective positioning is interesting: Pentaho, being an open source BI vendor, focuses exclusively on ETL for the BI market, whereas Talend says it addresses not only BI but also operational data integration.
There are a number of other open source ETL projects out there, but none of them has the backing of a “real” company. That is not to say these projects are bad, but my client is just tiptoeing into open source and wanted to feel reassured about tech support, viability, etc. So we have only been looking at these two vendors.
A big part of the evaluation was about performance. I ran Kettle and Talend Open Studio (TOS) on identical scenarios to see how they perform.
I tried not only the latest versions but also the previous version of each. It’s interesting to see how both products have made significant performance improvements in their latest builds.
Anyway, I thought I’d share the results of this benchmark: benchmark-tos-vs-kettle.pdf
I’ll be posting more info as my work with this client progresses. Right now they have not entirely confirmed their choice, as they want to look at other criteria beyond pure performance. But the scales are clearly tipping toward one side… (check the benchmark if you want to know which one!)