In July, Talend announced support for something new: Slowly Changing Dimensions (SCDs). I guess they were playing catch-up, because as far as I know Kettle has supported this for a while. Never mind, I thought I would give it a try and compare how well both tools support SCDs.
Bottom line: both tools make SCD management super easy. Congratulations guys, you made a pretty difficult concept easy to implement. Clearly, Talend’s implementation is still young: it is missing some features, such as surrogate keys and specifying the end date. Kettle has more thorough functional coverage.
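For readers who have not run into the concept, here is a minimal sketch of what a Type 2 SCD update involves, and why surrogate keys and end dates matter. It is plain Python, not how Talend or Kettle implement it; `DimRow`, `apply_type2`, the `city` attribute, and the sample customer are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    surrogate_key: int                # generated in the warehouse, not by the source system
    natural_key: str                  # business key coming from the source system
    city: str                         # the attribute whose history we track (hypothetical)
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


def apply_type2(dim: List[DimRow], natural_key: str, city: str, today: date) -> None:
    """Apply one source record to the dimension, keeping full history."""
    current = next(
        (r for r in dim if r.natural_key == natural_key and r.valid_to is None),
        None,
    )
    if current is not None and current.city == city:
        return  # attribute unchanged, nothing to do
    if current is not None:
        current.valid_to = today  # close the old version by setting its end date
    next_key = max((r.surrogate_key for r in dim), default=0) + 1
    dim.append(DimRow(next_key, natural_key, city, today))


dim: List[DimRow] = []
apply_type2(dim, "C042", "Paris", date(2007, 1, 1))
apply_type2(dim, "C042", "Lyon", date(2007, 7, 1))
for row in dim:
    print(row)
```

After the second call the dimension holds two rows for the same customer: the Paris row is closed with an end date, and the Lyon row is current under a new surrogate key. Without a surrogate key there is nothing to tell the two versions apart, since they share the same natural key.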
Something that’s missing from both tools, however: Type 3 SCDs. OK, I’ll grant you this: in my years of consulting, I have never had to implement a Type 3 SCD. But still, it would be good to have it, just in case you need it 🙂
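As a reminder of what Type 3 means: instead of adding a row per change, the previous value is kept in an extra column on the same row. A minimal sketch in plain Python, assuming a single tracked attribute; `CustomerRow`, `apply_type3`, and the sample data are hypothetical and not taken from either tool.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CustomerRow:
    natural_key: str
    current_city: str
    previous_city: Optional[str] = None  # the extra "previous value" column


def apply_type3(dim: Dict[str, CustomerRow], natural_key: str, city: str) -> None:
    """Apply one source record, keeping only the current and previous value."""
    row = dim.get(natural_key)
    if row is None:
        dim[natural_key] = CustomerRow(natural_key, city)
    elif row.current_city != city:
        row.previous_city = row.current_city  # shift current into the history column
        row.current_city = city


customers: Dict[str, CustomerRow] = {}
apply_type3(customers, "C042", "Paris")
apply_type3(customers, "C042", "Lyon")
apply_type3(customers, "C042", "Nice")
print(customers["C042"])
```

After the third call the row shows Nice as current and Lyon as previous; Paris is gone. That limited history is probably why Type 3 comes up so rarely in practice.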
From a performance standpoint, Talend clearly makes up for its functional gaps. I ran a test with 25,000 source records. When creating the dimension, Talend Open Studio (TOS) went through the process in 8.7 seconds, but it took Kettle 675 seconds! Updating the dimension, a much more resource-consuming process, took TOS 512 seconds and Kettle 1,323 seconds.
Which tells me another thing: no vendor can claim to always be 50 or 100 times faster than others! Performance comparisons depend so much on which test you run. In my case, TOS is 78 times faster than Kettle in the first test, but only 2.6 times faster in the second one.
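The ratios come straight from the timings above. A quick check in Python, using only the four measurements from this test (the dictionary layout and variable names are just for illustration):

```python
# Elapsed times in seconds for the 25,000-record test.
timings = {
    "create": {"TOS": 8.7, "Kettle": 675.0},
    "update": {"TOS": 512.0, "Kettle": 1323.0},
}

# Speedup of TOS over Kettle = Kettle time / TOS time.
speedups = {phase: t["Kettle"] / t["TOS"] for phase, t in timings.items()}

for phase, ratio in speedups.items():
    print(f"{phase}: {ratio:.1f}x")
# create: 77.6x
# update: 2.6x
```

Same tools, same data, and the ratio moves from roughly 78 to 2.6 depending on the phase measured.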