Improve ETL Automation and Performance with TimeXtender 2014

TimeXtender has been an integral tool for rapid ETL development and data warehouse automation. TimeXtender is a user friendly “point and click” tool that provides the basis for a BI solution, including data extraction, data validation, data transformation, and the building of data warehouses and OLAP cubes without the need for hard coding the SQL code. Furthermore, TimeXtender works seamlessly with Microsoft SQL Server, MS Analysis Server and other Microsoft BI offerings.

Recently they have released the TX2014 Service Release 1 (SR1) with many new features and benefits from previous releases. Many of the things that I felt could help improve the product have really shined in this release. I first want to talk about the benefits of the TimeXtender in general and then I will discuss the new features.

Benefit 1 – TimeXtender Visual Interface
TimeXtender has a nice visual interface showing you the Source, Stage, and DW areas of you database warehouse all on one screen. It’s a very easy tool to navigate through.

Data Warehouse Automation

Benefit 2 – Data Warehouse Creation Made Easy
You can add 3 source tables and 3 stage tables and 3 Data warehouse tables in a few clicks. The formatting and transformations of the data is where you will spend your time but you can have that done in a few hours. If you were to build 3 tables from source to stage to DW it would take you a few days in traditional SSIS.

Benefit 3 – Type 2 Slowly Changing Dimensions in a Jiffy
Historical Data is one of the hardest ETL designs in data warehouses. Requires a lot of development in SSIS. You can build a Type 2 Slowly changing dimension that will record historical data in a few hours where as in traditional SSIS it would take a few days.

New Features 1 – Set Based Transformations
The previous versions of TimeXtender used a user defined function to do a transformation. This was done by creating a view from the stage table and adding the function to that view which caused performance issues. So with the new TX2014 SR1 release they added a Set based design to the transformations which added a case when statement to the view. The next few screen shots show the old and new designs.

UDF Method (old method):
Customer table has a transformation on the StoreID column:


The “Used Legacy Transformations” setting at the table level if checked will use the UDF transforms otherwise it will be a Set Based transformation.

Use Legacy Transformation

Customer View with the UDF:

Customer View with UDF

User defined function for StoreID column:

User Defined Function

Set Based Method (New Method):
Product table has a transformation on the Color column:


Product view with the case when statement:

Set Based Transformation

There is one check box that is set at the table level that gives you the choice to run legacy UDF transforms or Set based case when transforms. The new Set Based transformation will gives us better performance then the legacy method.

New Features 2 Managed Parallel Package Execution Threads and Table Dependencies
This is a great new feature because before we only could run the packages in serial. TX2014 Service Release 1 (SR1) has added a new Execution overview log that will allow us to see the threads and which type of execution was run. So we can see the load to source package time of execution as well as the transform load to clean times. One feature that keeps this organized is the table dependencies addition which says that these tables need to run on the same thread because they depend on each other. With that being said you still need to order them in TimeXtender by their dependency. So we want to run a lookup table first then the main table that uses the lookup next but all on the same thread.

Product table for example has a relationship with Product Subcategory and Product Subcategory has a relationship with Product Category. So we want them on the same thread and ordered properly in TimeXtender. We can add a dependency of Product Subcategory and Product Category on the Product table so they run on the same thread.

Table Dependencies

We can set up the Execution design to have many threads which is a new feature in TimeXtender. In this example we set up 3 threads and have 6 tables on those threads.

Execution Package

TimeXtender will manage which threads to put each table based on dependencies that are set and ordering. They have added in a cool new Gannt Chart with the log information. Each bar represents a package run and it shows the thread it was run on.

Managed Thread Execution

Also in the Execution log is the execution order of each of the packages.

TX Execute Project

New Feature 3 – Table Execution Log
The last new feature that I will share is the execution log from the individual table. We can look at the tables execution time for every batch that is run. This is an example of the Product table which had 3 batches run and shows us the load to source and load to clean packages and if you roll over the bar it shows the time it took to execute.

Execution Overview log

The Execution log features are something I have wanted to see where we can get a better glimpse of what is happening under the hood. This gives us more confidence of how the application is running and possible methods of what maybe causing performance issues.

If you are interested in learning more or seeing a demo, email my colleague, Sally Phillips and she’ll get right back to you.