Improve ETL Automation and Performance with TimeXtender 2014

TimeXtender has been an integral tool for rapid ETL development and data warehouse automation. TimeXtender is a user friendly “point and click” tool that provides the basis for a BI solution, including data extraction, data validation, data transformation, and the building of data warehouses and OLAP cubes without the need for hard coding the SQL code. Furthermore, TimeXtender works seamlessly with Microsoft SQL Server, MS Analysis Server and other Microsoft BI offerings.

Recently they have released the TX2014 Service Release 1 (SR1) with many new features and benefits from previous releases. Many of the things that I felt could help improve the product have really shined in this release. I first want to talk about the benefits of the TimeXtender in general and then I will discuss the new features.

Benefit 1 – TimeXtender Visual Interface
TimeXtender has a nice visual interface showing you the Source, Stage, and DW areas of you database warehouse all on one screen. It’s a very easy tool to navigate through.

Data Warehouse Automation

Benefit 2 – Data Warehouse Creation Made Easy
You can add 3 source tables and 3 stage tables and 3 Data warehouse tables in a few clicks. The formatting and transformations of the data is where you will spend your time but you can have that done in a few hours. If you were to build 3 tables from source to stage to DW it would take you a few days in traditional SSIS.

Benefit 3 – Type 2 Slowly Changing Dimensions in a Jiffy
Historical Data is one of the hardest ETL designs in data warehouses. Requires a lot of development in SSIS. You can build a Type 2 Slowly changing dimension that will record historical data in a few hours where as in traditional SSIS it would take a few days.


New Features 1 – Set Based Transformations
The previous versions of TimeXtender used a user defined function to do a transformation. This was done by creating a view from the stage table and adding the function to that view which caused performance issues. So with the new TX2014 SR1 release they added a Set based design to the transformations which added a case when statement to the view. The next few screen shots show the old and new designs.

UDF Method (old method):
Customer table has a transformation on the StoreID column:

Transformation

The “Used Legacy Transformations” setting at the table level if checked will use the UDF transforms otherwise it will be a Set Based transformation.

Use Legacy Transformation

Customer View with the UDF:

Customer View with UDF

User defined function for StoreID column:

User Defined Function

Set Based Method (New Method):
Product table has a transformation on the Color column:

Transformation

Product view with the case when statement:

Set Based Transformation

There is one check box that is set at the table level that gives you the choice to run legacy UDF transforms or Set based case when transforms. The new Set Based transformation will gives us better performance then the legacy method.

New Features 2 Managed Parallel Package Execution Threads and Table Dependencies
This is a great new feature because before we only could run the packages in serial. TX2014 Service Release 1 (SR1) has added a new Execution overview log that will allow us to see the threads and which type of execution was run. So we can see the load to source package time of execution as well as the transform load to clean times. One feature that keeps this organized is the table dependencies addition which says that these tables need to run on the same thread because they depend on each other. With that being said you still need to order them in TimeXtender by their dependency. So we want to run a lookup table first then the main table that uses the lookup next but all on the same thread.

Product table for example has a relationship with Product Subcategory and Product Subcategory has a relationship with Product Category. So we want them on the same thread and ordered properly in TimeXtender. We can add a dependency of Product Subcategory and Product Category on the Product table so they run on the same thread.

Table Dependencies

We can set up the Execution design to have many threads which is a new feature in TimeXtender. In this example we set up 3 threads and have 6 tables on those threads.

Execution Package

TimeXtender will manage which threads to put each table based on dependencies that are set and ordering. They have added in a cool new Gannt Chart with the log information. Each bar represents a package run and it shows the thread it was run on.

Managed Thread Execution

Also in the Execution log is the execution order of each of the packages.

TX Execute Project

New Feature 3 – Table Execution Log
The last new feature that I will share is the execution log from the individual table. We can look at the tables execution time for every batch that is run. This is an example of the Product table which had 3 batches run and shows us the load to source and load to clean packages and if you roll over the bar it shows the time it took to execute.

Execution Overview log

The Execution log features are something I have wanted to see where we can get a better glimpse of what is happening under the hood. This gives us more confidence of how the application is running and possible methods of what maybe causing performance issues.


If you are interested in learning more or seeing a demo, email my colleague, Sally Phillips and she’ll get right back to you.

TimeXtender – Lookups and Relationships

TimeXtender is a new tool that I’ve been using to rapidly increase our ETL Development. Check out the site: TimeXtender

There are two ways you can add a lookup and have relationships configured in TimeXtender:

Method 1 is to drag the column down from the lookup table to the main table and set the join by the keys that match between the 2 tables. For example we have ProductSubcategory and Product where ProductSubcategoryID is the key between the lookup table ProductSubcategory and the main table Product.This is considered a Left outer Join which will send rows that don’t match into the Warning table.

lookup

Method 2 is to drag the key from the lookup table to the foreign key of the main table. So for this example we drag the ProductCategoryID from ProductCategory table to the ProductCategoryID in the ProductSubCategory table. This will give an option to create a relation.

relation

We can see the relationship created for the ProductCategoryID below:

created_relation

Now we want to drag the Name from ProductCategory to the ProductSubCategory table and it will ask us if we want to use that relation that we just created to join the table for this field.

catnamerelation

So we can see that join uses the same Keys that the relation has and this now gives us some leverage to change how that joins acts.

finalcatnamejoin

Lastly we can change the way that the join behaves by a right click on the relationship to change the options as follows:
(Click to enlarge image)

RelationView

So we need to define the 3 relationship settings in terms of joins:
1. Error (Inner Join)
2. Warning (Left Outer Join)
3. Relation (Full Outer Join)