BimlFlex Data Mart Automation using Polybase and Azure SQL Data Warehouse

Written by Peter Avenant on 10.23.2017

Investing in the design and implementation of a substantial Data Warehouse creates a foundation for a successful Data Platform architecture. A configurable Data Warehouse Automation solution that supports the best features of Azure SQL Data Warehouse as standard is essential. For more information on why we use this approach, please also read the blog post by Roelant Vos, "Embrace your Persistent Staging Area for Eventual Consistency".

Azure SQL Data Warehousing

Leverage metadata-driven data warehouse automation and data transformation optimized specifically for all Microsoft Azure SQL Data Warehouse options. The ability to extract, compress and prepare data at source is critical to delivering an optimized solution. Using Polybase with parallel files, you can speed up data warehouse loads by well over ten times compared with traditional SSIS packages.

We demonstrate extracting data from a source system that can be staged or persisted into tables, or loaded directly into type 1, 2 or 6 dimensions and facts.

Webinar

In the previous webinar, we touched on data warehousing using Azure SQL Data Warehouse; this time we will go into detail, showing the parallelism and transformation using Polybase. Traditionally, most of the project time is spent on connecting to the source systems and configuring CDC and parameters to extract data. We will look at how easily BimlFlex scales out your data ingest by creating parallel threads and multiple files. This approach is vital for optimal performance, as James Serra explains in his blog post "PolyBase explained".
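
To make the idea of splitting one extract into multiple files more concrete, here is a minimal Python sketch that distributes rows across a fixed number of CSV outputs in round-robin order. It is an illustration only, not how BimlFlex or SSIS implements the split: the function name `split_round_robin`, the `file_count` parameter and the sample rows are all hypothetical, and in-memory buffers stand in for the files that would be written to Azure Blob Storage.

```python
import csv
import io
from typing import Iterable, List, Sequence


def split_round_robin(rows: Iterable[Sequence[str]], file_count: int) -> List[str]:
    """Distribute rows across file_count CSV buffers in round-robin order.

    Hypothetical helper for illustration; each returned string stands in
    for one file that a parallel loader could read independently.
    """
    if file_count < 1:
        raise ValueError("file_count must be at least 1")

    # One in-memory buffer per output "file" (an assumption; real extracts
    # would write to local disk or blob storage instead).
    buffers = [io.StringIO() for _ in range(file_count)]
    writers = [csv.writer(buf, lineterminator="\n") for buf in buffers]

    # Row i goes to output i modulo file_count, so sizes differ by at most one row.
    for index, row in enumerate(rows):
        writers[index % file_count].writerow(row)

    return [buf.getvalue() for buf in buffers]


# Sample data: ten made-up customer rows split into four parts.
rows = [(str(i), f"customer_{i}") for i in range(10)]
parts = split_round_robin(rows, file_count=4)
for number, part in enumerate(parts):
    print(f"part {number}: {len(part.splitlines())} rows")
```

Keeping the parts close to equal in size matters because a parallel load is only as fast as its largest file; round-robin is one simple way to get that balance without knowing the row count in advance.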

BimlFlex data warehouse automation, especially when combined with Azure SQL Data Warehousing, is worth investigating if you are about to embark on a modern data warehouse project.

If you have any queries or would like to discuss how BimlFlex Data Warehouse Automation can benefit your project, please email us at sales@varigence.com.

Comments

Written by Guest on 11/22/2017 11:58:30 PM

How is the source data distributed into multiple files when loading into Azure Blob Storage?

Written by PeterAvenant on 11/23/2017 11:35:58 PM

We are using the SSIS Balanced Data Distributor, and you can configure the number of files for the entire batch and also override it per object.