Dev Diary - Adding Transformations to Mapping Data Flows

Written by Roelant Vos on 2.1.2022

TAGS: Biml,BimlFlex,adf,DataFlowMapping


The Mapping Data Flows ('data flows') feature of Azure Data Factory (ADF) provides a visual editor to define complex data logistics and integration processes. Data flows provide a variety of components that direct how the data should be manipulated, including a visual expression editor that supports a large number of functions.

Work is now nearing completion to make sure BimlFlex can incorporate bespoke logic using this expression language, and generate the corresponding Mapping Data Flows.

For every object in every layer of the designed solution architecture, it is possible to add derived columns. Derived columns are columns that are not part of the 'source' selection in a source-to-target data logistics context. Instead, they are derived in the data logistics process itself.

This is done using Derived Column Transformations in Mapping Data Flows, in a way that is very similar to how this works in SQL Server Integration Services (SSIS).

In BimlFlex, you can define complex transformation logic this way using the data flow expression syntax. The resulting code will be added to the selected Mapping Data Flow patterns, and will be visible as Derived Column Transformations. You can also define dependencies between derived columns, for example so that the output of one calculation is used as the input for the next.

Consider the screenshot below.

alt text here…

The column 'MyDerivedColumn' is defined as part of the 'Account' object. The 'IsDerived' checkbox is checked, and the 'Dataflow Expression' is provided. This metadata configuration will generate a Derived Column Transformation with this column name and expression.

The resulting column is added to the output dataset.

alt text here…
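To make the mapping from metadata to generated output more concrete, here is a minimal Python sketch. It is not BimlFlex code: the `ColumnMetadata` class, the `render_derive` helper, the stream names and the `upper(AccountName)` expression are all hypothetical, and only the column name 'MyDerivedColumn' and the 'IsDerived' and 'Dataflow Expression' properties come from the example above. It renders a single `derive(...)` step in the data flow script that ADF stores behind the visual editor.

```python
from dataclasses import dataclass


@dataclass
class ColumnMetadata:
    """Hypothetical stand-in for one column row in the metadata."""
    name: str
    is_derived: bool = False
    dataflow_expression: str = ""


def render_derive(columns, upstream, step_name):
    """Render one derive() step of data flow script for the derived columns."""
    assignments = [
        f"{col.name} = {col.dataflow_expression}"
        for col in columns
        if col.is_derived  # regular source columns are skipped
    ]
    if not assignments:
        raise ValueError("no derived columns to render")
    return f"{upstream} derive({', '.join(assignments)}) ~> {step_name}"


account_columns = [
    ColumnMetadata("AccountName"),
    # The expression below is made up for illustration only.
    ColumnMetadata("MyDerivedColumn", is_derived=True,
                   dataflow_expression="upper(AccountName)"),
]

script_line = render_derive(account_columns, "AccountSource", "DerivedColumn1")
print(script_line)
```

Only columns flagged as derived end up in the `derive(...)` step; the regular source column passes through untouched.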

In the first screenshot, the 'solve order' property was also set - with a value of 0.

The solve order directs BimlFlex to generate the logic in a certain (incremental) order. A derived column with a lower solve order will be created in an earlier Derived Column Transformation, and one with a higher solve order in a later one. The exact numbers do not matter, only that some are higher or lower than others.

If multiple derived columns are defined with the same solve order, they will be generated in the same Derived Column Transformation. This makes it possible to break apart complex logic into separate steps, and use them for different purposes.
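The grouping behaviour described above can be sketched in a few lines of Python. This is an illustration under assumptions, not how BimlFlex implements it: the `plan_transformations` function and the column names are hypothetical, and each derived column is reduced to a (name, solve order) pair.

```python
from itertools import groupby


def plan_transformations(derived_columns):
    """Group (name, solve_order) pairs into ordered transformation steps.

    Columns sharing a solve order land in the same step; steps are
    ordered from the lowest solve order to the highest.
    """
    # sorted() is stable, so columns with equal solve order keep their
    # original relative order.
    ordered = sorted(derived_columns, key=lambda col: col[1])
    return [
        [name for name, _ in group]
        for _, group in groupby(ordered, key=lambda col: col[1])
    ]


# Hypothetical derived columns; only the relative order of the numbers matters.
derived = [
    ("FullName", 0),
    ("NameLength", 10),
    ("NameUpper", 0),
    ("IsLongName", 25),
]

steps = plan_transformations(derived)
for position, names in enumerate(steps, start=1):
    print(f"Derived Column Transformation {position}: {', '.join(names)}")
```

Here 'FullName' and 'NameUpper' share solve order 0 and so end up in the first step, while 'NameLength' and 'IsLongName' each get a later step of their own. Replacing 10 and 25 with 1 and 2 would produce the same plan.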

Derived columns with a solve order are one way to define potentially complex logic in the metadata, and to use that metadata to generate consistent output that meets the requirements.
