Dax is the heart of the Power BI solution. This blog will discuss how we can achieve Dax Performance Tuning. DAX Studio can be used to monitor the performance of DAX, and the same tool was used to see the performance differences of different DAXs in this blog.
You can get more information related to Power BI Optimization here. You can also read the Microsoft blog for some tips related to performance tuning.
Power BI Dax Performance Tuning
Here is the list of some Dax examples and suggestions that can be used for DAX performance tuning:
Use of Divide Function
The divide is an optimized function to perform division instead of/with the IF statement to validate the denominator.
Avoid Calculated Columns
Use calculated columns only when necessary. Calculated columns will slow down the Dataset Refresh time, and it can also cause low compression.
Avoid Overused of Calculate Function
Calculate is a beneficial function in Power BI Dax; however, it can slow down the performance if overused.
Use of Filter Conditions
Instead of using the table in the filter argument, use a column to filter the CALCULATE and CALCULATETABLE functions.
[GSV with Table Filter] = CALCULATE([Invoice – Value (GSV)], FILTER ( FactInvoice, FactInvoice[Cases] > 1000 ))
/*Execution Time = 107 ms (Slow)*/
[GSV with Column Filter] = CALCULATE([Invoice – Value (GSV)], KEEPFILTERS ( FILTER ( ALL(FactInvoice[Cases]), FactInvoice[Cases] > 1000 )))
/*Execution Time = 67 ms (Fast)*/
In the above example, [GSV with Column Filter] Dax performs 40% better than [GSV with Table Filter] by using a column filter instead of a table filter in Calculate function.
Reducing Context Transitions
The storage engine can only compute more specific grouping, such as aggregations and grouping over columns; however, Formula Engine is responsible for all other computations. Suppose the expression computed during the iteration is simple for the storage engine. In that case, performance is typically good otherwise large data cache and high frequency of calls between the storage and formula engines. These calls can slow down the storage engine execution, and it disables the use of the storage engine cache for the data cache produced. Removing this can optimize the query execution.
Distributor Discount (Slow) = SUMX(PRODUCT, [TY_Sales_PCS] * (2* PRODUCT[UOM_LISTPRC]/100))
/*Execution Time: 418 ms */
Distributor Discount (Fast) = SUMX(VALUES(PRODUCT[UOM_LISTPRC]), [TY_Sales_PCS] * (2* PRODUCT[UOM_LISTPRC]/100))
/*Execution Time = 242 ms */
In the above example, Dax [Distributor Discount (Slow)] is slow due to the transition between the storage engine and formula engine for each row of the product table; however, in the [Distributor Discount (Fast)] measure, the transition has occurred for only distinct values of column [UOM_LISTPRC]. We are gaining 42% performance by reducing the number of transitions between Storage and Formula Engine.
Avoid Multiple Executions by using VAR.
When the same sub-expression must be evaluated numerous times in DAX Expression, it is best to store the result of the sub-expression in a variable and reference the variable in DAX Expression.
GSV YOY (Slow) =
IF (
NOT ISBLANK ( [TY_GSV] )
&& NOT ISBLANK ( [LY_GSV] ),
[TY_GSV] - [LY_GSV]
)
/* Execution Time: 343 ms */
GSV YOY (Fast) =
VAR TYGSV = [TY_GSV]
VAR LYGSV = [LY_GSV]
RETURN IF ( NOT ISBLANK(TYGSV) && NOT ISBLANK(LYGSV), TYGSV - LYGSV)
/* Execution Time: 70 ms */
We get 80% performance using VAR to calculate the YOY measure in the above example.
Avoid Nested Iterators
Only the innermost iterator can be executed in a single storage engine query when there are nested iterators in DAX. The outer iterators require additional storage queries or larger materialization, so we should try to minimize the nested iterators using summarize and other functions.
Distributor Discount (Slow) =
SUMX (
'Customer',
SUMX (
'Product',
SUMX (
RELATEDTABLE ( FactInvoice ),
FactInvoice[Pieces]
* (
2
+ IF ( Customer[NTN Flag] == "Yes", 1, 0 )
) * PRODUCT[UOM_LISTPRC] / 100
)
)
)
/* Execution Time: 1757 ms */
Distributor Discount (Fast) =
SUMX (
SUMMARIZE ( 'FactInvoice', Customer[NTN Flag], PRODUCT[UOM_LISTPRC] ),
CALCULATE ( SUM ( FactInvoice[Pieces] ) )
* (
2
+ IF ( Customer[NTN Flag] == "Yes", 1, 0 )
) * PRODUCT[UOM_LISTPRC] / 100
)
/* Execution Time: 74 ms */
We get the distributor discount in the [Distributor Discount (Slow)] measure by using Nested Dax operators; however, in the [Distributor Discount (Fast)], we remove the nested iterators by using the Summarize function and get 99% faster execution. Please remember that this percentage can vary depending on the number of entries in your data mart’s dimensions, fact table, and filters.
Avoid IF in Iterators
When used in a larger iterator, the IF statement will swap calls between the storage and formula engines, which may be costly for total query execution. Please keep in mind that FE always runs the IF function.
We used the IF statement to evaluate the Dax expression in the previous example. We converted the logic in the preceding example to replace the IF line. Please remember that the execution duration in the preceding example has been increased. Still, we should attempt to remove the IF statement, compare the results on various parameters, and then make a decision.
You can read the Graph Database technology by clicking here.