3. Marco Pozzan
Working with SQL Server since version 2000
Working in BI since 2005
President of the community 1nn0va
(www.innovazionefvg.net)
Project manager at Servizi CGN (www.cgn.it)
SQL Server and DWH Consultant
References:
twitter: @marcopozzan
email: info@marcopozzan.it
site: www.marcopozzan.it
5. What is PowerPivot?
Free add-in for Microsoft Excel 2010 and 2013
Different versions for 32/64 bit (4GB limit)
Does not need SQL Server or other prerequisites
Very powerful analysis engine based on SSAS
in SQL Server 2012
No API available to control it
No security available
Always impersonates current user
6. Versions
Installed:
Client-side: inside Excel
Server-side: built on SharePoint 2012 or SQL
Server 2012 (Tabular)
In the client-side version, the SSAS engine runs
in-process with Excel
7. SSAS 2012
Uses the VertiPaq engine, a columnar
database with high compression
Works completely in memory
No I/O, aggregates, or other…
IMBI = New way of thinking about the
algorithms
9. Advantages (PowerPivot)
Fast
No ETL (Power Query)
Metadata (model)
Integration of heterogeneous sources
Sharing
Especially with SharePoint
Expressiveness
Relationships and DAX
11. What is DAX?
Designed to work within a PivotTable
Programming language of Tabular and
PowerPivot
Resembles Excel (more or less)
No concept of «row» and «column»
Different type system
Mix of MDX, SQL and Excel
12. DAX Types
Non numerical:
String
Binary Objects (Power View)
Numerical:
Currency
Integer
Real
DateTime
(integer part: days since 30/12/1899; decimal part: fraction of a day)
Boolean
13. Type Handling
Operators are not strongly typed ("1"+1)
Operator overloading (be careful)
Example
1 & 2 = "12"
"1" + "2" = 3
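A minimal sketch of the two conversions above, written as measures (the measure names are hypothetical). The operator, not the type of the operands, decides which conversion happens:

```dax
-- & is always string concatenation: both numbers are converted to text
Concatenated := 1 & 2       -- "12"

-- + is always arithmetic addition: both strings are converted to numbers
Added := "1" + "2"          -- 3
```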
14. Columns in DAX 1/2
'TableName'[ColumnName]
=FactInternetSales[OrderDate]
Quotes can be omitted if the table name does
not contain spaces (don't do it)
15. Columns in DAX 2/2
TableName can be omitted; the column is then looked up
in the current table
Avoid this, as it makes formulas hard to
understand
=[OrderDate]
Brackets cannot be omitted
16. Calculated Columns
Computed using DAX and persisted in the
database
Can use other columns
Always computed for the current row
FactInternetSales[OrderDate] means
The value of the OrderDate column
In the FactInternetSales table
For the current row
Different for each row
17. Measures (Calculated Fields)
Do not work row by row
Written using DAX
Not stored on the database
Use tables and aggregators
Do not have the «current row»
The following formula cannot be written as a measure
=FactInternetSales[OrderDate]
An aggregator is required
:=SUM(FactInternetSales[OrderDate])
18. Define the right column names
If you rename a column, the measures that use it
must be changed manually
So define the right names from the start
19. Calculated columns and Measures
Suppose you want to calculate the margin
with a calculated column:
=FactInternetSales[SalesAmount]-FactInternetSales[TotalProductCost]
The Margin column can then be aggregated with a
measure
SUMofMargin:=SUM(FactInternetSales[Margin])
20. Calculated columns and Measures
Margin as a percentage of sales (Margine%)
=FactInternetSales[Margin] / FactInternetSales[SalesAmount]
This expression is not correct when it is
aggregated
The measure must be written like this
Margine%:=SUM(FactInternetSales[Margin]) /
SUM(FactInternetSales[SalesAmount])
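A small worked example, using made-up numbers, of why the ratio has to be computed on the sums rather than by aggregating the row-level ratio column:

```dax
-- Hypothetical data, two rows of FactInternetSales:
--   SalesAmount = 100, Margin = 50   -> row ratio 0.50
--   SalesAmount = 300, Margin = 30   -> row ratio 0.10
-- Summing the row ratios gives 0.60 and averaging them gives 0.30;
-- neither is the real margin of 80 / 400 = 0.20.

Margine% :=
SUM ( FactInternetSales[Margin] ) / SUM ( FactInternetSales[SalesAmount] )
```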
21. Measures rules and convention
Define the table to which the measure
belongs
Measures are global to the model
There cannot be two measures with the same name,
even in different tables
A measure can be moved from one table to another; this
cannot be done with calculated columns
Do not refer to a measure with its table name: it is
easily confused with a calculated column
22. Summary 1/2
Columns consume memory and measures
consume CPU
They are calculated at different times
They have different purposes
They are structured differently
They are managed in different ways
23. Summary 2/2
Use measures (90%) when you
Calculate ratios
Calculate percentages
Need complex aggregations
Use columns (10%) when
A slicer or filter needs the values
The expression is calculated on the current row
24. Counting Values
COUNTROWS: counts the rows in a table
COUNTBLANK: counts blanks
COUNTA: counts anything except blanks
COUNT: only for numeric columns
Compatibility with Excel
DISTINCTCOUNT: performs a distinct count
In Multidimensional this needs a measure group with a
distinct count measure, which is slow as a snail
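A short sketch of how these counting functions might look as measures. The measure names and the CustomerKey and ShipDate columns are assumptions for illustration, not taken from the slides:

```dax
-- Number of rows in the table
SalesRows := COUNTROWS ( FactInternetSales )

-- Number of different customers (CustomerKey is an assumed column name)
DistinctCustomers := DISTINCTCOUNT ( FactInternetSales[CustomerKey] )

-- Rows where the assumed ShipDate column is blank
MissingShipDates := COUNTBLANK ( FactInternetSales[ShipDate] )
```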
25. Errors in DAX 1/2
1+2 always works
[SalesAmount]/[Margin] might fail
Causes of errors
Conversion errors
Arithmetical operations
Empty or missing values
ISERROR (Expression) returns true or
false, depending on the presence of an error
during evaluation
26. Errors in DAX 2/2
IFERROR (Expression, Alternative) returns
Alternative in case of error. Useful to avoid
writing the expression twice
Both IFERROR and ISERROR are very slow,
so be careful when using them in calculated
columns
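To illustrate the point about writing the expression twice, here is a sketch of the same calculated column in FactInternetSales written both ways. The choice of BLANK () as the alternative value is an assumption:

```dax
-- With ISERROR the expression has to be written twice
= IF (
    ISERROR ( FactInternetSales[SalesAmount] / FactInternetSales[Margin] ),
    BLANK (),
    FactInternetSales[SalesAmount] / FactInternetSales[Margin]
)

-- With IFERROR it is written once
= IFERROR (
    FactInternetSales[SalesAmount] / FactInternetSales[Margin],
    BLANK ()
)
```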
27. Aggregation Functions
Work only on numeric columns
Aggregation functions:
SUM
AVERAGE
MIN
MAX
Aggregate columns only, not expressions
SUM(Orders[Quantity]) is valid
SUM(Orders[Quantity] * Orders[Quantity]) is not valid
28. The X aggregation functions 1/2
Iterate over the table and evaluate the
expression for each row
Always take two parameters: the table to
iterate and the formula to evaluate
SUMX, AVERAGEX, MINX, MAXX
SUMX (
Sales,
Sales[Price] * Sales[Quantity]
)
29. The X aggregation functions 2/2
First evaluate the inner expression for each row and
then compute the sum
The columns must all be in the same table,
or use RELATED (if there is a relationship)
They are very slow but do not use memory
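A sketch of an X function that reads a column from another table through RELATED. It assumes a many-to-one relationship from Sales to DimProduct, which the slides do not state; the measure name is hypothetical:

```dax
-- For each row of Sales, RELATED follows the relationship to DimProduct
SalesAtListPrice :=
SUMX (
    Sales,
    Sales[Quantity] * RELATED ( DimProduct[ListPrice] )
)
```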
30. Alternative to the X functions
An alternative to the X functions:
Create a calculated column
Aggregate on that column
Very fast but uses memory
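The two steps could look like this for the SUMX example on Sales. LineAmount and TotalAmount are hypothetical names:

```dax
-- Calculated column in the Sales table, named LineAmount:
-- computed once per row and stored in memory
= Sales[Price] * Sales[Quantity]

-- Measure that aggregates the stored column
TotalAmount := SUM ( Sales[LineAmount] )
```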
33. Information Functions
Completely useless (they do not take
expressions, only columns)
ISNUMBER
ISTEXT
ISNONTEXT
Useful
ISBLANK
ISERROR
"If we, who created the column, do not know
whether it is a number or text, who
should know?" (Alberto Ferrari)
34. DIVIDE Function
Checks that the denominator is not 0
Instead of
IF( Sales[Price] <> 0, Sales[Quantity] / Sales[Price], 0)
write
DIVIDE(Sales[Quantity], Sales[Price], 0)
35. Date Functions
Many useful functions:
DATE, DATEVALUE, DAY, EDATE, EOMONTH,
HOUR, MINUTE, MONTH, NOW, SECOND,
TIME, TIMEVALUE, TODAY (interesting!!!),
WEEKDAY, WEEKNUM, YEAR, YEARFRAC
Time intelligence functions
36. Evaluation Context 1/3
Distinguishes DAX from any other language
Similar to the “where clause” of an
MDX query in SSAS
The contexts under which a formula is evaluated:
Filter Context, Row Context
37. Evaluation context 2/3
Filter Context:
Set of active rows for the computation
The filter that comes from the PivotTable
Defined by slicers, filters, columns, rows
One for each cell of the PivotTable
38. Evaluation context 3/3
Row Context:
Contains a single row
Current row during iterations
Defined by an X function or a calculated column
definition, not by PivotTables
This concept is new compared with MDX,
which does not work leaf by leaf but only on the
context.
39. The two contexts always exist
Filter context:
Filters tables
Might be empty (all the tables are visible)
It is used by aggregate functions
In a calculated column it is all the tables, because
there is no PivotTable
Row context:
Iterates the active rows in the filter context
Might be empty (there is no iteration running)
41. Evaluation Context
Filter context:
Is propagated through relationships, from the one side to the many side
The direction of the relationships is very important. It is
different from SQL (inner, left, ...)
Applied only once (+ performance)
Row context:
Does not propagate over relationships
Use RELATED (opens a new row context on the
target)
Applied for each row (- performance)
43. Table Functions
FILTER (adds new conditions. It is an iterator!!!)
ALL
(Removes all conditions from a table. Returns all rows
from a table)
Useful to calculate ratios and percentages
Removes all filters from the specified columns in the
table
VALUES (values of a column, including blanks)
RELATEDTABLE (all rows related to the current row)
All of these functions return a table
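A sketch of FILTER and ALL used as table arguments of SUMX. The measure names and the 1000 threshold are illustrative assumptions:

```dax
-- FILTER iterates the table and keeps the rows that satisfy the condition
LargeSales :=
SUMX (
    FILTER ( FactInternetSales, FactInternetSales[SalesAmount] > 1000 ),
    FactInternetSales[SalesAmount]
)

-- ALL returns every row of the table, ignoring the filter context,
-- which gives the denominator for a percentage of the total
PctOfAllSales :=
SUM ( FactInternetSales[SalesAmount] )
    / SUMX ( ALL ( FactInternetSales ), FactInternetSales[SalesAmount] )
```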
47. VALUES
Returns a table with a single column
containing all the values of the column
visible in the current context
SelectedYear:=COUNTROWS(VALUES(Dati[Year]))
When the result has one column and one row it can
be used as a scalar
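One possible way to use VALUES as a scalar is to guard it with a row count, so that the conversion only happens when exactly one year is visible. The measure name is hypothetical:

```dax
-- Returns the year when a single year is selected, otherwise blank
SelectedYearValue :=
IF (
    COUNTROWS ( VALUES ( Dati[Year] ) ) = 1,
    VALUES ( Dati[Year] )
)
```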
48. RELATEDTABLE
Returns only the rows of sales (Dati) related to
the current store (Store)
=COUNTROWS (RELATEDTABLE(Dati))
49. Considerations
We have seen that we can:
Add a filter on a column
Remove the filter on a full table
Mix filters
…but we cannot yet:
Ignore only a part of the filter context rather than all of it
Add a condition to the filter context or modify an
existing condition
50. Calculate
The simplest function, but complex to understand
CALCULATE(
Expression,
Filter1,
…
FilterN
)
The filters are computed first (combined with AND), then the
expression
All filters are processed in parallel and are
independent of each other
Filters replace the filter context (on a whole table
or a single column)
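A minimal sketch of the two things CALCULATE can do to the filter context: set a filter on one column, and remove the filter from one column while leaving the rest in place. The measure names are hypothetical, and the comma separator assumes a US-style locale:

```dax
-- Replaces any existing filter on DimProduct[Color] with Color = "White"
WhiteSales :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    DimProduct[Color] = "White"
)

-- Removes the filter on DimProduct[Color] only; other filters stay active
SalesAllColors :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    ALL ( DimProduct[Color] )
)
```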
52. Calculate with filter
So this formula is not correct
ProductLMC :=
CALCULATE(
SUM(FactInternetSales[SalesAmount]);
DimProduct[ListPrice] > DimProduct[StandardCost] )
Use FILTER
ProductLMC :=
CALCULATE(
SUM(FactInternetSales[SalesAmount]);
FILTER(DimProduct,
DimProduct[ListPrice] > DimProduct[StandardCost] ))
A CALCULATE filter is a boolean condition
that works on a single column
(e.g. DimProduct[Color] = "White"
or DimProduct[ListPrice] > 1000)
In the first formula there are too many
columns in the filter (ListPrice and
StandardCost)
53. Calculate – pay attention to the filter context
ProductM100:=
CALCULATE (
SUM(FactInternetSales[SalesAmount]),
FILTER(
DimProduct,
DimProduct[ListPrice] >= 100
)
)
Filter context: Color = silver becomes Color = silver AND ListPrice >= 100
DimProduct is evaluated in the original filter
context, before CALCULATE is evaluated
54. Calculate – pay attention to the filter context
ProductM100_Bis:=
CALCULATE (
SUM(FactInternetSales[SalesAmount]),
FILTER(
ALL(DimProduct),
DimProduct[ListPrice] >= 100
)
)
Filter context: Color = silver becomes ListPrice >= 100 (all columns …)
In the new filter context SUM sees all the
rows with ListPrice >= 100, because “Color = silver” was removed
55. Earlier
Returns a value from the previous row
context:
=SUMX(
FILTER(Sales;
Sales[Date]<=EARLIER(Sales[Date]) &&
YEAR(Sales[Date]) = YEAR(EARLIER(Sales[Date]))
)
;Sales[Value]
)
In a row context only one variable is available
FOR A = 1 TO 5
  FOR B = 1 TO 5
    IF A < B THEN
  NEXT
NEXT
FOR A = 1 TO 5
  FOR A = 1 TO 5
    IF EARLIER ( A ) < A THEN
  NEXT
NEXT
56. Calculate – Context transition
In a DimProduct calculated column, are these two expressions the same?
= SUM(FactInternetSales[SalesAmount])
= CALCULATE(SUM(FactInternetSales[SalesAmount]))
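The two expressions give different results. An annotated sketch of the same two calculated columns, assuming DimProduct is on the one side of a relationship with FactInternetSales:

```dax
-- No CALCULATE: the row context does not filter FactInternetSales,
-- so every row of DimProduct shows the grand total of sales
= SUM ( FactInternetSales[SalesAmount] )

-- With CALCULATE: the current row becomes a filter context
-- (context transition), so each row shows the sales of that product only
= CALCULATE ( SUM ( FactInternetSales[SalesAmount] ) )
```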
57. ABC and Pareto Analysis
80% of effects come from 20% of the causes
80% of sales come from 20% of customers
Pareto analysis is the basis of ABC
classification
Class A contains the items making up the first 70% of total value
Class B contains the items making up the next 20% (from 70% to 90%)
Class C contains the items making up the remaining 10%
58. ABC and Pareto Analysis
For each row of DimProduct calculate TotalSales
=CALCULATE( SUM(FactInternetSales[SalesAmount]))
RunningTotalSales: sum the TotalSales of all products whose
total sales are greater than or equal to those of the current row
= SUMX(
FILTER(
DimProduct;
DimProduct[TotalSales] >=
EARLIER(DimProduct[TotalSales])
);
DimProduct[TotalSales]
)
59. ABC and Pareto Analysis
RunningPct: the running total as a percentage
of the total sales
=DimProduct[RunningTotalSales] / SUM(DimProduct[TotalSales])
Assign the labels A, B, C
=IF(
DimProduct[RunningPct] <= 0.7;
"A";
IF(
DimProduct[RunningPct] <= 0.9;
"B";
"C"
)
)
60. ABC and Pareto Analysis
the number of products that generate those
sales
=COUNTROWS(DimProduct)
62. Link and Book
PowerPivot
http://www.powerpivot.com
SQLBI
http://www.sqlbi.com
WebCast (Powerpivot 1.0)
http://www.presentation.ialweb.it/p29261115/
Book