There are times throughout the year when we need to keep track of how our organization's sales or profits fluctuate. For example, for retail or eCommerce companies, the winter holidays and Black Friday are the periods with the highest sales growth. For IT or service companies, on the other hand, holidays and vacation periods often bring a drop in new customers and new contracts.
To get an overview of our results, we need to evaluate the data we have recorded. That means looking at the average sales or receipts, how they are distributed, and how far they deviate from the average. We can then see whether or not our activity generated the desired results.
Tableau Software offers several types of analysis that make recorded values easier to visualize. One of the most common analyses for observing deviations is the histogram with normal distribution, which overlays a curve, also called the bell curve, on the histogram. The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean. Values close to the highest point of the curve, which sits at the mean, therefore occur more frequently than values further away.
A histogram with normal distribution lets users analyze the values in a data set and see whether or not they follow a normal distribution. The histogram displays how many records fall into each range (bin) of values. For the values to be normally distributed, they must be concentrated around the highest point of the curve and thin out on both sides. The advantage of the curve is that it highlights how the values in the data set are distributed and whether the distribution behaves normally.
Next, we will show you how to create a histogram with normal distribution by analyzing sales values from a data set.
→ In Tableau Desktop, connect to the Superstore sample data provided by Tableau.
→ Create a calculated field named Customer Count with the formula:
COUNTD([Customer Name])
→ Create an LOD expression named Sales by Customer to get the total Sales for each customer, with the formula:
{ FIXED [Customer Name]: SUM([Sales]) }
→ Create a calculated field named Mean with the formula:
{ AVG([Sales by Customer]) }
The expression has no dimension because we want the mean to be calculated across the entire data set.
→ Create a calculated field named Standard Deviation with the formula:
{ STDEV([Sales by Customer]) }
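To make the three fields above concrete, here is a minimal Python sketch of the same arithmetic outside Tableau. The order rows and customer names are hypothetical, not taken from Superstore. It assumes that Tableau's STDEV is the sample standard deviation, which is what Python's `statistics.stdev` computes.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical order rows: (customer name, sales amount).
orders = [
    ("Ann", 200.0),
    ("Ann", 300.0),
    ("Bob", 1000.0),
    ("Cara", 1500.0),
]

def sales_by_customer(rows):
    """Total sales per customer, like { FIXED [Customer Name]: SUM([Sales]) }."""
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)

totals = sales_by_customer(orders)

# Mean and sample standard deviation across all customers,
# mirroring the Mean and Standard Deviation fields.
customer_mean = mean(list(totals.values()))
customer_stdev = stdev(list(totals.values()))

print(totals, customer_mean, customer_stdev)
```

Note that both statistics are computed on one total per customer, not on the individual order rows, which is why the per-customer LOD expression comes first.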
→ Create a parameter named Size of Sales (bin). Under Data type, select Integer, and for Current value enter 500.
→ Create a new calculated field named Sales (bin) with the formula:
INT([Sales by Customer] / [Size of Sales (bin)]) * [Size of Sales (bin)]
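The binning formula rounds each customer's total down to the lower edge of its bin. A small Python sketch of the same idea follows. The function name `sales_bin` is illustrative, and the sketch assumes non-negative totals, where truncation toward zero and rounding down give the same result.

```python
def sales_bin(sales_total, bin_size=500):
    """Lower edge of the bin, like INT([Sales by Customer] / size) * size."""
    # int() truncates toward zero, which matches rounding down for totals >= 0.
    return int(sales_total / bin_size) * bin_size

for total in (499.99, 500.0, 1234.56, 2999.0):
    print(total, "->", sales_bin(total))
```

With a bin size of 500, a customer with total sales of 1234.56 lands in the bin labeled 1000, which covers totals from 1000 up to, but not including, 1500.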
→ Create a calculated field named Normal Curve to calculate the normal distribution, with the formula:
(
1/(MAX([Standard Deviation])*SQRT(2*PI()))
)
*
EXP
(
-SQUARE(MAX([Sales (bin)]) - MAX([Mean]))
/
(2 * SQUARE(MAX([Standard Deviation])))
)
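For reference, the normal density with mean μ and standard deviation σ is 1/(σ·√(2π)) multiplied by exp(−(x − μ)² / (2σ²)). In that expression the whole product σ·√(2π) sits in the denominator. The following Python sketch evaluates it directly. The function and argument names are illustrative only.

```python
import math

def normal_density(x, mean, stdev):
    """Normal (Gaussian) probability density at x."""
    # The whole product stdev * sqrt(2*pi) is the denominator.
    coefficient = 1.0 / (stdev * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * stdev ** 2)
    return coefficient * math.exp(exponent)

# The curve peaks at the mean and is symmetric around it.
print(normal_density(1000.0, 1000.0, 500.0))
print(normal_density(500.0, 1000.0, 500.0), normal_density(1500.0, 1000.0, 500.0))
```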
→ Drag Sales (bin) onto Columns and change the mark type to Bar. Right-click the field and convert it to a Dimension.
→ Drag Customer Count onto Rows.
→ Drag Normal Curve onto Rows and change its mark type to Line.
→ Right-click the second axis and select Dual Axis. Do not synchronize the axes.
→ Adjust the tooltip.
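One likely reason for leaving the axes unsynchronized is that the two measures live on very different scales: the bars count customers, while the curve is a probability density. The Python sketch below illustrates this with hypothetical per-customer totals (not Superstore data). It uses `statistics.NormalDist` from the Python standard library for the density.

```python
from collections import Counter
from statistics import NormalDist, mean, stdev

# Hypothetical per-customer sales totals (not taken from Superstore).
customer_totals = [450.0, 700.0, 900.0, 1100.0, 1200.0, 1300.0, 1800.0, 2600.0]
BIN_SIZE = 500  # plays the role of the Size of Sales (bin) parameter

def bin_of(total, size=BIN_SIZE):
    """Lower edge of the bin that a customer total falls into."""
    return int(total / size) * size

# Bars: number of customers in each bin (the Customer Count axis).
customers_per_bin = Counter(bin_of(t) for t in customer_totals)

# Line: normal density at each bin's lower edge (the Normal Curve axis).
curve = NormalDist(mu=mean(customer_totals), sigma=stdev(customer_totals))
density_per_bin = {b: curve.pdf(b) for b in sorted(customers_per_bin)}

for b in sorted(customers_per_bin):
    print(f"bin {b:>5}: customers={customers_per_bin[b]}  density={density_per_bin[b]:.6f}")
```

In this toy data the counts are whole numbers of customers, while every density value is below 0.001, so on a shared axis the curve would be flattened against zero.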
By Adelina Popescu