Business Intelligence Components and How They Relate to Power BI

When I decided to write this blog post, I thought it would be a good idea to learn a bit about the history of Business Intelligence. I searched the internet and found this page on Wikipedia. The term Business Intelligence as we know it today was coined in 1958 by Hans Peter Luhn, an IBM computer science researcher, in a paper titled “A Business Intelligence System” published in the IBM Journal of Research and Development. In the Objectives and principles section of his paper, Luhn defines business as “a collection of activities carried on for whatever purpose, be it science, technology, commerce, industry, law, government, defense, et cetera” and an intelligence system as “the communication facility serving the conduct of a business (in the broad sense)”. He then refers to Webster’s dictionary’s definition of the word intelligence as “the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal”.

It is fascinating to see how a great idea from the past laid the groundwork for something that helps us have a better life today. Isn’t what Luhn described as a Business Intelligence System precisely what we do in our daily BI processes? How cool is that?

When we talk about the term BI today, we refer to a specific set of processes for transforming raw data into valuable and understandable information for various business sectors (such as sales, inventory, law, etc.). These processes help businesses make data-driven decisions based on the facts hidden in their data.

Like everything else, BI processes have improved a lot over time. In this post, I will try to make some sensible links between today’s BI components and Power BI.

Generic Components of Business Intelligence Solutions

Generally speaking, a BI solution contains various components and tools that may vary from solution to solution depending on the business requirements, data culture and the organisation’s maturity in analytics. But the processes are very similar to the following:

  • We usually have multiple source systems built on different technologies containing the raw data, such as SQL Server, Excel, JSON, Parquet files, etc.
  • We integrate the raw data into a central repository to reduce the risk of interrupting the source systems by constantly connecting to them; instead, we load the data from the data sources into the central repository.
  • We transform the data to optimise it for reporting and analytical purposes and load it into another storage layer. We aim to keep the historical data in this storage.
  • We pre-aggregate the data to certain levels based on the business requirements and load it into another storage layer. We usually do not keep the whole history in this storage; instead, we only keep the data required to be analysed or reported on.
  • We create reports and dashboards to turn the data into useful information.

With the above processes in mind, a BI solution consists of the following components:

  • Data Sources
  • Staging
  • Data Warehouse/Data Mart(s)
  • Extract, Transform and Load (ETL)
  • Semantic Layer
  • Data Visualisation

Data Sources

One of the main goals of running a BI project is to enable organisations to make data-driven decisions. An organisation might have multiple departments, such as sales, inventory, marketing, finance, and health and safety, each using various tools to collect relevant data every day.

The data generated by the business tools is stored somewhere, using various technologies. A sales system might store its data in an Oracle database, while the finance system stores its data in a SQL Server database in the cloud. The finance team also generates some data stored in Excel files.

The data generated by these different systems is the source data for a BI solution.

Staging

We usually have multiple data sources contributing to the data analysis in real-world scenarios. To be able to analyse all the data sources, we require a mechanism to load the data into a central repository. The main reason is that the business tools constantly need to store data in their underlying storage; therefore, frequent connections to the source systems can put our production systems at risk of becoming unresponsive or performing poorly. The central repository where we store the data from the various data sources is called Staging. We usually store the data in staging with no or minor changes compared to the data in the data sources. Therefore, the quality of the data stored in staging is usually low and requires cleansing in the subsequent phases of the data journey. In many BI solutions, we use staging as a temporary environment, so we delete the staging data regularly after it has been successfully transferred to the next stage, the data warehouse or data marts.

If we want to indicate the data quality with colours, it is fair to say the data quality in staging is Bronze.

Data Warehouse/Data Mart(s)

As mentioned before, the data in staging is not in its best shape and format; it is generated disparately by multiple data sources. So analysing the data and creating reports on top of it would be challenging, time-consuming and expensive. We need to find the links between the data sources, then cleanse, reshape and transform the data to make it more optimised for data analysis and reporting activities. We store the current and historical data in a data warehouse, so it is pretty normal to have hundreds of millions or even billions of rows of data over a long period. Depending on the overall architecture, the data warehouse might contain encapsulated business-specific data in a data mart or a collection of data marts. In data warehousing, we use different modelling approaches, such as the Star Schema. As mentioned earlier, one of the primary purposes of having a data warehouse is to keep the history of the data. This is a massive benefit, but it comes with a cost: as the volume of data in the data warehouse grows, analysing it becomes more expensive. The data quality in the data warehouse or data marts is Silver.

Extract, Transform and Load (ETL)

In the previous sections, we mentioned that we integrate the data from the data sources into the staging area, then cleanse, reshape and transform it and load it into a data warehouse. To do so, we follow a process called Extract, Transform and Load, or ETL for short. As you can imagine, ETL processes are usually pretty complex and expensive, but they are an essential part of every BI solution.

Continue reading “Business Intelligence Components and How They Relate to Power BI”

Exporting Power BI Desktop Data Using Power BI Exporter

I am excited to announce that our amazing team at Data Vizioner has developed a lightweight and easy-to-use external tool for Power BI Desktop named Power BI Exporter. Power BI Exporter is a free community tool, available for download here. With External Tools now generally available, you can expect to see more and more external tools built by our amazing community members. I was personally stunned to see so many amazing ideas turned into useful tools. Learn more about Microsoft’s announcement of External Tools general availability here. I am humbled to be named in that announcement, so thank you to Microsoft for that.

Let me start with a little background. In the past few years, I wrote a series of blog posts about exporting data from Power BI Desktop and the Power BI Service. Those posts are quite popular; they are among the top 10 most visited posts on my website, contributing about 15% of my website’s visitors. Here are the previous posts I wrote on this specific topic:

Apart from my website statistics showing that many people want to export data from Power BI Desktop, some of my customers have asked the same question. They wanted to export the curated data from their data model within Power BI Desktop to CSV format and make that curated data available to their other platforms. While all the methods mentioned in my previous blog posts work, some users still find them complex. So I thought we could make it better: a straightforward tool that exports the data with only two clicks. So we started building Power BI Exporter as a micro-project. We then added some more ideas to the original idea of only exporting the data. We thought it would be good to export the data along with the table names, column names and relationships. With that information handy, we can quickly rebuild the same data model as the one whose data we exported, but using the CSV files as the data sources. The other idea was to pack everything into a ZIP file on the fly, so we end up with a single ZIP file including the tables, columns and relationships. As a result, the first version of Power BI Exporter was born. In this post, I explain how it works.

Downloading and Installing Power BI Exporter

You can download Power BI Exporter from its official webpage on the Data Vizioner website. You need to enter your email address and then click the Download button, as shown in the following image:

Downloading Power BI Exporter from Data Vizioner Website
Continue reading “Exporting Power BI Desktop Data Using Power BI Exporter”

Power BI Governance, Good Practices, Part 1: Multiple Environments, Data Source Certification and Documentation

Power BI is taking off, and it’s fast becoming the most popular business intelligence platform on the market. It’s easy to engage with and get professional results quickly, making it the perfect tool for organisations looking to beef up their BI prowess and make data-driven decisions throughout the organisation.

Gartner 2020 Magic Quadrant for Analytics and Business Intelligence Platforms
Did you know that Gartner named Microsoft as the 2020 leader in their Magic Quadrant for Analytics and Business Intelligence Platforms?

In this post, we’re going to look at three good practices for implementation and give you the tips you need to avoid common pitfalls, so you are on the fast track to success with Power BI in your organisation.

1. Set up multiple environments

When working on a Power BI implementation project, it’s wise to have multiple environments to manage the lifecycle of your BI assets. Below we’ve listed several environments that should be considered depending on the complexity of the project and your organisation’s needs.

Development (aka Dev)

Being able to keep on top of the many reports you’re testing, and having the ability to track changes that occur, is essential as you get set up. Without a specific Dev environment, your production environment will quickly become overwhelmed with assets, making it hard to maintain and manage.

When working in the dev environment, make sure that you have data sources specifically for development. We’ve seen production data used in dev on many occasions, which can lead to serious privacy and data sovereignty issues. Your dev data sources shouldn’t contain sensitive data.

These development environments can be on your local network or in cloud storage (like OneDrive for Business or GitHub). It is recommended to have separate Workspaces in Power BI Service for each environment.

Tip: The data sources of all reports published to the Power BI Service from Dev should be suitable for development use only and should not include confidential data.

User Acceptance Testing (aka UAT) 

The people who will be using the reports daily are the ones who should be testing them – they know the business best and will be able to identify opportunities and gaps that the development team may not spot themselves. By bringing the user into the process early on, you maximise the value added to the business.

User acceptance testing is the last phase of testing. The UAT environment should only be created once the solution has been fully tested in Dev and approved by senior Power BI developers.

Continue reading “Power BI Governance, Good Practices, Part 1: Multiple Environments, Data Source Certification and Documentation”

Highlighting Below Avg Sales per Hierarchy Level with SWITCH() and ISINSCOPE() DAX Functions in Power BI

A wee while ago, I was working on a project where the customer had a conditional formatting requirement on a column chart.
They wanted to format the columns in the chart conditionally, based on the average value at the hierarchy level you are at.
Here is the scenario: I have a Calendar hierarchy as below:

  • Calendar Hierarchy:
    • Year
    • Semester
    • Quarter
    • Month
    • Day

I use the “Adventure Works DW2017, Internet Sales” Excel file as my source in Power BI Desktop. If I want to visualise “Total Sales” over the above “Calendar Hierarchy”, I get something like this:

Line Chart in Power BI, Total Sales by Year
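
For reference, the “Total Sales” measure used throughout this post is assumed to be a simple sum over the sales amount column of the Internet Sales table; the table and column names below are illustrative and may differ in your model:

Total Sales = SUM ( 'Internet Sales'[SalesAmount] )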

Now I activate the “Average Line” from the “Analytics” tab of the line chart.

Adding Average Line to Line Chart in Power BI

When I drill down in the line chart, the average line shows the average of the particular hierarchy level I am in. It is quite cool that I get the average based on the level I’m in, code-free.

Power BI, Drilling Down in Line Chart

Easy, right?

Now, the requirement is to show the above behaviour in a “Column Chart” (yes! visualising a time series with a column chart, that’s what the customer wants) and highlight the columns with values below the average amount in Orange, leaving the rest in the default theme colour.

So, I need to create measures to conditionally format the column chart. I also need to add a bit of intelligence to the measures to:

  • Detect which hierarchy level I am in
  • Calculate the average of sales for that particular hierarchy level
  • Change the colour of the columns that are below the average amount

Let’s get it done!

Detecting Hierarchy Level with ISINSCOPE() DAX Function

Microsoft introduced the ISINSCOPE() DAX function in the November 2018 release of Power BI Desktop. Soon after the announcement, Kasper de Jonge wrote a concise blog post about it.

So I’ll try to keep it as simple as possible. Here is how it works: the ISINSCOPE() function returns “True” when the specified column is the current level in a hierarchy of levels. As stated earlier, we have a “Calendar Hierarchy” including the following 5 levels:

  • Year
  • Semester
  • Quarter
  • Month
  • Day

So, to determine if we are in each of the above hierarchy levels we just need to create DAX measures like below:

ISINSCOPE Year		=	ISINSCOPE('Date'[Year])
ISINSCOPE Semester	=	ISINSCOPE('Date'[Semester])
ISINSCOPE Quarter	=	ISINSCOPE('Date'[Quarter])
ISINSCOPE Month		=	ISINSCOPE('Date'[Month])
ISINSCOPE Day		=	ISINSCOPE('Date'[Day])
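
Before we test these, here is a rough preview of where these measures are heading: a single measure that uses SWITCH(TRUE(), …) together with the ISINSCOPE() tests to detect the current level, work out that level’s average sales, and return “Orange” for below-average columns. The following is only a minimal sketch assuming the [Total Sales] measure above; it is not necessarily the exact measure developed in the rest of the post, and the averaging logic (for instance, how ALLSELECTED() interacts with drill-down filters) is refined later:

Sales Column Colour =
VAR LevelAvg =
    SWITCH (
        TRUE (),
        // check the deepest level first, as parent levels can also be in scope
        ISINSCOPE ( 'Date'[Day] ),      AVERAGEX ( ALLSELECTED ( 'Date'[Day] ), [Total Sales] ),
        ISINSCOPE ( 'Date'[Month] ),    AVERAGEX ( ALLSELECTED ( 'Date'[Month] ), [Total Sales] ),
        ISINSCOPE ( 'Date'[Quarter] ),  AVERAGEX ( ALLSELECTED ( 'Date'[Quarter] ), [Total Sales] ),
        ISINSCOPE ( 'Date'[Semester] ), AVERAGEX ( ALLSELECTED ( 'Date'[Semester] ), [Total Sales] ),
        ISINSCOPE ( 'Date'[Year] ),     AVERAGEX ( ALLSELECTED ( 'Date'[Year] ), [Total Sales] )
    )
RETURN
    // returning blank leaves the column in the default theme colour
    IF ( [Total Sales] < LevelAvg, "Orange" )

A measure like this would then be applied through the column chart’s Data colors conditional formatting (Format by: Field value).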

Now let’s do an easy experiment.

  • Put a Matrix on the canvas
  • Put the “Calendar Hierarchy” to “Rows”
  • Put the above measures in “Values”

Detecting Year, Semester, Quarter, Month and Day hierarchy levels with ISINSCOPE in Power BI Desktop

As you see, the “ISINSCOPE Year” measure shows “True” for the “Year” level. Let’s expand to the next level and see how the other measures work:

Continue reading “Highlighting Below Avg Sales per Hierarchy Level with SWITCH() and ISINSCOPE() DAX Functions in Power BI”