AdventureWorks Sales Report (End-to-end ETL process and Data Visualization using Power BI)

Explore the dashboard here!

Truc Phan
9 min readJan 22, 2024

Introduction

As an aspiring Data Analyst with two years of experience using Tableau, my transition to the United States brought to light a significant shift in the industry’s preference towards Power BI. Currently 97% of Fortune 500 companies use Power BI for data visualization, making proficiency in this tool a key requirement for most Data Analyst positions.

Reflecting on my experience from two years ago, where it took me a month to master Tableau and create my first Tableau dashboard, I set a new challenge upon my arrival in the US. This time, my goal was to become proficient in Power BI and develop my first Power BI dashboard within just two weeks. While it may sound ambitious initially, dedicating 100% of my time, effort, and determination allowed me to successfully achieve this milestone. I am proud to present the culmination of this effort — the completion of my first Power BI Report project, featuring the Adventure Works Data set, a bike manufacturer and seller.

In this article, I will guide you through an end-to-end ETL (Extract-Transform-Load) and data visualization process using Power BI, with a touch of Snowflake integration.

Understand the Business Requests

Understanding the context of data visualization is paramount, as highlighted in the book “Storytelling with Data” by Cole (2015). This emphasizes that comprehending the reasons behind the communication need is a crucial step, often surpassing the importance of the visualization process itself. Therefore, prior to delving into the intricacies of data manipulation, it is vital to understand relevant business requests from key stakeholders such as the Sales Manager, Customer Success Manager, and Product Manager at AdventureWorks, as outlined in the table below:

These requests serve as the foundation for the subsequent data analysis and visualization processes, ensuring that the generated insights align with the specific needs and objectives of each stakeholder. This proactive approach fosters a more targeted and impactful data visualization outcome.

Data set

The AdventureWorks dataset, generously offered by Microsoft as a freely accessible public database, can be downloaded here. This dataset is presented in the format of an Excel workbook, comprising seven distinct tables or tabs. Notably, the Sales table takes on the role of a fact or numeric table, while the remaining tables — Product, Customer, Sales Order, Sales Territory, Reseller, and Calendar Date — function as dimension or descriptive tables.

Sales_data table preview of the database as in Excel workbook.

Extract

The initial phase of the ETL (Extract-Transform-Load) process involves extracting data from diverse sources. This entails accessing the data sources, establishing connections, and retrieving the necessary data. In the scope of this project, all the data is conveniently housed within a single Excel file. However, in real-world scenarios, data often resides in multiple databases and files, spanning platforms like Microsoft SQL Server, Microsoft Excel, and cloud services such as Snowflake or AWS. Despite this diversity in sources, the data remains interrelated.

Leveraging the capabilities of Power BI, we can seamlessly connect to and extract data from various databases and files. The Get Data image below illustrates the array of connectors offered by Power BI, facilitating connections to a multitude of sources.

In the current project, the raw dataset is initially available as an Excel workbook, and Power BI simplifies the extraction process by directly importing the data using the Excel Workbook option. This results in the importation of seven distinct files, each manifesting as its own table within Power BI. This structured approach sets the stage for subsequent transformations and analysis of the interconnected data.

Get Data in Power BI

Here’s an example illustrating the integration of Snowflake cloud with Power BI in the context of the AdventureWorks dataset. After I upload the data from the provided Excel file to Snowflake, I establish a seamless connection between Power BI and Snowflake for extracting the uploaded datasets. Power BI offers a dedicated connector for Snowflake, streamlining the process of accessing and incorporating data from this cloud-based database.

Connecting Snowflake and Power BI

This integration ensures that the AdventureWorks dataset, now stored in Snowflake, becomes directly accessible within Power BI. This dynamic connection facilitates real-time updates and analysis, allowing for a comprehensive exploration of the data within the Power BI environment.

Transform

After establishing the connection in Power BI Desktop, the Power Query Editor provides a range of capabilities for refining, cleaning and transforming the imported data.

Here I have the flexibility to perform the data cleaning tasks like modifying column or table names to make sure they are titled appropriately, converting text to numeric values, date or correct data type, looking for duplicates or missing values (which there were none in this dataset). Shaping the data correctly is crucial to tailor it to the requirements and make it suitable for reporting purposes.

After checking each individual table, I look for and set up the relationships between the tables using their primary key and foreign key. Each table has one or more columns with the word “Key” in it and the same key has the same name in the tables, thus it is very easy to manage the relationships. Power BI makes it even easier when the relationships are automatically established once the tables are loaded.

Load

The final step, load, includes sending the transformed data into the data model. After this step, I can start to use data for analytical and reporting purposes.

Data Modeling

Once my data is now accurate and consistent, I start creating a data modeling for my tables. Data modeling is a visual representation of data structures, relationships, and business rules. It is a crucial step in database design to ensure readability and efficiency.

In this project, the data modelling has the Star scheme, which represents different tables, with dimension tables connected to a central facts table. In a star schema, the facts table contains the quantitative data, while the dimension tables are supplementary descriptive information about the data in the facts table. As can be seen in a picture below, the Sales facts table is connected to six dimension tables (Product, Customer, Sales Order, Sales Territory, Reseller and Calendar Date). Most relationships between tables are one-to-many (with one on the dimension table and many on the facts table), and only one one-to-one relationship is between Sales and Sales Order table.

Data Model in Power BI

DAX Functions

After setting up my table relationships, I initiate the application of various DAX functions to analyze the dataset. I begin by formulating fundamental Key Performance Indicators (KPIs) like Sales, Profit, and Total Orders, Total Customers, laying the groundwork for calculating more advanced metrics later. For example, I use the following DAX functions.

  • Iterator Functions (SUMX): These are formulas that assess an expression for each row and then aggregate the outcomes.
  • CALCULATE(): Serving as an overriding filter, it establishes a new filter context. This was particularly valuable for determining metrics such as Previous Months Orders, Revenue, Profit, Returns, and Overall Average Price.
  • Date Functions (DateAdd, DATESINPERIOD): These date-related functions proved beneficial in establishing a 90 Day Rolling Profit, as well as determining metrics for the Previous Months Order, Profit, Revenue, and Returns.

Data Visualization

Finally, four visualization dashboards were meticulously crafted to meet the business requests elicited from Sales, Customer Success, and Product Managers. The initial page provides a comprehensive sales overview, while the subsequent two pages are dedicated to essential details and visualizations, showcasing sales over time, per customers, and per products. The final page features a map visualization illustrating sales distribution across different countries.

1. Sales Overview

This dashboard serves as the hub for company-wide KPIs, encompassing Sales, Profit, Total Orders, Return on Sales, along with detailed lists highlighting the top-selling products and the top 10 customers contributing the most to sales. Designed with the intention to offer a broad understanding of overall company performance, this dashboard provides accessibility for individuals at various levels within the organization.

Situated in the top center, a Slicer Panel enhances user interactivity, allowing for nuanced visual filtering based on parameters such as year, month, customer city, product name, and category. This feature offers a more detailed and granular examination of the crucial KPIs, tailoring the insights to specific areas of interest within the broader company performance context.

2. Customer Details

This page is a repository of multiple visuals illustrating the intricate relationship between customers and crucial business metrics. Total sales and the overall number of customers are highlighted, with year, month, customer name, customer city, and product category serving as dynamic filters.

  • The Top 10 stack bar chart unveils valuable insights, showcasing that buyers from London and Paris have been the most significant contributors to bike sales, amounting to $780K and $520K, respectively, spanning the years 2018 to 2020. Notably, the Bikes category emerges as the primary driver of sales.
  • Presented through two matrix tables are detailed breakdowns of sales figures per customer per month and per product category. These tables serve as actionable tools for the Customer Success team, enabling targeted follow-ups. For instance, customers like Jordan Turner, who consistently rank among the top buyers, can be prioritized for personalized engagement.
  • Simultaneously, customers with minimal sales can be contacted to gather feedback and understand the factors influencing their purchasing decisions, contributing to a more informed and strategic customer outreach strategy.

3. Product Details

This page offers a comprehensive and interactive exploration of the intricate relationship between Adventure Works’ products, showcasing their respective sales amounts and profits.

An innovative feature on this dashboard is the AI-embedded Q&A chat box, allowing stakeholders to pose questions and receive automatic responses generated by Power BI.

Highlighted on this page are the top products by orders and profitability, providing valuable insights for the Product team to prioritize in both production and development efforts.

Additionally, the product sales by month matrix table and the decomposition tree further empower the Product team with a nuanced understanding of the best-selling products across different months and categories. These visualizations serve as essential tools for strategic decision-making within the Product team, facilitating informed choices in product development and optimization.

4. Sales Map

This is an interactive visual of the company’s order distribution throughout the world broken down by continent.

  • Some key insights found from this were that while the United States makes up the majority of Adventure Work’s Orders (154K), Canada shares a significant share of the total orders as well (50K).
  • Additionally, no sales have taken place in Asia, Africa, or South America indicating a potential for new markets to break into.

Conclusion

In conclusion, the end-to-end journey of working on this project has been immensely gratifying. Witnessing the entire process unfold, from the project’s inception to its completion, has offered a distinct perspective on both the visuals and the underlying data. The satisfaction derived from this experience fuels my enthusiasm to undertake more such projects, delving into increasingly complex datasets in the future.

I extend my sincere appreciation to you for taking the time to read this article. Your insights and feedback are invaluable, so please feel free to share any thoughts or suggestions you may have!

--

--

Truc Phan
Truc Phan

Responses (1)