You have studied the fundamentals of Tableau. You understand the difference between dimensions and measures, you can build a bar chart, and you know how to drag and drop fields onto a worksheet. But what comes next? There is a significant gap between understanding the tools and applying them to solve a real-world problem. The next, most critical step in your journey is to apply your skills by working on case studies that you can build, analyze, and add to your professional data analyst or business intelligence portfolio. Companies are not just looking for people who know the software; they are looking for people with experience in managing an analysis from start to finish. This involves the full lifecycle of a project: preparing the data, creating the visualizations, and, most importantly, using your new knowledge to tell a compelling story and derive actionable insights.
Why a Project Portfolio Matters
A portfolio is your single greatest asset when job hunting. It is tangible proof of your abilities. A resume can claim you “know Tableau,” but a portfolio shows you can use it. It demonstrates to a potential employer that you can take a raw dataset, clean it, model it, analyze it, and present a coherent dashboard that answers a business question. This end-to-end experience is exactly what hiring managers are looking for. These projects prove you have the technical skills, the analytical mindset, and the business acumen to turn data into value. We have many case studies to offer, and this series will walk you through six of them in great detail. These projects are designed to build your skills progressively, from beginner to intermediate, and to create exceptional pieces for your portfolio.
Beginner Project 1: Analyzing Customer Churn
Our first beginner project is a classic and essential business problem: customer churn analysis. For this case study, you will put on the hat of an analyst at a fictional telecommunications company called “Databel.” Like any subscription-based business, Databel’s top priority is to reduce customer churn, which is the rate at which customers cancel their service. Your task is to build a dashboard that helps the company understand what churn is, what its current churn rate is, and, most critically, why customers are unsubscribing. Your final deliverable will be a dashboard that not only visualizes these trends but also provides clear recommendations for how to reduce churn. This project is perfect for beginners because it touches on all the fundamental skills of a BI analyst.
The Business Context: Why Churn is Critical
In any business, it is far more expensive to acquire a new customer than it is to retain an existing one. High customer churn is a “leaky bucket” that can sink a company, no matter how fast it acquires new users. For Databel, understanding the drivers of churn is the first step in fixing it. Does churn happen in the first few months? Are customers on a specific contract type leaving more often? Are they leaving because of price, or because they are not using specific services? Your analysis will provide the data-driven answers to these questions, moving the company from “gut feelings” about why people leave to a state of having concrete evidence. This evidence will allow the marketing and product teams to create targeted interventions, such as offering discounts, promoting specific services, or improving the onboarding process.
Understanding the Fictional Dataset: “Databel”
For this project, imagine you are given a dataset with information about every customer. This data is likely in a single table or a set of related tables. A typical customer table would include columns like CustomerID, Gender, Age, Tenure_in_Months, Contract_Type (e.g., Month-to-Month, One Year, Two Year), Monthly_Charges, Total_Charges, and the most important column: Churn_Status (e.g., ‘Yes’ or ‘No’). You might also have information on what services each customer has subscribed to, such as Phone_Service, Internet_Service, Online_Security, Tech_Support, and Streaming_TV. This rich dataset provides all the raw material you need to investigate the demographic, financial, and service-based factors that might be influencing a customer’s decision to leave the company.
Step 1: Data Preparation and Calculated Fields
Your first task in Tableau is to connect to this data. Once connected, you will need to perform some light data preparation. This project's core challenge is to create calculated fields for more in-depth analysis. The most important calculated field you will create is the churn rate itself. The raw data only tells you if a customer churned (Churn_Status), not the overall rate. You will create a new calculated field named Churn_Rate using a formula like: SUM(IF [Churn_Status] = 'Yes' THEN 1 ELSE 0 END) / COUNT([CustomerID]). This new measure is the foundation of your entire analysis. You might also want to create "bins" for Tenure_in_Months (e.g., 0-12 months, 13-24 months) to group customers by their time with the company, which can be done by right-clicking the Tenure field and selecting Create > Bins.
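If you prefer to type the logic out, the two pieces described above might look roughly like the following. This is a sketch that assumes the column names listed earlier; the Tenure_Group field name and its cut-offs are illustrative, offered as a manual alternative to the right-click bins.

// Churn_Rate - format the result as a percentage
SUM(IF [Churn_Status] = 'Yes' THEN 1 ELSE 0 END) / COUNT([CustomerID])

// Tenure_Group - a manual alternative to Create > Bins
IF [Tenure_in_Months] <= 12 THEN '0-12 months'
ELSEIF [Tenure_in_Months] <= 24 THEN '13-24 months'
ELSE '25+ months'
END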
Step 2: Answering Key Questions with Visuals
With your data ready, you can start building worksheets to answer specific questions. Each worksheet should focus on one question. What is the overall churn rate? Create a new sheet and simply show your Churn_Rate measure as a large text number, formatted as a percentage. This is your primary Key Performance Indicator (KPI). Next, how does churn rate differ by contract type? Create a bar chart with Contract_Type on the columns shelf and your Churn_Rate measure on the rows shelf. You will likely see that “Month-to-Month” customers churn at a much higher rate. You should create similar bar charts to visualize churn rate by Internet_Service type, Tech_Support status, and Online_Security status. These visuals will start to tell a story about why people are leaving, highlighting that customers with fewer “sticky” services are more likely to churn.
Step 3: Building the Churn Dashboard
After you have your collection of worksheets, it is time to combine them into an attractive and easy-to-understand dashboard. Create a new dashboard. Drag your main Churn_Rate KPI to the top as a title. Then, arrange your bar charts in a logical flow. You could have a section for contract-related insights and another for service-related insights. Add filters to your dashboard, such as a filter for Contract_Type or Gender, to allow users to interactively explore the data themselves. As you build, feel free to get creative, but prioritize clarity. Use clear titles for each chart, ensure your tooltips are clean, and use color intentionally. For example, you could color all your churn rate bars by the Churn_Rate measure, making the highest-churning categories stand out with a darker, more saturated color.
Step 4: From Data to Recommendations
The final step, and the one that truly demonstrates your value as an analyst, is to provide recommendations for reducing churn. Your dashboard is the evidence. Your analysis of the visualizations should lead you to several key insights. For example, your Churn_Rate by Contract_Type chart is your “smoking gun.” Your recommendation would be: “Our analysis shows that customers on month-to-month contracts churn at a rate three times higher than those on annual contracts. We recommend developing a targeted marketing campaign to provide a special offer for these high-risk customers to upgrade to a one-year contract.” Another recommendation could be: “Customers who do not subscribe to ‘Online Security’ or ‘Tech Support’ are more likely to churn. We should consider bundling a free 3-month trial of these services to all new customers to increase product stickiness.”
Skills Gained and Next Steps
By completing this project, you will have successfully demonstrated a range of fundamental skills. You will have learned how to take a business problem, translate it into analytical questions, and prepare data for analysis. You will have mastered the creation of one of the most important types of calculated fields: rates and ratios. You will have practiced visualizing different dimensions and measures, using bar charts and KPI cards to tell a story. Finally, you will have combined these visualizations into a professional, interactive dashboard and, most importantly, used it to provide data-driven recommendations. This end-to-end project is a perfect addition to a beginner’s portfolio and serves as a strong foundation for the more complex projects to come.
Building on Your Foundational Skills
In the first project, you built a complete dashboard to analyze customer churn. You mastered basic calculated fields and combined worksheets into a coherent story. Our second beginner project, “Analyzing Job Market Data with Tableau,” will build directly on those skills. This case study will challenge you to work with a real-world job posting dataset, which is often messier than clean, fictional company data. This project will require you to perform more complex data preparation and exploratory analysis. It will also introduce you to more advanced calculated fields, specifically those involving string manipulation and text parsing. This is a critical skill, as much of the world’s data is unstructured text.
Beginner Project 2: Analyzing Job Market Data
For this case study, you will be an analyst for a fictional recruitment company called “DataSearch.” DataSearch wants to use data to provide better advice to its two main client groups: the companies hiring for data roles and the candidates looking for those jobs. Your mission is to explore a real-world job posting dataset to uncover insights. You will need to determine the skills most in demand for data scientists, data analysts, and data engineers. Your final dashboard will help DataSearch answer questions like “What skills should our candidates be learning?” and “What are the realistic salary ranges our clients should be offering?” This project is perfect for your portfolio as it shows you can work with messy, public data and extract valuable, real-world insights.
The Business Context: A Fictional Recruiter “DataSearch”
DataSearch operates in a highly competitive market. Their reputation depends on their expertise. When a company comes to them to find a “Data Scientist,” DataSearch needs to be able to advise them on what a realistic salary is and what skills the top candidates possess. When a job seeker comes to them, they need to provide guidance on which skills to highlight on their resume to be most competitive. Currently, their recruiters are operating on “gut feel” and personal experience. Your dashboard will replace this anecdotal evidence with hard data. This will allow DataSearch to become a true thought leader in the data recruitment space, giving them a competitive advantage and building trust with their clients.
Exploring the Real-World Job Posting Dataset
The dataset for this project would be a scrape from a job board. Imagine a large spreadsheet with columns like JobID, Job_Title, Company_Name, Location, Job_Description, and Salary_Range. Right away, you will notice challenges. The Job_Title column will be very messy, with entries like “Data Scientist,” “Sr. Data Scientist,” “Data Scientist II,” and “Lead DS.” The Location field might be similarly messy. The Job_Description will be a large block of text. The Salary_Range will be a string, like “$120,000 – $140,000” or “80k-95k,” not a number. This messy data is the core challenge and learning opportunity of this project. Your first task is not to visualize, but to clean.
Step 1: Data Cleaning and Preparation
In Tableau, you will use the Data Source tab and calculated fields to clean this data. The Job_Title field is your first priority. You need to normalize it. You will create a calculated field called Job_Role_Normalized using conditional logic. This might look like: IF CONTAINS(UPPER([Job_Title]), "SCIENTIST") THEN "Data Scientist" ELSEIF CONTAINS(UPPER([Job_Title]), "ANALYST") THEN "Data Analyst" ELSEIF CONTAINS(UPPER([Job_Title]), "ENGINEER") THEN "Data Engineer" ELSE "Other" END. This calculation groups the messy titles into clean, analyzable categories. You will need to do something similar for Location, perhaps by extracting a state or major city. This step is 90% of the work and is a crucial skill.
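The Location cleanup can follow the same pattern. As a hedged sketch, assuming the raw values look like "Austin, TX", a State field could simply take the text after the comma:

// State - assumes Location strings follow a "City, ST" pattern
IF CONTAINS([Location], ',') THEN TRIM(SPLIT([Location], ',', 2)) ELSE [Location] END

Values that do not match the pattern fall through unchanged, so you can spot and handle them separately.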
Step 2: Creating Calculated Fields for Deeper Analysis
The real magic comes from parsing the Job_Description and Salary_Range fields. To find the most in-demand skills, you must search the text. You will create a series of boolean or numeric calculated fields. For example, you can create a field called Skill_Python with the formula: IF CONTAINS(UPPER([Job_Description]), "PYTHON") THEN 1 ELSE 0 END. You will repeat this for "R", "SQL", "TABLEAU", "POWER BI", "SPARK", "AWS", and "EXCEL". This technique, while simple, is incredibly powerful. It turns unstructured text into structured, analyzable data. For the Salary_Range, you will need to use string functions like SPLIT, TRIM, and FLOAT to extract the low and high ends of the range into new numeric fields called Min_Salary and Max_Salary. This is a significant, intermediate-level challenge.
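The exact parsing formulas depend on how consistent the salary strings are. As one hedged sketch, assuming a plain hyphen separates the low and high values and amounts are written like "$120,000 - $140,000", Min_Salary could be:

// Min_Salary - strip the symbols from the first token of the range
FLOAT(REPLACE(REPLACE(TRIM(SPLIT([Salary_Range], '-', 1)), '$', ''), ',', ''))

Max_Salary would use token 2 of the SPLIT, and ranges written as "80k-95k" would need an extra branch that removes the "k" and multiplies by 1,000.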
Step 3: Visualizing Key Insights
Now that your data is clean and enriched, you can create your worksheets. What are the most in-demand roles? Create a bar chart using your Job_Role_Normalized field and COUNTD([JobID]). Where are the jobs? Create a map visualization: double-click your cleaned Location (or State) field so Tableau generates latitude and longitude, then drag COUNTD([JobID]) to Size on the Marks card to create a symbol map. What are the most in-demand skills? Create a bar chart by placing Measure Names on the rows shelf and Measure Values (the SUMs of Skill_Python, Skill_SQL, and your other skill fields) on the columns shelf. This will show a ranked list of the most requested skills. What is the salary distribution? Create a histogram using your new Min_Salary field to see the salary landscape.
Step 4: Assembling the Job Market Dashboard
You can now assemble your visuals into an interactive dashboard. This project is a great candidate for a multi-tab dashboard. The first tab could be a “Market Overview,” showing the map of job locations and the bar chart of the most in-demand roles. The second tab could be a “Skill Deep Dive.” On this tab, you would place your bar chart of in-demand skills. You would then add a filter for your Job_Role_Normalized field. This creates a powerful, interactive tool: a user can select “Data Scientist” from the filter and watch the skills bar chart instantly update to show only the skills required for that specific role. A third tab could show the salary histograms and be filterable by role.
Step 5: Telling the Story to Stakeholders
Your dashboard is complete, and now you must present your findings to the DataSearch recruiters. Your story is no longer a guess; it is backed by data. You can confidently state, “For candidates seeking a Data Analyst role, our data shows that SQL and Tableau are the two most critical skills to have. We should advise them to feature these prominently.” You can also tell your clients, “To hire a Data Scientist, you need to be competitive. Our analysis of postings shows the median minimum salary for this role is $110,000. Clients offering $80,000 are unlikely to attract qualified candidates.” This ability to provide specific, actionable, data-backed advice is what separates a dashboard builder from a true data analyst.
Skills Gained and Portfolio Value
This project significantly enhances your portfolio. You have demonstrated that you can handle messy, real-world text data, which is a highly sought-after skill. You have mastered complex string manipulation and data normalization techniques. You have created calculated fields that turn unstructured text into quantitative metrics. You have also built a sophisticated, interactive, multi-tab dashboard that serves as a powerful analytical tool for its users. This project moves you beyond basic reporting and into the realm of true data analysis, showing that you can find the “signal in the noise” and deliver valuable insights from a complex dataset.
Consolidating Your Beginner Skills
In our first two projects, you analyzed customer churn and job market data. You learned to create calculated fields, clean messy text data, and build interactive dashboards. Our third and final beginner project, “HR Analytics in Tableau,” will serve as a capstone to consolidate these skills. This case study will introduce a new, crucial concept: working with a relational data model. Instead of analyzing a single flat file, you will explore the HR data of a fictional software company called “Atlas Labs” by connecting multiple tables. The primary objective is to summarize information on key employee metrics and delve deeper into areas such as employee performance and, once again, attrition. This project is the final step in mastering the fundamentals of Tableau.
Beginner Project 3: HR Analytics in Tableau
In this case study, you will be an analyst for Atlas Labs. The HR department wants to understand its workforce better. They have two main concerns: employee attrition (which is high) and employee performance. Your task is to build a dashboard that gives them a complete overview of their employees and helps them understand the factors contributing to attrition. This will involve identifying appropriate actions to improve employee retention. This project will require you to create relationships between multiple data tables, create calculated fields to understand the drivers of attrition, and build a dashboard that is ready to be distributed to the HR leadership team. This is a very common and high-impact use case for business intelligence in any company.
The Business Context: “Atlas Labs” Software Company
Atlas Labs is a fast-growing software company, but it is struggling to retain its talent. Employee turnover is expensive, as it costs a lot of time and money to recruit and train new employees. It also damages team morale and productivity. The HR department has a lot of data, but it is all in different spreadsheets and systems. They have no single, unified view. They need your help to connect these disparate data sources and build a “single source of truth.” They have specific questions: Are we losing our best performers? Which departments have the biggest problem? How long do employees typically stay before they leave? Your dashboard will be the tool that provides these answers and guides their new retention strategy.
Understanding the Data Model: Relationships
This is the key new skill for this project. Your data is not in one file. Instead, you have been given three separate files. The first is an Employee_Info table with columns like EmployeeID, Name, Age, Gender, Hire_Date, and Department. The second is an Employee_Performance table with EmployeeID, Last_Performance_Rating (e.g., 1-5), and Promotion_Last_5Years (Yes/No). The third is an Employee_Attrition table with EmployeeID, Attrition_Status (Active/Terminated), and Date_of_Leave. In Tableau, you will not join these manually. You will go to the Data Source tab, drag all three tables onto the canvas, and Tableau will allow you to create “relationships” between them, using the EmployeeID field as the common key. This modern “noodle” connection is flexible and powerful, letting Tableau query the tables as needed without creating one massive, slow table.
Step 1: Data Preparation and Relating Tables
With your data model in place, your first step is to create the foundational worksheets. Thanks to the relationships you defined, you can now pull fields from any of the tables, and Tableau will automatically handle the joins in the background. For example, you can drag Department (from Employee_Info) to rows and COUNTD(EmployeeID) (from any table) to columns to see the headcount by department. This is the power of a relational model. You can immediately start mixing and matching fields from all your sources to explore the data. This simple setup allows you to analyze how performance ratings (from one table) and department (from another table) both affect attrition (from a third table).
Step 2: Key Metrics and Calculated Fields
This project requires several key calculated fields. First, just like in the churn project, you will need an Attrition_Rate. This will be SUM(IF [Attrition_Status] = 'Terminated' THEN 1 ELSE 0 END) / COUNTD([EmployeeID]). Second, you will need to calculate Employee_Tenure. You can do this with a DATEDIFF function: DATEDIFF('month', [Hire_Date], TODAY()). This will give you the number of months each employee has worked at the company. You can then right-click this new Employee_Tenure field to create "bins" (e.g., 0-1 years, 1-2 years, 3-5 years) to group employees by their tenure. These calculated fields are the core metrics the HR department needs to track.
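Typed out, the two measures, plus a manual tenure grouping you could use instead of the right-click bins, might look like this. The Tenure_Band field name and its cut-offs are illustrative assumptions, not part of the original dataset:

// Attrition_Rate
SUM(IF [Attrition_Status] = 'Terminated' THEN 1 ELSE 0 END) / COUNTD([EmployeeID])

// Employee_Tenure - months between hire date and today
DATEDIFF('month', [Hire_Date], TODAY())

// Tenure_Band - a manual alternative to Create > Bins
IF [Employee_Tenure] < 12 THEN '0-1 years'
ELSEIF [Employee_Tenure] < 24 THEN '1-2 years'
ELSEIF [Employee_Tenure] < 60 THEN '2-5 years'
ELSE '5+ years'
END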
Step 3: Answering Core HR Questions with Visuals
Now you build your worksheets to answer the key business questions. What is the company’s overall headcount and attrition rate? Create two KPI cards with these numbers. Which departments have the highest attrition? Create a bar chart with Department on the rows shelf and your Attrition_Rate measure on the columns shelf. This will instantly show which parts of the business are “leaking” talent. Is there a link between performance and attrition? This is the “money” chart. Create a bar chart with Last_Performance_Rating on the rows shelf and Attrition_Rate on the columns. The results might surprise you. Often, both the lowest-performing and the highest-performing employees have high attrition, the latter being poached by competitors. Finally, create a line chart showing Attrition_Rate by Employee_Tenure bins to see when people leave.
Step 4: Designing the HR Attrition Dashboard
As you progress through this beginner project, you will eventually create a comprehensive dashboard. This dashboard will be the central hub for the HR team. At the top, you should place your main KPIs: Total Headcount, Overall Attrition Rate, and Average Employee Tenure. Below these, you can arrange your key charts. The Attrition_Rate by Department bar chart is essential. The Attrition_Rate by Performance_Rating chart is your key insight. You could also include a demographic breakdown of the workforce (e.g., a pie chart for gender or a bar chart for age bins). On the right-hand side, add interactive filters for Department, Gender, and Promotion_Last_5Years, allowing the HR team to slice the data and explore their own questions.
Step 5: Providing Actionable Retention Strategies
Your dashboard is now a powerful tool for change. Based on your visualizations, you can provide specific, data-driven recommendations to the Atlas Labs HR team. For example: “Our analysis reveals two critical problems. First, the ‘Sales’ department has an attrition rate of 35%, which is double the company average. We recommend investigating the management and compensation structure in this department. Second, our ‘Outstanding’ performers (Rating 5) are leaving at a 30% rate. This indicates our top talent is at high risk. We must immediately review the compensation and career progression paths for these key employees before we lose more of them.” This kind of insight is invaluable and demonstrates your ability to think like a strategic partner.
From Beginner to Intermediate: What This Project Unlocks
By completing this capstone beginner project, you have solidified your fundamental Tableau knowledge. You have mastered calculated fields, including rates and date functions. You have built clean, interactive, and professional dashboards. Most importantly, you have learned how to work with a relational data model by creating relationships between multiple tables, a skill that is absolutely essential in any real-world BI role. You have now completed the beginner’s journey. Your portfolio contains three distinct, end-to-end projects that showcase your ability to solve business problems in a variety of domains. You are now ready to move on to the intermediate level, where you will tackle more complex calculations and industry-specific analyses.
Entering the Intermediate Level
You have successfully completed the beginner projects and built a solid portfolio. You are comfortable with calculated fields, dashboard design, and even data modeling with relationships. Now, you are ready to transition to intermediate-level projects. What defines this transition? Intermediate projects move beyond just reporting what happened and focus deeply on analyzing why it happened and prescribing what to do next. They involve more complex, domain-specific calculations and require you to have a deeper understanding of the business logic. These case studies will help you manage an end-to-end process that not only deepens your Tableau skills but also your industry-specific knowledge. Our first intermediate project is a deep dive into inventory analysis.
Intermediate Project 1: Inventory Analysis
In this case study, you will put your skills to the test with a real-world challenge based on a fictional retailer called “WarmeHands Inc.” This company sells seasonal goods, and its success is entirely dependent on smart inventory management. If they stock too much, they will have a warehouse full of unsold goods and tied-up cash at the end of the season. If they stock too little, they will sell out of popular items, lose sales, and disappoint customers. The primary goal of this case study is to identify potential improvements to their inventory management and purchasing strategies. Your analysis will focus on two key inventory concepts: inventory turnover and ABC analysis. Your dashboard will help “WarmeHands Inc.” prioritize the right products.
The Business Context: “WarmeHands Inc.” Retailer
The core problem for any retailer is managing cash flow. Inventory is cash that is sitting on a shelf. The faster you can sell that inventory (turnover), the healthier your business. “WarmeHands Inc.” is struggling with this. Their purchasing manager is ordering products based on gut feel, and the finance department is worried that too much cash is tied up in slow-moving products. They have no clear, data-driven way to see which products are their “winners” (high sales, good turnover) and which are their “losers” (low sales, sitting in the warehouse for months). Your dashboard will be the first tool that gives them this visibility, allowing them to make purchasing decisions that are both profitable and capital-efficient.
The Data Challenge: From Transactions to Inventory
The dataset for this project will be more complex than the beginner projects. You will likely have multiple files: a Sales table with transactional data (e.g., Date, ProductID, Units_Sold), an Inventory_Snapshot table (e.g., Date, ProductID, Units_in_Stock), and a Products table (e.g., ProductID, Product_Name, Category, Cost_per_Unit, Price_per_Unit). The challenge here is that your Sales data is a continuous flow of transactions, while your Inventory data is a periodic snapshot (e.g., taken on the first of every month). You cannot simply join them. You will need to aggregate both tables to a common level of grain, such as Month, before you can analyze them together. This data modeling and aggregation step is a key intermediate challenge.
Step 1: Data Modeling and Preparation
In Tableau, you will connect to all three tables. You will need to aggregate your Sales data to get Total_Units_Sold per ProductID per Month. You will also need to aggregate your Inventory_Snapshot data to get Average_Units_in_Stock per ProductID per Month. Once you have these two aggregated datasets, you can join them with your Products table. This preparation phase is where you will spend a significant amount of your time. You will need to be careful with your levels of detail to ensure your numbers are correct. This step moves you from simple “drag and drop” to a more deliberate data engineering mindset.
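One way to express these aggregations without leaving Tableau is with a helper date field and Level of Detail expressions. The sketch below assumes the field names described earlier; in practice the Date fields from the Sales and Inventory_Snapshot tables will carry distinct names once both are connected:

// Order_Month - truncate each date to the first of its month
DATETRUNC('month', [Date])

// Monthly_Units_Sold - total units sold per product per month
{ FIXED [ProductID], [Order_Month] : SUM([Units_Sold]) }

// Average_Units_in_Stock - average snapshot level per product per month
{ FIXED [ProductID], [Order_Month] : AVG([Units_in_Stock]) }

Depending on your data sources, you may prefer to do this aggregation upstream (for example in Tableau Prep or a database view) and connect Tableau to the pre-aggregated tables instead.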
Step 2: The Core of the Analysis: Inventory Turnover
Now you will create the core domain-specific calculations. The most important metric is the Inventory_Turnover_Ratio. This metric is not in your raw data; you must build it. First, you will calculate the Cost_of_Goods_Sold (COGS) for a period. This is a calculated field: [Total_Units_Sold] * [Cost_per_Unit]. Second, you will calculate the Average_Inventory_Value for that same period: [Average_Units_in_Stock] * [Cost_per_Unit]. Finally, you will create the Inventory_Turnover_Ratio with the formula: SUM([COGS]) / SUM([Average_Inventory_Value]). This single, non-trivial, multi-step calculation is a perfect example of an intermediate-level skill. It shows you can translate a complex business formula into a functional Tableau calculation.
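Written as calculated fields, the chain might look like this (a sketch that assumes the monthly aggregates from Step 1):

// COGS - cost of the goods sold in the period
[Total_Units_Sold] * [Cost_per_Unit]

// Average_Inventory_Value - cost of the average stock held in the period
[Average_Units_in_Stock] * [Cost_per_Unit]

// Inventory_Turnover_Ratio
SUM([COGS]) / SUM([Average_Inventory_Value])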
Step 3: Advanced Analysis: ABC Classification
The second part of your analysis is an ABC classification. This is an inventory prioritization technique based on the Pareto principle (the 80/20 rule). "A" products are the small number of items that generate the vast majority of revenue (e.g., the top 80%). "B" products are the next group (e.g., the next 15%), and "C" products are the long tail of items that generate very little revenue (e.g., the bottom 5%). To perform this analysis, you must use Table Calculations. First, you will calculate Total_Revenue for each product ([Total_Units_Sold] * [Price_per_Unit]). Then, you will create a Percent_of_Total_Revenue field. Next, you will build a running share of total revenue, Running_Pct_of_Revenue, using the RUNNING_SUM() table calculation computed across products sorted from highest to lowest revenue. Finally, you will create an ABC_Class field from this running share: IF [Running_Pct_of_Revenue] <= 0.8 THEN 'A' ELSEIF [Running_Pct_of_Revenue] <= 0.95 THEN 'B' ELSE 'C' END.
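A hedged sketch of that table-calculation chain, with both calculations computed across products sorted by revenue in descending order:

// Total_Revenue per product
[Total_Units_Sold] * [Price_per_Unit]

// Percent_of_Total_Revenue - each product's share of overall revenue (a table calculation)
SUM([Total_Revenue]) / TOTAL(SUM([Total_Revenue]))

// Running_Pct_of_Revenue - cumulative share across the sorted products
RUNNING_SUM(SUM([Total_Revenue])) / TOTAL(SUM([Total_Revenue]))

// ABC_Class
IF [Running_Pct_of_Revenue] <= 0.8 THEN 'A'
ELSEIF [Running_Pct_of_Revenue] <= 0.95 THEN 'B'
ELSE 'C'
END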
Step 4: Building the Inventory Management Dashboard
Your dashboard should be a multi-tab analytical tool. The first tab, "Inventory Turnover," will show your overall Inventory_Turnover_Ratio as a KPI. It should also include a bar chart showing the turnover ratio by Product_Category, allowing managers to see which categories are moving fast and which are slow. The second tab, "ABC Analysis," will be your showpiece. You should create a Pareto chart by building a dual-axis chart: a bar chart of Total_Revenue by Product_Name (sorted descending), and a line chart on the second axis showing the running percent of total revenue (Running_Pct_of_Revenue). This visual perfectly illustrates the 80/20 rule. You will also include a detailed table that lists every product along with its ABC_Class, Total_Revenue, and Inventory_Turnover_Ratio.
Step 5: Strategic Recommendations for Purchasing
This dashboard provides immediate, actionable insights for the “WarmeHands Inc.” purchasing team. Your recommendations will be sharp and data-driven. “Our analysis shows that our ‘Class A’ products, while only 20% of our product count, drive 80% of our revenue. We must ensure these products are never out of stock, as a stockout here is devastating to our bottom line. Conversely, our ‘Class C’ products make up 50% of our product lines but contribute only 5% of revenue. Many of these products have an inventory turnover ratio below 1.0, meaning they sit on the shelf for more than a year. We recommend a strategy of delisting the worst offenders in Class C and re-investing that cash into ensuring our Class A products are always available.”
Skills Gained: Moving Beyond Reporting to Analysis
This project is a massive step up for your portfolio. You have proven that you can handle a complex data model with different levels of granularity and time. You have demonstrated your ability to translate a complex, domain-specific business formula (Inventory_Turnover_Ratio) into a tangible, multi-step calculated field. You have also mastered one of the most powerful features in Tableau: Table Calculations (RUNNING_SUM()) to perform a sophisticated ABC analysis. This project shows you are no longer just a “dashboard creator”; you are a true analyst who can provide deep, strategic insights that directly impact a company’s financial health.
Deepening Your Analytical Expertise
After mastering the inventory analysis project, you have demonstrated your ability to handle complex, domain-specific calculations and table calculations. You are now comfortable in the intermediate tier. Our next project, “Supply Chain Analysis in Tableau,” builds on this by introducing even greater data complexity. Instead of just looking at static inventory, you will now analyze the entire flow of goods, from the moment an order is placed to the moment it arrives at the customer’s door. This case study will challenge you to delve into several rich datasets, tracking items across multiple stages and time. Your final dashboard will be a tool for stakeholders to gain a deep understanding of their supply chain’s efficiency and to identify costly bottlenecks.
Intermediate Project 2: Supply Chain Analysis
In this case study, you will act as a supply chain analyst. Your company is struggling with customer complaints about late deliveries and is unsure where the delays are coming from. Is the warehouse too slow to pack orders? Are the shipping carriers unreliable? Are products frequently out of stock, leading to replenishment delays? The data is available, but it is siloed in different systems. Your task is to connect these datasets to develop a final dashboard that provides a single, unified view of the supply chain. You will analyze order and shipment information, inventory files, and supplier lead times to provide deep insights into stock replenishment and fulfillment efficiency.
The Business Context: From Order to Delivery
A modern supply chain is a complex web of processes. An order is placed by a customer. The order is processed by the system. The warehouse must pick and pack the items. A shipping carrier is scheduled. The package is shipped. Finally, it is delivered. A delay at any one of these steps can result in a late delivery and an unhappy customer. The business has very little visibility into this process. They do not know their “On-Time Delivery Rate” or their average “Time to Ship.” Without this data, they cannot hold their suppliers, carriers, or internal teams accountable. Your dashboard will be the first tool that shines a light on this entire process, identifying exactly where the bottlenecks are and allowing managers to make targeted improvements.
Understanding the Multi-Table Data Model
This project features the most complex data model so far. You will be working with several tables that are all related. First, an Orders table with OrderID, CustomerID, and Order_Date. Second, a Shipments table with ShipmentID, OrderID, Shipping_Carrier, Warehouse_Location, Ship_Date, Promised_Delivery_Date, and Actual_Delivery_Date. Third, an Inventory_Logs table with LogID, ProductID, Stock_on_Hand, and Date. Fourth, a Replenishment_Times table with ProductID, SupplierID, and Lead_Time_Days. You will need to connect all of these tables in Tableau's Data Source tab. Orders will connect to Shipments on OrderID. The other tables will likely connect based on ProductID. This relational model is the foundation of your analysis, allowing you to track an order's journey and check inventory status.
Step 1: Data Modeling and Calculated Fields
Your first task is to leverage this data model to create the key metrics. This will involve several DATEDIFF calculations to analyze the time taken at each stage. You will create a Time_to_Ship field: DATEDIFF('day', [Order_Date], [Ship_Date]). This measures your internal warehouse efficiency. You will create a Delivery_Variance field: DATEDIFF('day', [Promised_Delivery_Date], [Actual_Delivery_Date]). A positive number here is a late delivery, while a negative number is an early delivery. Most importantly, you will create a boolean On_Time_Delivery field: [Actual_Delivery_Date] <= [Promised_Delivery_Date]. From this, you can create your main KPI, the OTD_Rate: SUM(IF [On_Time_Delivery] THEN 1 ELSE 0 END) / COUNTD([OrderID]).
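One useful addition for the dashboard is a status bucket built on these fields. This is a hedged sketch; the Delivery_Status name and the "In transit" branch for shipments that have not yet arrived are hypothetical extras, not fields from the original dataset:

// Delivery_Status - a categorical bucket for charts and tooltips
IF ISNULL([Actual_Delivery_Date]) THEN 'In transit'
ELSEIF [On_Time_Delivery] THEN 'On time'
ELSE 'Late'
END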
Step 2: Analyzing On-Time Delivery (OTD)
Your first set of visualizations will focus on the On-Time Delivery (OTD) rate. This is the company’s North Star metric. Your first worksheet will be a large KPI card showing the overall OTD_Rate for the last quarter. Next, you need to break this down. Create a bar chart showing OTD_Rate by Shipping_Carrier. This will immediately show if certain carriers are less reliable than others. Create another bar chart showing OTD_Rate by Warehouse_Location to see if certain warehouses are underperforming. You should also create a line chart showing the OTD_Rate over time (by month) to see if the problem is getting better or worse. Finally, create a histogram of your Delivery_Variance field to see the distribution of how late or early your packages are.
Step 3: Analyzing Stock and Replenishment
The second part of your analysis is to understand why delays might be happening. A common reason for a long Time_to_Ship is that the product was out of stock. You will need to use your Inventory_Logs and Replenishment_Times tables. A key challenge here is finding the "current" stock. You may need to use a Level of Detail (LOD) expression to do this, such as {FIXED [ProductID] : MAX([Stock_on_Hand])} (assuming the log is cumulative or you filter for the latest date). You can then create a scatter plot with Lead_Time_Days on the columns (x-axis) and Current_Stock on the rows (y-axis). This visual will instantly identify high-risk products: those in the bottom-right quadrant have a long lead time and low current stock, meaning they are at high risk of a stockout.
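A hedged sketch of the LOD expression and a simple risk flag built on top of it. The Stockout_Risk field is a hypothetical addition, and the 10-unit and 30-day cut-offs are placeholders to illustrate the idea, not values taken from the dataset:

// Current_Stock - latest (here, maximum) stock on hand per product
{ FIXED [ProductID] : MAX([Stock_on_Hand]) }

// Stockout_Risk - a hypothetical flag for the scatter plot
IF [Current_Stock] < 10 AND [Lead_Time_Days] > 30 THEN 'High risk' ELSE 'OK' END

Dropping Stockout_Risk on Color makes the dangerous quadrant of the scatter plot stand out immediately.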
Step 4: Building the End-to-End Supply Chain Dashboard
You will now assemble your worksheets into a final, interactive dashboard. This could be a multi-tab dashboard. The first tab, “Order Fulfillment,” would contain your OTD analysis. It would have the main OTD_Rate KPI at the top. Below, it would feature the bar chart of OTD_Rate by Carrier and the histogram of Delivery_Variance. You would also include a detailed table of recent late orders. The second tab, “Inventory & Replenishment,” would feature your risk-analysis scatter plot. It would also have a bar chart of Average_Time_to_Ship by Product_Category to see if certain products are harder to pack. By adding filters for ProductID or SupplierID, you create a powerful tool for managers to investigate problems.
Step 5: Identifying and Explaining Bottlenecks
Your completed dashboard is a powerful diagnostic tool. You are now prepared to present your findings to management. You can move beyond “customers are unhappy” and provide specific, data-driven explanations. “Our analysis shows that our overall On-Time Delivery rate is 82%. While our internal ‘Time to Ship’ is consistent at 2 days, our Delivery_Variance analysis shows that Carrier X has an OTD rate of only 65%, making them the primary source of our late deliveries. We recommend renegotiating our contract or shifting volume to Carrier Y, who has a 94% OTD rate.” You can also add, “Furthermore, our inventory analysis identifies 15 high-risk products that have less than 10 days of stock on hand but require 30 days of lead time. We must place emergency replenishment orders for these items.”
Skills Gained: Mastering Time-Based and Relational Analysis
This project solidifies your intermediate status. You have successfully modeled and analyzed a complex, multi-table, time-series process. You have mastered DATEDIFF functions to analyze process efficiency and bottlenecks. You have also used Level of Detail (LOD) expressions to solve complex data aggregation problems, such as finding the “current” stock level. This project demonstrates that you can analyze an entire business process from end-to-end, identify the specific points of failure, and make strategic recommendations. This type of analysis is incredibly valuable and makes your portfolio stand out, showing you can solve complex, operations-focused problems.
Your Intermediate Capstone Project
You have now completed two challenging intermediate projects, mastering domain-specific formulas, table calculations, and complex time-series analysis. This final intermediate project, “E-commerce Analytics,” will be your capstone. It will require you to combine all your skills to solve one of the most classic and sophisticated problems in retail analytics: the market basket analysis. This case study will explore the data of a fictional online pet retailer called “Munchy’s.” Your objective will be to investigate how Munchy’s can increase sales and reduce expenses by analyzing what products are frequently purchased together. This will allow you to make data-driven upselling and cross-selling recommendations.
Intermediate Project 3: E-commerce Analytics
In this case study, you will explore the sales data of “Munchy’s.” The primary business objective is to increase the Average Order Value (AOV). The leadership team believes that they are missing opportunities for upselling (encouraging customers to buy a more expensive version of a product) and cross-selling (encouraging customers to add related items to their cart). Your task is to analyze the dataset to see which products are frequently purchased together and in what quantities. Your final deliverable will be a dashboard, or even a Tableau “Story,” that walks stakeholders through your consumer basket analysis and provides clear, actionable recommendations for product bundles and marketing promotions.
The Business Context: “Munchy’s” Pet Retailer
“Munchy’s” has a large and diverse product catalog, from premium dog food to cat toys and fish tanks. Their marketing is generic, offering simple “10% off” discounts. They have a hypothesis that they could be much more effective. For example, do people who buy a fish tank also buy a filter, gravel, and fish food in the same transaction? If so, the website should automatically recommend these items as a “New Tank Bundle.” Do customers who buy ‘Premium Dog Food’ also frequently buy ‘Dental Treats’? If so, they should run a “Buy one, get one 50% off” promotion for those two items. They have no data to support these ideas. Your analysis will provide the data-driven foundation for their new, targeted marketing and sales strategy.
The Analytical Goal: Market Basket Analysis
The core of this project is a market basket analysis. The central question is: “When a customer puts Product X in their basket, what else are they most likely to buy in that same transaction?” To answer this, you cannot just look at which products are popular. You must look at which products are “co-purchased.” This requires a specific and advanced data modeling technique. You will need to take your list of orders and find all the pairs of products that appeared on the same order. This analysis is the key to unlocking the cross-selling recommendations that the “Munchy’s” team is looking for. This is a common and highly-valued analysis in the e-commerce and retail worlds.
Step 1: The Data Modeling Challenge
This project’s primary technical challenge is the data model. You will likely have one large Orders table with columns like OrderID, ProductID, ProductName, Category, and Quantity. To find products bought together, you must join this table to itself. In the Tableau Data Source tab, you will drag the Orders table onto the canvas, and then drag the same Orders table onto the canvas again. You will create a join. The join condition will be OrderID = OrderID. This creates a table that shows every possible combination of products on the same order. To avoid duplicates (e.g., A-B and B-A) and self-pairs (A-A), you will add a second join condition: ProductID (from table 1) < ProductID (from table 2). This advanced self-join is a complex and powerful technique.
Step 2: Calculated Fields and Parameters
With your self-joined data model in place, your analysis is now possible. To make your dashboard interactive, you will create a “Parameter.” Right-click in the data pane and create a parameter called [Select a Product]. Populate this parameter with the list of ProductName from your Orders table. This parameter will be a dropdown menu that allows a user to select a single product. Then, you will create a calculated field, perhaps called [Selected_Product_in_Basket], with a formula like [ProductName (table 1)] = [Select a Product]. You will use this field to filter your analysis, showing only the co-purchases for the product the user has selected.
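A hedged sketch of that filter field, using the self-joined field names described above:

// Selected_Product_in_Basket - keep only True on the Filters shelf
[ProductName (table 1)] = [Select a Product]

With this filter in place, any worksheet showing ProductName (table 2) is automatically restricted to products that appeared on the same orders as the product chosen in the parameter.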
Step 3: Advanced Visualizations
This project allows you to use a variety of advanced visualization types. You can create a Dual-Axis Graph to show sales over time, with a line representing SUM(Sales) and bars on a second axis representing SUM(Quantity). You can create a Highlight Table (a text table with color) to show a matrix of co-purchases. Put ProductName (table 1) on the rows, ProductName (table 2) on the columns, and COUNTD(OrderID) on the color. This will create a “heat map” showing all the most popular product pairings in your entire dataset. You can also create a Map to show where your sales are coming from. The main visual, however, will be a simple bar chart. It will show ProductName (table 2) on the rows and COUNTD(OrderID) on the columns. You will then filter this entire worksheet using your [Selected_Product_in_Basket] field.
Step 4: Building the “Munchy’s” Story
For this capstone project, you should use Tableau’s “Story” feature. A story is a sequence of dashboards that you can present as a narrative. Your first story point, “Sales Overview,” could contain your map and your dual-axis sales-over-time chart. This sets the context. Your second story point, “Product Pairings,” could show the large highlight table, giving a high-level view of all co-purchases. Your third and final story point, “The Market Basket,” will be your most interactive. It will contain your [Select a Product] parameter and the bar chart that it filters. A user can select “Premium Cat Food” from the dropdown and watch the bar chart instantly update to show the top 5 other products bought with it, such as “Cat Litter” and “Tuna Treats.”
Understanding the Power of Strategic Product Recommendations
Upselling and cross-selling represent two of the most powerful yet underutilized strategies for revenue growth in modern retail and e-commerce environments. These techniques leverage existing customer relationships and purchasing intent to increase transaction values and customer lifetime value without the high costs associated with acquiring new customers. Upselling involves encouraging customers to purchase higher-value alternatives or upgraded versions of products they are already considering, while cross-selling suggests complementary products that enhance or complete their purchases. When executed strategically based on data insights rather than generic suggestions, these approaches can dramatically improve profitability while simultaneously enhancing customer satisfaction by helping shoppers discover products that genuinely meet their needs.
The fundamental principle underlying effective upselling and cross-selling is relevance. Generic recommendations that bear no logical relationship to what customers are purchasing create friction and annoyance rather than value. Customers ignore irrelevant suggestions or, worse, perceive them as manipulative attempts to extract additional spending. In contrast, thoughtful recommendations based on genuine purchase patterns and customer needs feel helpful rather than pushy. When a customer buying a camera is shown compatible lenses and memory cards, the suggestion makes intuitive sense and provides real value. When that same customer is randomly shown unrelated products, the recommendation wastes attention and damages trust. Data-driven approaches to product recommendations ensure suggestions are grounded in actual customer behavior patterns rather than assumptions or guesswork.
Market basket analysis provides the foundational technique for identifying which products customers naturally purchase together, revealing the hidden relationships within transaction data. This analytical approach examines thousands or millions of transactions to discover patterns about which product combinations appear together more frequently than would be expected by chance. These patterns reflect customer needs, use cases, and decision-making processes that may not be obvious to product managers or merchandisers working from intuition alone. A rigorous statistical approach to identifying co-purchase patterns ensures recommendations are based on evidence of actual customer behavior rather than assumptions about what should logically go together. The insights revealed through market basket analysis often surprise even experienced merchandisers who thought they understood their product relationships.
The business impact of implementing effective upselling and cross-selling strategies extends far beyond incremental revenue increases to affect multiple dimensions of organizational performance. Average transaction values increase as customers add additional items or upgrade to premium alternatives. Customer lifetime value improves as enhanced purchase experiences create satisfaction that drives repeat business. Inventory management becomes more efficient when bundled products move together predictably. Marketing campaigns become more targeted and effective when built around proven product relationships. Customer service improves when shoppers receive proactive suggestions that help them find everything they need. These compounding benefits mean that investment in developing sophisticated recommendation strategies typically delivers returns that far exceed the direct revenue lifts from additional product sales.
The Evolution from Intuition to Data-Driven Recommendations
Traditional merchandising relied heavily on buyer intuition and product knowledge to determine product placements, bundles, and promotional strategies. Experienced buyers developed deep understanding of products and customers through years of observation, enabling them to make educated guesses about what products should be featured together or how to structure promotions. This intuition-based approach produced some successes, particularly when buyers had extensive direct customer interaction that informed their understanding. However, intuitive approaches also suffered from significant limitations including confirmation bias where buyers saw evidence supporting their preconceptions while missing contradictory patterns, inability to identify subtle patterns across thousands of products, and difficulty adapting to changing customer preferences as markets evolved. The scale and complexity of modern retail operations exceed what human intuition can effectively manage without analytical support.
The digital revolution in retail created unprecedented opportunities to collect detailed transaction data capturing every purchase, product combination, customer segment, time period, and contextual factor. Point-of-sale systems, e-commerce platforms, and customer relationship management databases now record comprehensive information about customer behavior that previous generations of retailers could only dream about. This data explosion provides raw material for sophisticated analysis, but data alone does not automatically translate into better decisions. Many organizations accumulate vast quantities of transaction data without extracting meaningful insights because they lack the analytical capabilities, tools, or organizational processes to transform data into actionable intelligence. The potential value of retail data is realized only when appropriate analytical techniques are applied systematically to extract patterns and generate recommendations.
Market basket analysis emerged as the signature technique for mining transaction data to discover product relationships and inform merchandising strategies. This approach, grounded in association rule learning from the field of data mining, identifies sets of products that frequently appear together in transactions and quantifies the strength of these relationships through metrics like support, confidence, and lift. Support measures how frequently product combinations occur, confidence indicates the probability that customers buying one product will also purchase another, and lift reveals whether co-purchase rates exceed what would be expected if purchases were independent. These statistical measures enable merchandisers to distinguish meaningful patterns from random coincidences and to prioritize recommendations based on the strength of underlying relationships. Market basket analysis transforms vague intuitions about product relationships into precise, quantified insights that can guide specific business decisions.
The integration of advanced analytics into merchandising processes represents a fundamental shift in how retail organizations operate and compete. Rather than relying primarily on buyer intuition supplemented by basic sales reports, data-driven organizations systematically analyze customer behavior to inform decisions about product assortments, store layouts, pricing strategies, promotional designs, and recommendation algorithms. This analytical approach does not eliminate the role of human expertise but rather augments it with insights that would be impossible to identify through observation alone. The most effective organizations combine analytical insights with merchandising experience, using data to identify opportunities and patterns while relying on product knowledge and customer understanding to translate those insights into compelling offerings. The competitive advantage increasingly flows to organizations that can most effectively leverage their data assets to understand and serve customers.
Building the Foundation Through Effective Data Collection
Transaction data quality determines the reliability and value of any analytical insights derived from customer purchase patterns. Incomplete records, incorrect product identifications, duplicate transactions, and missing information all corrupt analyses and lead to flawed recommendations. Ensuring data quality requires attention to multiple dimensions including accuracy of captured information, completeness of transaction records, consistency of product identifiers across systems, and timeliness of data availability for analysis. Organizations must implement data governance practices that establish standards for data collection, validation processes that identify and correct errors, and data integration procedures that combine information from multiple sources into coherent datasets. The investment in data quality infrastructure may feel like overhead that does not directly generate revenue, but poor data quality guarantees that analytical efforts will produce unreliable insights that damage rather than improve business performance.
Product hierarchies and categorization schemes organize items into logical groupings that enable analysis at appropriate levels of aggregation. While transaction data captures individual product codes, analysis often needs to consider broader categories like product types, brands, price tiers, or departments. Well-designed product hierarchies allow flexible analysis that can zoom in on specific products or zoom out to see patterns across categories. These hierarchies must balance granularity that preserves important distinctions with aggregation that reveals meaningful patterns obscured by excessive detail. Creating and maintaining effective product taxonomies requires ongoing effort as new products are introduced, old products are discontinued, and category definitions evolve. However, this foundational work enables far more sophisticated analysis than would be possible with flat product lists lacking hierarchical structure.
Customer identification and tracking across transactions enables analysis of individual purchase patterns and customer segments beyond what single transaction analysis reveals. When customers can be identified consistently across purchases through loyalty programs, account logins, or payment methods, analysts can examine how individual customers evolve over time, identify high-value customer segments with distinct behaviors, and personalize recommendations based on individual purchase histories. However, customer tracking raises privacy considerations that must be managed carefully through transparent policies, secure data handling, and compliance with regulations. Organizations must balance the analytical value of customer-level data against privacy responsibilities and customer trust considerations. Some valuable analyses can be conducted using anonymous transaction data without customer identification, while others require linking purchases to individuals or at minimum to persistent pseudonymous identifiers.
Temporal dimensions of transaction data capture when purchases occur, revealing patterns related to seasonality, day of week, time of day, and promotional timing. Product relationships may vary systematically across time periods, with some combinations appearing together primarily during holidays, weekends, or other specific contexts. Temporal analysis can identify when to emphasize particular product bundles, how purchase patterns change across seasons, and whether promotional timing affects co-purchase rates. Recording precise transaction timestamps enables these temporal analyses, though many insights can be derived from coarser temporal granularity like day or week. Understanding the temporal dynamics of purchase behavior allows organizations to time their recommendations and promotions for maximum effectiveness rather than assuming static relationships that remain constant across all contexts.
The Market Basket Analysis Framework
Association rule mining provides the mathematical foundation for discovering product relationships within transaction data through systematic search for itemsets that appear together frequently. The analysis begins by identifying individual products that meet minimum support thresholds, then progressively builds larger itemsets by combining products that co-occur sufficiently often. This bottom-up approach efficiently explores the space of possible product combinations without requiring exhaustive enumeration of all possibilities. The resulting frequent itemsets reveal which product combinations customers actually purchase together, providing the raw material for generating association rules that express these relationships in actionable form. While the algorithmic details can become quite technical, the conceptual framework is straightforward: look for products that appear together more often than would be expected by chance.
Support metrics quantify how frequently particular product combinations occur in the dataset, expressed either as absolute transaction counts or as percentages of total transactions. High support indicates that a product combination appears commonly enough to be practically meaningful rather than being a rare coincidence. However, high support alone does not necessarily indicate a strong relationship, as very popular products will appear together frequently simply due to their individual popularity even if their purchases are independent. Support thresholds filter out rare combinations that occur too infrequently to justify business action, focusing attention on patterns affecting meaningful numbers of customers. Setting appropriate support thresholds requires balancing the desire to capture as many patterns as possible against practical limits on how many recommendations can be implemented and the need for statistical reliability based on adequate sample sizes.
Confidence measures express the conditional probability that customers who purchase one product will also purchase another, quantifying the predictive strength of association rules. A rule stating that customers who buy product A also buy product B with eighty percent confidence means that eighty percent of transactions containing A also contain B. High confidence rules provide strong bases for recommendations because they indicate that suggested products are likely to be relevant to customers who have already selected trigger products. However, confidence can be misleading when the consequent product is itself very popular, as high baseline purchase rates inflate confidence even for weak relationships. Confidence works best when evaluated alongside other metrics that account for baseline popularity and distinguish truly meaningful relationships from those driven primarily by individual product frequencies.
Lift metrics reveal whether product combinations occur more or less frequently than would be expected if purchases were independent, providing the most reliable indicator of relationship strength. Lift values greater than one indicate that products are purchased together more often than chance would predict, suggesting genuine associations worth leveraging. Lift values near one suggest independent purchases with no special relationship, while lift below one indicates products that are actually purchased together less often than expected, possibly due to substitution effects or incompatible use cases. Lift corrects for the baseline popularity issues that can distort confidence measures, enabling reliable identification of the strongest product relationships. Prioritizing high-lift associations ensures recommendations are based on genuine customer preferences rather than artifacts of individual product popularity.
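In compact form, writing T for the set of all transactions and T(X) for the transactions that contain product X, the three metrics described above are (a standard formulation, independent of any particular tool):

Support(A, B) = |T(A and B)| / |T|
Confidence(A -> B) = Support(A, B) / Support(A), where Support(A) = |T(A)| / |T|
Lift(A -> B) = Confidence(A -> B) / Support(B)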
Conclusion
After completing these six projects, you have built a solid, diverse, and impressive portfolio that will help you showcase your skills. You have progressed from a beginner, who can analyze a single file, to a high-functioning intermediate analyst. You have covered topics ranging from data modeling and transformation to advanced calculated fields, table calculations, LOD expressions, and complex self-joins. You have built dashboards for multiple industries (Telecom, HR, Retail) and solved real-world problems (Churn, Hiring, Inventory, Supply Chain, E-commerce). You are no longer just someone who has “learned the essentials”; you are an analyst who has a proven track record of building guided, end-to-end analytical solutions.