Advanced Analysis Types
Explore correlation, cohort, funnel, and scenario analysis
Introduction to Advanced Analysis
You've learned descriptive analysis (what happened?), comparative analysis (how do things differ?), and trend analysis (what's changing over time?). Now it's time to explore advanced analysis techniques that answer more specialized questions.
These techniques help you understand:
- What drives the total? (Contribution Analysis)
- Which groups behave differently? (Segmentation Analysis)
- Do two things move together? (Correlation Analysis)
- How do groups change over time? (Cohort Analysis)
- Where are we losing people? (Funnel Analysis)
- What do customers buy together? (Basket Analysis)
- What if we change something? (What-If Analysis)
Why learn advanced analysis?
These techniques are used daily in business, marketing, product development, and operations. They help you:
- Make data-driven decisions with confidence
- Understand customer behavior patterns
- Optimize processes and improve conversion
- Test different scenarios before committing resources
- Discover hidden relationships in your data
Let's explore each technique with real-world examples and practical applications.
1. Contribution Analysis
What it is: Contribution analysis breaks down a total into its parts and shows what percentage each part contributes to the whole. It answers the question: "What drives the total?"
When to use it:
- Understanding which products/categories generate the most revenue
- Analyzing budget allocation by department
- Determining where website traffic comes from
- Identifying which customer segments are most valuable
Example 1: Revenue by Product Category
Question: "Which product categories contribute most to our total revenue?"
| Category | Revenue | % of Total |
|---|---|---|
| Electronics | $125,000 | 50% |
| Clothing | $62,500 | 25% |
| Home Goods | $37,500 | 15% |
| Books | $17,500 | 7% |
| Sports | $7,500 | 3% |
| Total | $250,000 | 100% |
Insight: "Electronics alone accounts for half of our revenue. The top 2 categories (Electronics + Clothing) drive 75% of total revenue."
How to calculate percentage of total:
Electronics: ($125,000 ÷ $250,000) × 100 = 50%
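The same calculation can be applied to every row at once. Below is a minimal Python sketch using the revenue figures from the table above; the function name `contribution_percentages` is illustrative and not part of any library.

```python
def contribution_percentages(parts):
    """Return each part's share of the total, as a percentage."""
    total = sum(parts.values())
    if total == 0:
        raise ValueError("total must be non-zero")
    return {name: value / total * 100 for name, value in parts.items()}


# Revenue figures from the table above
revenue = {
    "Electronics": 125_000,
    "Clothing": 62_500,
    "Home Goods": 37_500,
    "Books": 17_500,
    "Sports": 7_500,
}

shares = contribution_percentages(revenue)

# Print the largest contributors first
for category, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{category:<12} {pct:5.1f}%")
```

Sorting by share puts the biggest contributors at the top, which is usually how a contribution table is read.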
Example 2: Website Traffic by Source
Question: "Where do our visitors come from?"
| Traffic Source | Visitors | % of Total |
|---|---|---|
| Organic Search | 4,200 | 42% |
| Direct | 2,500 | 25% |
| Social Media | 1,800 | 18% |
| (source name missing) | 1,000 | 10% |
| Paid Ads | 500 | 5% |
| Total | 10,000 | 100% |
Insight: "Organic search is our biggest traffic driver (42%), followed by direct visits (25%). Paid ads contribute only 5% of visitors, so most of our traffic is unpaid, and protecting our organic search performance (SEO) should be a priority."
💡 Real-Life Analogy: Your Monthly Budget
Imagine your monthly income is $3,000. You want to know where it goes:
- Rent: $1,200 (40%)
- Food: $600 (20%)
- Transportation: $450 (15%)
- Entertainment: $300 (10%)
- Savings: $300 (10%)
- Other: $150 (5%)
This shows you that rent is your biggest expense, consuming 40% of your income. This is contribution analysis.
Key takeaway: Contribution analysis helps you focus on what matters most. Often, a small number of categories (like the top 2-3) drive the majority of your total. This is known as the Pareto Principle or 80/20 rule.
2. Segmentation Analysis
What it is: Segmentation divides your data into meaningful groups based on shared characteristics. It answers: "Which groups behave differently?"
When to use it:
- Understanding different customer types (by age, location, behavior)
- Comparing performance across regions or departments
- Personalizing marketing messages for different audiences
- Identifying high-value vs. low-value customer segments
Example 1: Customer Segmentation by Age
Question: "Do different age groups spend differently?"
| Age Segment | Customers | Avg. Order Value | Total Revenue |
|---|---|---|---|
| 18-24 | 1,200 | $35 | $42,000 |
| 25-34 | 2,500 | $58 | $145,000 |
| 35-44 | 1,800 | $72 | $129,600 |
| 45-54 | 1,000 | $65 | $65,000 |
| 55+ | 500 | $48 | $24,000 |
Insight: "The 35-44 age group has the highest average order value ($72), while 25-34 drives the most total revenue due to volume. The 18-24 segment spends least per order ($35), suggesting we should offer entry-level products for this group."
Example 2: Geographic Segmentation
Question: "Which regions have the highest conversion rates?"
| Region | Visitors | Purchases | Conversion Rate |
|---|---|---|---|
| Northeast | 5,000 | 400 | 8.0% |
| Southeast | 6,500 | 325 | 5.0% |
| Midwest | 4,200 | 294 | 7.0% |
| West | 8,300 | 581 | 7.0% |
Insight: "The Northeast has the highest conversion rate (8%), despite having fewer visitors than the West. The Southeast has the lowest conversion rate (5%), indicating we may need to improve our offering or messaging in that region."
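As a rough illustration of why segmenting matters, the sketch below recomputes the conversion rates from the table above and compares them with the single overall rate. The variable and function names are assumptions made for this example.

```python
# (region, visitors, purchases) taken from the table above
regions = [
    ("Northeast", 5_000, 400),
    ("Southeast", 6_500, 325),
    ("Midwest", 4_200, 294),
    ("West", 8_300, 581),
]


def conversion_rate(visitors, purchases):
    """Purchases as a percentage of visitors."""
    return purchases / visitors * 100


# Conversion rate per segment
rates = {name: conversion_rate(v, p) for name, v, p in regions}

# Conversion rate with all segments pooled together
overall = conversion_rate(
    sum(v for _, v, _ in regions),
    sum(p for _, _, p in regions),
)

best = max(rates, key=rates.get)
worst = min(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name:<10} {rate:4.1f}%")
print(f"Overall    {overall:4.1f}%  (best: {best}, worst: {worst})")
```

The pooled rate sits between the best and worst regions, so on its own it would not reveal that one region converts noticeably worse than the others.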
Common types of segmentation:
1. Demographic Segmentation
Group by age, gender, income, education
Example: Luxury brands targeting high-income customers
2. Geographic Segmentation
Group by location, region, country, climate
Example: Clothing retailers offering different products for warm vs. cold climates
3. Behavioral Segmentation
Group by purchase history, usage frequency, loyalty
Example: Identifying "power users" who use your app daily vs. occasional users
4. Psychographic Segmentation
Group by lifestyle, values, interests, personality
Example: Outdoor brands targeting adventure seekers vs. casual hikers
Why segmentation reveals hidden insights: Averages can hide important differences between groups. For example, "average customer age is 35" doesn't tell you that you have two distinct segments: college students (age 20) and parents (age 50). Segmentation makes these groups visible.
3. Correlation Analysis
What it is: Correlation measures how strongly, and in which direction, two variables tend to move together. It answers: "Are these two variables related?"
Types of correlation:
- Positive correlation: When one variable increases, the other also increases
- Negative correlation: When one variable increases, the other decreases
- No correlation: There is no consistent relationship, so knowing one variable tells you little about the other
Positive Correlation
Definition: Both variables move in the same direction
Examples:
- Study hours ↑ → Test scores ↑
- Temperature ↑ → Ice cream sales ↑
- Ad spending ↑ → Website traffic ↑
Visual pattern: Data points form an upward slope
Negative Correlation
Definition: Variables move in opposite directions
Examples:
- Price ↑ → Sales volume ↓
- Distance from store ↑ → Visit frequency ↓
- Product defects ↑ → Customer satisfaction ↓
Visual pattern: Data points form a downward slope
No Correlation
Definition: No predictable relationship between variables
Examples:
- Shoe size → Intelligence
- Number of siblings → Salary
- Hair color → Math skills
Visual pattern: Data points scattered randomly
Example: Marketing Spend vs. Revenue
Question: "Does spending more on marketing increase revenue?"
| Month | Marketing Spend | Revenue |
|---|---|---|
| Jan | $5,000 | $45,000 |
| Feb | $8,000 | $62,000 |
| Mar | $6,500 | $54,000 |
| Apr | $10,000 | $78,000 |
| May | $7,500 | $59,000 |
| Jun | $12,000 | $89,000 |
Insight: "There appears to be a positive correlation—months with higher marketing spend tend to have higher revenue. When we spent $12,000 (June), revenue was $89,000. When we spent $5,000 (January), revenue was only $45,000."
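One common way to put a number on "appears to be a positive correlation" is the Pearson correlation coefficient, which ranges from -1 (perfect negative) to +1 (perfect positive). The lesson does not name a specific measure, so treat the choice of Pearson's r here as an assumption. The sketch below computes it by hand for the six months in the table.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Jan..Jun figures from the table above
marketing_spend = [5_000, 8_000, 6_500, 10_000, 7_500, 12_000]
revenue = [45_000, 62_000, 54_000, 78_000, 59_000, 89_000]

r = pearson_r(marketing_spend, revenue)
print(f"Pearson r = {r:.3f}")
```

For this data r comes out very close to +1, which matches the visual impression of a strong positive relationship. With only six data points, though, the figure should be read as suggestive rather than conclusive.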
⚠️ CRITICAL: Correlation Does NOT Equal Causation
This is one of the most important concepts in data analysis. Just because two things are correlated doesn't mean one causes the other.
Classic Example: Ice Cream Sales vs. Drowning Incidents
Observation: There's a strong positive correlation between ice cream sales and drowning incidents. When ice cream sales go up, drownings also go up.
❌ Wrong conclusion: "Ice cream causes drowning!" or "Drowning causes people to buy ice cream!"
✓ Correct explanation: Both are caused by a third variable: hot weather. When it's hot, people buy more ice cream AND more people go swimming (increasing drowning risk).
Hot Weather (Cause)
↓ ↓
Ice Cream Sales ↑ Swimming ↑ → Drownings ↑
(These are correlated but neither causes the other)
More Examples of Misleading Correlations
1. Nicolas Cage Movies vs. Pool Drownings
Correlation: The number of Nicolas Cage movies per year correlates with pool drownings
Reality: Pure coincidence. No causal relationship.
2. Firefighters vs. Fire Damage
Correlation: More firefighters at a fire → More damage
Reality: Big fires cause both more firefighters to respond AND more damage. Fire size is the hidden cause.
3. Shoe Size vs. Reading Ability (in children)
Correlation: Bigger shoe size correlates with better reading
Reality: Age is the hidden variable. Older children have bigger feet AND read better.
Key takeaway: Correlation is useful for finding patterns, but always ask: "Could there be a hidden third variable?" or "Could this be reverse causation?" or "Could this be pure coincidence?" Before claiming one thing causes another, you need controlled experiments, not just correlation.
4. Cohort Analysis
What it is: Cohort analysis follows specific groups of people (cohorts) over time to understand how their behavior changes. A cohort is a group that shares a common characteristic during a specific time period.
When to use it:
- Tracking retention rates for customers acquired in different months
- Comparing performance of different student classes over time
- Understanding how user engagement changes after signup
- Measuring the long-term value of customers from different acquisition channels
Example: Customer Retention by Signup Month
Question: "Do customers who signed up in January stay longer than those who signed up in February?"
Cohort definition: Month of signup
| Cohort | Month 0 | Month 1 | Month 2 | Month 3 |
|---|---|---|---|---|
| January | 100% (1,000 users) | 60% (600 users) | 45% (450 users) | 38% (380 users) |
| February | 100% (1,200 users) | 70% (840 users) | 58% (696 users) | 50% (600 users) |
| March | 100% (900 users) | 68% (612 users) | 55% (495 users) | 48% (432 users) |
How to read this:
- Month 0: All cohorts start at 100% (their signup month)
- Month 1: 60% of January signups are still active, 70% of February signups are still active
- Month 3: January cohort has 38% retention, February has 50% retention
Insight: "The February cohort has better retention than January in every month after signup. After 3 months, 50% of February users are still active vs. only 38% of January users. March (48%) is close to February, so the improvement appears to have held. We should investigate what changed after January (better onboarding? Different traffic source? Product improvements?)"
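The retention percentages in a cohort table are each month's active users divided by the cohort's starting size. A minimal sketch, using the user counts from the table above (the names `active_users` and `retention_curve` are illustrative):

```python
# Active users per cohort at months 0..3 after signup (from the table above)
active_users = {
    "January": [1_000, 600, 450, 380],
    "February": [1_200, 840, 696, 600],
    "March": [900, 612, 495, 432],
}


def retention_curve(counts):
    """Percent of the starting cohort still active in each month."""
    start = counts[0]
    if start == 0:
        raise ValueError("cohort must start with at least one user")
    return [count / start * 100 for count in counts]


retention = {cohort: retention_curve(counts) for cohort, counts in active_users.items()}

# One row per cohort, one column per month since signup
print("Cohort      M0      M1      M2      M3")
for cohort, curve in retention.items():
    cells = "  ".join(f"{pct:5.1f}%" for pct in curve)
    print(f"{cohort:<9} {cells}")
```

Dividing by each cohort's own starting size is what makes cohorts of different sizes comparable.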
Example: Revenue per Cohort Over Time
Question: "How much revenue does each cohort generate over time?"
| Acquisition Cohort | Month 0 | Month 1 | Month 2 | Total |
|---|---|---|---|---|
| Q1 Customers | $25,000 | $18,000 | $15,000 | $58,000 |
| Q2 Customers | $30,000 | $24,000 | $22,000 | $76,000 |
Insight: "The Q2 cohort generated more revenue: it spent more initially ($30K vs. $25K) and retained more of that spending by Month 2 ($22K of $30K, about 73%, vs. $15K of $25K, 60%). If the two cohorts are similar in size, this suggests Q2 acquisition efforts attracted higher-quality customers."
🎓 Real-Life Analogy: Graduating Classes
Think of your high school graduating class as a cohort:
- Class of 2020 (cohort defined by graduation year)
- You track them: 1 year later, 5 years later, 10 years later
- You measure: % employed, average salary, % in grad school
- You compare Class of 2020 vs. Class of 2021 to see if outcomes differ
This is cohort analysis—following a specific group over time and comparing different groups.
Why cohort analysis is powerful: It reveals trends that simple averages hide. For example, overall retention might look stable, but cohort analysis could show that recent cohorts are performing worse—a warning sign you'd miss otherwise.
5. Funnel Analysis
What it is: Funnel analysis tracks how people progress through a series of steps, measuring how many drop off at each stage. It's called a "funnel" because the number of people decreases at each step, like a funnel shape.
When to use it:
- E-commerce: Tracking visitors → cart → checkout → purchase
- Sales: Leads → qualified → demo → proposal → closed
- Hiring: Applicants → interview → offer → hired
- Onboarding: Signup → profile setup → first action → active user
Example 1: E-commerce Purchase Funnel
Question: "Where are we losing potential customers?"
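Funnel stages: 10,000 visitors → 4,000 add to cart → 3,000 checkout → 2,000 purchase
Conversion rates:
- Visit → Cart: 40% (4,000 ÷ 10,000)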
- Cart → Checkout: 75% (3,000 ÷ 4,000)
- Checkout → Purchase: 67% (2,000 ÷ 3,000)
- Overall conversion: 20% (2,000 purchases ÷ 10,000 visitors)
Insight: "Our biggest drop-off is at the first step—60% of visitors never add anything to their cart. We should focus on improving product pages, images, and descriptions to encourage cart adds."
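The step-by-step rates above come from dividing each stage's count by the count of the stage before it. A short sketch of that calculation, using the stage counts quoted in the conversion rates (function and variable names are assumptions for this example):

```python
# Stage counts from the e-commerce funnel above
stages = [
    ("Visit", 10_000),
    ("Cart", 4_000),
    ("Checkout", 3_000),
    ("Purchase", 2_000),
]


def step_conversions(stages):
    """Conversion rate (%) from each stage to the next one."""
    results = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        results.append((f"{prev_name} -> {name}", count / prev_count * 100))
    return results


steps = step_conversions(stages)

# Overall conversion compares the last stage with the first
overall = stages[-1][1] / stages[0][1] * 100

# The step with the lowest conversion rate loses the largest share of people
biggest_drop = min(steps, key=lambda step: step[1])

for label, rate in steps:
    print(f"{label:<22} {rate:5.1f}%")
print(f"Overall conversion     {overall:5.1f}%")
print(f"Biggest drop-off: {biggest_drop[0]}")
```

Ranking steps by conversion rate, rather than by raw headcount lost, keeps later stages comparable with earlier ones even though fewer people reach them.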
Example 2: Job Application Funnel
Question: "How efficient is our hiring process?"
| Stage | Candidates | Pass Rate | % of Original |
|---|---|---|---|
| Applications | 500 | - | 100% |
| Phone Screen | 100 | 20% | 20% |
| Technical Interview | 30 | 30% | 6% |
| Final Interview | 12 | 40% | 2.4% |
| Offer Made | 5 | 42% | 1.0% |
| Offer Accepted | 4 | 80% | 0.8% |
Insight: "We need about 125 applicants to make 1 hire (4 hires from 500 applicants, a 0.8% overall conversion). Our biggest filter is the phone screen (only 20% pass). One concern: only 80% of offers are accepted, so we may be losing good candidates to competitors at the final stage."
How to optimize a funnel:
- Identify the biggest drop-offs: Focus on stages with the largest percentage loss
- Ask why people leave: Survey drop-offs, run user tests, analyze behavior
- Test improvements: Simplify forms, improve messaging, reduce friction
- Measure impact: Track conversion rate changes after each improvement
6. Basket Analysis
What it is: Basket analysis (also called market basket analysis or association analysis) identifies which items are frequently purchased together. It answers: "What do customers buy together?"
When to use it:
- Product recommendations: "Customers who bought X also bought Y"
- Store layout: Placing related items near each other
- Bundle pricing: Creating product bundles
- Cross-selling: Suggesting complementary products at checkout
Example 1: Grocery Store Basket Analysis
Question: "What products are purchased together?"
Illustrative findings:
| If customer buys... | They also often buy... | % of those purchases that include it |
|---|---|---|
| Diapers | Beer | 35% |
| Pasta | Pasta sauce | 68% |
| Hot dogs | Hot dog buns | 72% |
| Chips | Soda | 45% |
| Coffee | Cream | 52% |
Insight: "72% of customers who buy hot dogs also buy buns. We should place these items near each other and create a 'BBQ Bundle' promotion. The diapers-beer pairing is a famous retail anecdote: the story goes that young parents buying diapers often grab beer in the same trip."
Example 2: E-commerce Product Associations
Question: "What should we recommend when someone views a laptop?"
Analysis results:
Laptop buyers also bought:
- Laptop bag (42% of transactions)
- Mouse (38% of transactions)
- External hard drive (28% of transactions)
- Screen protector (22% of transactions)
Action: "When someone adds a laptop to their cart, show: 'Customers also bought: Laptop Bag, Mouse, External Hard Drive.' This can increase average order value."
🍔 Real-Life Analogy: Fast Food Combos
Fast food restaurants use basket analysis to create combo meals:
- Data shows: 80% of burger buyers also buy fries and a drink
- Action: Create a "Combo Meal" (burger + fries + drink) at a slight discount
- Result: Customers are happy (convenience + savings), restaurant increases sales
This is basket analysis in action—using purchase patterns to create bundled offers.
Key metrics in basket analysis:
- Support: How often do the items appear together? (The share of all transactions that contain both)
- Confidence: If someone buys A, what's the probability they buy B?
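- Lift: How much more likely is B to be purchased when A is purchased, compared with how often B is purchased overall? (A lift above 1 means the items appear together more often than chance alone would predict)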
Don't worry about the math for now—just understand that basket analysis finds patterns in what customers buy together.
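For readers who do want to see the arithmetic, here is a small sketch of the three metrics. The baskets below are invented purely for illustration and are not real purchase data; the function name `basket_metrics` is likewise an assumption.

```python
# Hypothetical baskets, invented for illustration only
transactions = [
    {"hot dogs", "buns", "soda"},
    {"hot dogs", "buns"},
    {"pasta", "pasta sauce"},
    {"hot dogs", "chips"},
    {"buns", "chips"},
    {"pasta", "pasta sauce", "soda"},
    {"hot dogs", "buns", "chips"},
    {"soda", "chips"},
]


def basket_metrics(transactions, a, b):
    """Return (support, confidence, lift) for the rule 'a -> b'."""
    n = len(transactions)
    count_a = sum(1 for basket in transactions if a in basket)
    count_b = sum(1 for basket in transactions if b in basket)
    count_ab = sum(1 for basket in transactions if a in basket and b in basket)
    if count_a == 0 or count_b == 0:
        raise ValueError("both items must appear in at least one transaction")
    support = count_ab / n            # share of all baskets containing both
    confidence = count_ab / count_a   # share of a-baskets that also contain b
    lift = confidence / (count_b / n) # confidence relative to b's overall rate
    return support, confidence, lift


support, confidence, lift = basket_metrics(transactions, "hot dogs", "buns")
print(f"support={support:.3f}  confidence={confidence:.2f}  lift={lift:.2f}")
```

In this made-up data, hot dogs and buns appear together in 3 of 8 baskets, 3 of the 4 hot dog baskets include buns, and buns appear in half of all baskets, so the lift is above 1.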
7. What-If Analysis (Scenario Planning)
What it is: What-if analysis tests different scenarios by changing input variables to see how outputs change. It answers: "What would happen if we changed X?"
When to use it:
- Pricing decisions: "What if we increase price by 10%?"
- Staffing: "What if we hire 2 more salespeople?"
- Budgeting: "What if costs increase 15%?"
- Risk planning: Best case / expected case / worst case scenarios
Example 1: Price Increase Scenario
Question: "What happens to revenue if we raise prices?"
Current situation:
- Price: $50 per unit
- Sales volume: 2,000 units/month
- Revenue: $50 × 2,000 = $100,000
| Scenario | Price | Estimated Volume | Revenue | Change |
|---|---|---|---|---|
| Current | $50 | 2,000 | $100,000 | - |
| 5% increase | $52.50 | 1,900 (-5%) | $99,750 | -$250 |
| 10% increase | $55 | 1,800 (-10%) | $99,000 | -$1,000 |
| 15% increase | $57.50 | 1,700 (-15%) | $97,750 | -$2,250 |
Insight: "If each price increase causes a proportional volume decrease, raising prices slightly reduces revenue (by $250 to $2,250 per month). On this assumption, we should explore other strategies (reduce costs, improve product, find new markets) before raising prices, and check how profit, not just revenue, would change."
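The scenario table can be reproduced with a small function that takes the price change and the assumed volume change as inputs. This is a sketch under the table's own assumption (volume falls by the same percentage that price rises); the function name and the final "half as sensitive" scenario are hypothetical additions for illustration.

```python
def revenue_scenario(base_price, base_volume, price_change_pct, volume_change_pct):
    """Revenue after applying percentage changes to price and volume."""
    price = base_price * (1 + price_change_pct / 100)
    volume = base_volume * (1 + volume_change_pct / 100)
    return price * volume


BASE_PRICE = 50      # dollars per unit
BASE_VOLUME = 2_000  # units per month

baseline = revenue_scenario(BASE_PRICE, BASE_VOLUME, 0, 0)

# Table assumption: volume drops by the same percentage as the price rises
for pct in (5, 10, 15):
    revenue = revenue_scenario(BASE_PRICE, BASE_VOLUME, pct, -pct)
    print(f"+{pct}% price, -{pct}% volume: ${revenue:,.0f} ({revenue - baseline:+,.0f})")

# Hypothetical alternative: customers are only half as price-sensitive
less_sensitive = revenue_scenario(BASE_PRICE, BASE_VOLUME, 10, -5)
print(f"+10% price, -5% volume: ${less_sensitive:,.0f} ({less_sensitive - baseline:+,.0f})")
```

Changing only the assumed volume response flips the result from a small loss to a gain, which is why the volume assumption deserves as much scrutiny as the price change itself.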
Example 2: Hiring Decision Scenario
Question: "Should we hire 2 more salespeople?"
Current situation:
- Salespeople: 5
- Average sales per person: $200,000/year
- Total sales: $1,000,000/year
- Salary per salesperson: $60,000/year
- Total cost: $300,000/year
- Profit margin: 30%
- Net contribution: ($1M × 30%) - $300K = $0
| Scenario | Salespeople | Total Sales | Salary Cost | Profit (30%) | Net |
|---|---|---|---|---|---|
| Current | 5 | $1,000,000 | $300,000 | $300,000 | $0 |
| Add 2 (optimistic) | 7 | $1,400,000 | $420,000 | $420,000 | $0 |
| Add 2 (realistic) | 7 | $1,300,000 | $420,000 | $390,000 | -$30,000 |
| Add 2 (pessimistic) | 7 | $1,200,000 | $420,000 | $360,000 | -$60,000 |
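Insight: "New salespeople might not hit $200K in year 1. In the realistic scenario, we'd lose $30K. At a 30% margin, each rep must sell $200K per year just to cover a $60K salary ($60,000 ÷ 0.30), so even the optimistic scenario only breaks even. We should only hire if we're confident new reps can exceed $200K each."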
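The hiring table follows one rule: net contribution equals total sales times the profit margin, minus total salary cost. The sketch below recomputes each scenario and the break-even sales level per rep under the stated figures ($60,000 salary, 30% margin). The function names are illustrative, and the model deliberately ignores other costs such as commissions or ramp-up time.

```python
SALARY = 60_000  # salary per salesperson, per year
MARGIN = 0.30    # profit margin on sales


def net_contribution(total_sales, salespeople, salary=SALARY, margin=MARGIN):
    """Profit on sales minus total salary cost."""
    return total_sales * margin - salespeople * salary


def break_even_sales_per_rep(salary=SALARY, margin=MARGIN):
    """Annual sales one rep needs so that margin exactly covers salary."""
    return salary / margin


# (total sales, number of salespeople) for each row of the table above
scenarios = {
    "Current": (1_000_000, 5),
    "Add 2 (optimistic)": (1_400_000, 7),
    "Add 2 (realistic)": (1_300_000, 7),
    "Add 2 (pessimistic)": (1_200_000, 7),
}

for name, (sales, reps) in scenarios.items():
    print(f"{name:<20} net = ${net_contribution(sales, reps):+,.0f}")

print(f"Break-even sales per rep: ${break_even_sales_per_rep():,.0f}")
```

Because each rep costs $60,000 and keeps 30 cents of margin per sales dollar, the break-even point is the same for every rep regardless of team size.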
Three scenario approach:
- Best case: Optimistic assumptions (highest sales, lowest costs)
- Expected case: Realistic assumptions (most likely outcome)
- Worst case: Pessimistic assumptions (lowest sales, highest costs)
Planning for all three scenarios helps you make robust decisions and prepare for different outcomes.
Analysis Type Matcher
Test your understanding! For each business question, name the most appropriate analysis type.
1. "Which product categories generate the most revenue?"
2. "Do older customers spend more than younger customers?"
3. "Is there a relationship between ad spend and website traffic?"
4. "Where in our checkout process are we losing customers?"
5. "What products do customers frequently buy together?"
6. "Do customers who signed up in Q1 have better retention than Q2 customers?"
7. "What would happen to profit if we increased prices by 15%?"
8. "What percentage of our budget is spent on marketing vs. operations?"
9. "How many applicants make it from initial application to final interview?"
10. "Do customers in different regions have different purchasing patterns?"
Choosing the Right Analysis Type
Use this decision guide to select the appropriate analysis technique for your question:
❓ Question Type: "What drives the total?"
Use: Contribution Analysis
Examples: Revenue by category, traffic by source, budget by department
❓ Question Type: "How do groups differ?"
Use: Segmentation Analysis
Examples: Comparing age groups, regions, customer types
❓ Question Type: "Are two things related?"
Use: Correlation Analysis
Examples: Ad spend vs. sales, temperature vs. ice cream sales
Warning: Remember, correlation ≠ causation!
❓ Question Type: "How do groups change over time?"
Use: Cohort Analysis
Examples: Retention by signup month, performance by graduating class
❓ Question Type: "Where are we losing people in a process?"
Use: Funnel Analysis
Examples: Checkout process, hiring pipeline, onboarding flow
❓ Question Type: "What do people buy/use together?"
Use: Basket Analysis
Examples: Product recommendations, bundle creation, store layout
❓ Question Type: "What would happen if we changed X?"
Use: What-If Analysis
Examples: Price changes, hiring decisions, budget scenarios
Practice Exercises
Apply what you've learned with these real-world scenarios.
Exercise 1: Contribution Analysis
A company's quarterly revenue by region:
- North: $180,000
- South: $120,000
- East: $90,000
- West: $60,000
Calculate:
- a) Total revenue
- b) Percentage contribution of each region
- c) Which region(s) should the company focus on?
Exercise 2: Segmentation Analysis
Email campaign results by customer segment:
| Segment | Emails Sent | Opened | Clicked |
|---|---|---|---|
| New Customers | 5,000 | 1,000 | 150 |
| Active Customers | 8,000 | 3,200 | 640 |
| Inactive Customers | 12,000 | 1,200 | 60 |
Calculate open rates and click rates for each segment. What do you notice?
Exercise 3: Correlation vs. Causation
A study finds: "Cities with more Starbucks locations have higher average incomes."
Questions:
- a) Is this positive or negative correlation?
- b) Does Starbucks cause higher incomes?
- c) What's a more likely explanation?
Exercise 4: Funnel Analysis
App onboarding funnel:
- Downloaded app: 10,000
- Created account: 6,000
- Completed profile: 3,000
- Made first action: 1,500
- Active after 7 days: 600
Questions:
- a) What's the biggest drop-off point?
- b) What's the overall conversion rate (download to active)?
- c) Where should the company focus improvements?
Exercise 5: What-If Analysis
Current situation: Selling 500 widgets/month at $100 each. Cost per widget: $60. Fixed costs: $10,000/month.
Calculate profit for these scenarios:
- a) Current scenario
- b) Increase price to $120 (expect 10% volume drop)
- c) Reduce cost to $50 (keep price at $100)
Exercise 6: Choosing Analysis Types
Match each business question to the best analysis type:
- "Our Q1 customers seem to churn faster than Q2 customers. Is this true?"
- "If we cut our ad budget by 20%, how would it affect leads?"
- "Which customer segments contribute most to our profit?"
- "Do people who buy our premium product also buy accessories?"
Key Takeaways
- Contribution Analysis: Shows what percentage each part contributes to the total (e.g., revenue by category)
- Segmentation Analysis: Divides data into groups to find different patterns (e.g., customer age groups)
- Correlation Analysis: Measures if two variables move together—but correlation ≠ causation!
- Cohort Analysis: Follows specific groups over time (e.g., customers by signup month)
- Funnel Analysis: Tracks progression through steps to find drop-off points (e.g., checkout process)
- Basket Analysis: Identifies items purchased together (e.g., product recommendations)
- What-If Analysis: Tests different scenarios to plan decisions (e.g., price changes)
- Choose the right tool: Match your question to the appropriate analysis technique
📝 Knowledge Check
1. What does contribution analysis help you understand?
2. Which analysis type would you use to compare customer behavior across different age groups?
3. "Ice cream sales and drowning incidents are correlated." What's the most likely explanation?
4. What is the main purpose of cohort analysis?
5. A website tracks: 10,000 visitors → 4,000 add to cart → 3,000 checkout → 2,000 purchase. What analysis is this?
6. "Customers who buy laptops often buy laptop bags." Which analysis identifies this pattern?
7. "What would happen to profit if we raised prices by 10%?" Which analysis answers this?
8. Two variables have a positive correlation. What does this mean?
9. In a funnel analysis showing 10,000 visitors → 2,000 purchases, what is the overall conversion rate?
10. What is the key difference between segmentation and cohort analysis?