DAY 16: SQL - CTEs & Recursive Queries

Guru Charan Pathalla

DAY 16: SQL - CTEs & Recursive Queries

Learning Objectives: - Master Common Table Expressions (CTEs) - Write recursive CTEs - Understand hierarchical data queries - Improve query readability with CTEs - Solve complex problems with temporary result sets

Topics Covered (60 min):

1. Common Table Expressions (CTEs) - WITH clause syntax - Temporary named result set - Improves readability over subqueries - Example: WITH SalesCTE AS (SELECT ...) SELECT * FROM SalesCTE

2. Multiple CTEs - Chain multiple WITH clauses - Reference earlier CTEs in later ones - Break complex queries into logical steps - Example: Calculate sales, then costs, then profit

3. Recursive CTEs - CTE that references itself - Anchor member (base case) - Recursive member (iterative case) - UNION ALL between members - Use for: org charts, bill of materials, hierarchies

4. Practical Applications - Employee hierarchies (manager chains) - Date series generation - Graph traversal - Parts explosion (manufacturing) - MAXRECURSION option to prevent infinite loops

Video Resources:

  1. Advanced SQL Joins (Visually Explained) | ANTI, CROSS (13:53)
    https://www.youtube.com/watch?v=Of2Z6hL0ETE

  2. SQL Tutorial 7: Cross Join, Self Join and Coalesce (15:40)
    https://www.youtube.com/watch?v=ylvcrmgO6GY

  3. JOIN Masterclass: INNER, LEFT, RIGHT, CROSS, SELF, UNION & UNION ALL (28:32)
    https://www.youtube.com/watch?v=lQBhzBhkGww

AI Learning Prompts (Concept Understanding Only):

Understanding CTEs:

I'm learning Common Table Expressions (CTEs) in SQL. Please explain:

1. What is a CTE and how is it different from a subquery or temp table?
2. When should I use CTEs vs subqueries?
3. Show me how to chain multiple CTEs together
4. How do I use CTEs to simplify complex queries?
5. Provide examples using sales and customer data

Use SQL Server syntax with practical examples.

Understanding Recursive CTEs:

Help me understand recursive CTEs:

1. What is recursion in SQL and when do I need it?
2. Explain anchor member vs recursive member
3. Show me how to query an employee hierarchy (find all reports under a manager)
4. How do I generate a date series (all dates in a month)?
5. What is MAXRECURSION and why is it important?

Provide step-by-step examples for beginners.

Guided Practice (60 min):

📝 How to approach Guided Practice:

  • Follow step-by-step: Work through each exercise in order
  • Use provided resources: Reference the video tutorials and AI prompts if you get stuck
  • Verify your work: Check your results match expected outcomes
  • Take notes: Document key learnings and shortcuts you discover
  • Ask for help: Use AI assistants to clarify concepts you don't understand

Exercise 1: Simplify Complex Queries with Basic CTEs

Goal: Learn to use Common Table Expressions (CTEs) to break down complex queries into readable, modular components.

Step-by-Step Instructions:

  1. Open your SQL editor and connect to your database
  2. Start with a complex nested query that's hard to read:
    -- Before: Complex nested subquery
    SELECT
        c.CustomerID,
        c.CustomerName,
        o.TotalOrders,
        o.TotalSpent
    FROM Customers c
    JOIN (
        SELECT
            CustomerID,
            COUNT(*) AS TotalOrders,
            SUM(Amount) AS TotalSpent
        FROM Orders
        WHERE OrderDate >= DATEADD(year, -1, GETDATE())
        GROUP BY CustomerID
        HAVING COUNT(*) > 5
    ) o ON c.CustomerID = o.CustomerID
    WHERE o.TotalSpent > 1000;
  3. Convert to CTE for clarity: Refactor using WITH clause to create a named result set:
    -- After: Using CTE for better readability
    WITH RecentCustomerOrders AS (
        SELECT
            CustomerID,
            COUNT(*) AS TotalOrders,
            SUM(Amount) AS TotalSpent
        FROM Orders
        WHERE OrderDate >= DATEADD(year, -1, GETDATE())
        GROUP BY CustomerID
        HAVING COUNT(*) > 5
    )
    SELECT
        c.CustomerID,
        c.CustomerName,
        rco.TotalOrders,
        rco.TotalSpent
    FROM Customers c
    JOIN RecentCustomerOrders rco ON c.CustomerID = rco.CustomerID
    WHERE rco.TotalSpent > 1000;
  4. Execute the CTE query by pressing F5 or Ctrl+E
  5. Observe the benefits: Notice how the CTE gives the subquery a meaningful name, making the main query easier to understand
  6. Add comments to document:
    -- CTE to identify high-value customers from the past year
    WITH RecentCustomerOrders AS (
        SELECT
            CustomerID,
            COUNT(*) AS TotalOrders,
            SUM(Amount) AS TotalSpent
        FROM Orders
        WHERE OrderDate >= DATEADD(year, -1, GETDATE())  -- Last 12 months
        GROUP BY CustomerID
        HAVING COUNT(*) > 5  -- Minimum 6 orders to qualify
    )
    -- Main query: Get customer details for high spenders
    SELECT
        c.CustomerID,
        c.CustomerName,
        rco.TotalOrders,
        rco.TotalSpent
    FROM Customers c
    JOIN RecentCustomerOrders rco ON c.CustomerID = rco.CustomerID
    WHERE rco.TotalSpent > 1000  -- Spent more than $1000
    ORDER BY rco.TotalSpent DESC;
  7. Create a CTE with column aliases: Specify column names explicitly in the CTE definition:
    WITH RecentCustomerOrders (CustomerID, OrderCount, TotalRevenue) AS (
        SELECT
            CustomerID,
            COUNT(*),
            SUM(Amount)
        FROM Orders
        WHERE OrderDate >= DATEADD(year, -1, GETDATE())
        GROUP BY CustomerID
        HAVING COUNT(*) > 5
    )
    SELECT
        c.CustomerName,
        rco.OrderCount,
        rco.TotalRevenue
    FROM Customers c
    JOIN RecentCustomerOrders rco ON c.CustomerID = rco.CustomerID
    WHERE rco.TotalRevenue > 1000;
  8. Test modifications: Try changing the date range or thresholds - notice how clear the query structure is

💡 Tips:

  • CTE syntax: Starts with WITH cte_name AS ( SELECT ... ) followed immediately by the main query - no semicolon between them
  • Scope limitation: CTEs only exist for the duration of a single query - they're not permanent like views or temp tables
  • Readability first: Use CTEs primarily for code organization and readability, not performance (performance is usually similar to subqueries)
  • Naming convention: Choose descriptive CTE names that explain what the data represents (e.g., HighValueCustomers, RecentOrders)
  • Column naming: You can specify column names either in CTE definition WITH cte (col1, col2) or in the SELECT clause
  • Keyboard shortcut: Use Ctrl+Shift+U to uppercase SQL keywords for consistency

✅ Expected Result: A clean, readable query that identifies high-value customers from the past year. The CTE makes the logic transparent: first identify qualifying customers, then join with customer details. Results show CustomerName, OrderCount, and TotalRevenue for customers meeting the criteria.

🔄 Connection to Day 10: Day 10 taught subqueries which CTEs replace for better readability. Instead of nesting SELECT statements inside FROM clauses, CTEs let you define named result sets upfront - same functionality, much clearer code!

Exercise 2: Chain Multiple CTEs for Complex Analysis

Goal: Master using multiple CTEs in sequence to build complex queries step-by-step, where each CTE can reference previous ones.

Step-by-Step Instructions:

  1. Open your SQL editor ready to work with order and product data
  2. Understand the requirement: Find products that are top sellers but have declining sales trends
  3. Create first CTE - Monthly Product Sales:
    WITH MonthlyProductSales AS (
        SELECT
            ProductID,
            YEAR(OrderDate) AS OrderYear,
            MONTH(OrderDate) AS OrderMonth,
            SUM(Quantity) AS UnitsSold,
            SUM(Amount) AS Revenue
        FROM Orders
        GROUP BY ProductID, YEAR(OrderDate), MONTH(OrderDate)
    )
    SELECT * FROM MonthlyProductSales
    ORDER BY ProductID, OrderYear, OrderMonth;
  4. Execute and review the monthly breakdown by pressing F5
  5. Add second CTE - Calculate Trends: Build on the first CTE to compute month-over-month changes:
    WITH MonthlyProductSales AS (
        SELECT
            ProductID,
            YEAR(OrderDate) AS OrderYear,
            MONTH(OrderDate) AS OrderMonth,
            SUM(Quantity) AS UnitsSold,
            SUM(Amount) AS Revenue
        FROM Orders
        GROUP BY ProductID, YEAR(OrderDate), MONTH(OrderDate)
    ),
    SalesTrends AS (
        SELECT
            ProductID,
            OrderYear,
            OrderMonth,
            UnitsSold,
            Revenue,
            LAG(Revenue, 1) OVER (PARTITION BY ProductID ORDER BY OrderYear, OrderMonth) AS PrevMonthRevenue,
            Revenue - LAG(Revenue, 1) OVER (PARTITION BY ProductID ORDER BY OrderYear, OrderMonth) AS RevenueChange
        FROM MonthlyProductSales
    )
    SELECT * FROM SalesTrends
    ORDER BY ProductID, OrderYear, OrderMonth;
  6. Add third CTE - Classify Products: Categorize products based on performance:
    WITH MonthlyProductSales AS (
        SELECT
            ProductID,
            YEAR(OrderDate) AS OrderYear,
            MONTH(OrderDate) AS OrderMonth,
            SUM(Quantity) AS UnitsSold,
            SUM(Amount) AS Revenue
        FROM Orders
        WHERE OrderDate >= DATEADD(month, -6, GETDATE())
        GROUP BY ProductID, YEAR(OrderDate), MONTH(OrderDate)
    ),
    SalesTrends AS (
        SELECT
            ProductID,
            OrderYear,
            OrderMonth,
            UnitsSold,
            Revenue,
            LAG(Revenue, 1) OVER (PARTITION BY ProductID ORDER BY OrderYear, OrderMonth) AS PrevMonthRevenue,
            Revenue - LAG(Revenue, 1) OVER (PARTITION BY ProductID ORDER BY OrderYear, OrderMonth) AS RevenueChange
        FROM MonthlyProductSales
    ),
    ProductPerformance AS (
        SELECT
            ProductID,
            SUM(Revenue) AS TotalRevenue,
            AVG(RevenueChange) AS AvgMonthlyChange,
            CASE
                WHEN AVG(RevenueChange) > 0 THEN 'Growing'
                WHEN AVG(RevenueChange) < 0 THEN 'Declining'
                ELSE 'Stable'
            END AS TrendCategory
        FROM SalesTrends
        WHERE PrevMonthRevenue IS NOT NULL
        GROUP BY ProductID
    )
    SELECT * FROM ProductPerformance
    ORDER BY TotalRevenue DESC;
  7. Add final query - Join with Product Details:
    WITH MonthlyProductSales AS (
        SELECT
            ProductID,
            YEAR(OrderDate) AS OrderYear,
            MONTH(OrderDate) AS OrderMonth,
            SUM(Quantity) AS UnitsSold,
            SUM(Amount) AS Revenue
        FROM Orders
        WHERE OrderDate >= DATEADD(month, -6, GETDATE())
        GROUP BY ProductID, YEAR(OrderDate), MONTH(OrderDate)
    ),
    SalesTrends AS (
        SELECT
            ProductID,
            OrderYear,
            OrderMonth,
            UnitsSold,
            Revenue,
            LAG(Revenue, 1) OVER (PARTITION BY ProductID ORDER BY OrderYear, OrderMonth) AS PrevMonthRevenue,
            Revenue - LAG(Revenue, 1) OVER (PARTITION BY ProductID ORDER BY OrderYear, OrderMonth) AS RevenueChange
        FROM MonthlyProductSales
    ),
    ProductPerformance AS (
        SELECT
            ProductID,
            SUM(Revenue) AS TotalRevenue,
            AVG(RevenueChange) AS AvgMonthlyChange,
            CASE
                WHEN AVG(RevenueChange) > 0 THEN 'Growing'
                WHEN AVG(RevenueChange) < 0 THEN 'Declining'
                ELSE 'Stable'
            END AS TrendCategory
        FROM SalesTrends
        WHERE PrevMonthRevenue IS NOT NULL
        GROUP BY ProductID
    )
    SELECT
        p.ProductName,
        p.Category,
        pp.TotalRevenue,
        pp.AvgMonthlyChange,
        pp.TrendCategory
    FROM ProductPerformance pp
    JOIN Products p ON pp.ProductID = p.ProductID
    WHERE pp.TotalRevenue > 10000  -- High revenue products only
        AND pp.TrendCategory = 'Declining'  -- But trending down
    ORDER BY pp.TotalRevenue DESC;
  8. Execute the complete query to find high-revenue products with declining sales - these need attention!
  9. Analyze the structure: Notice how each CTE builds on previous ones, creating a clear data pipeline

💡 Tips:

  • Multiple CTE syntax: Separate CTEs with commas: WITH cte1 AS (...), cte2 AS (...), cte3 AS (...)
  • CTE referencing: Later CTEs can reference earlier ones (cte2 can use cte1), but not vice versa - order matters!
  • Step-by-step development: Build complex queries incrementally - test each CTE individually by selecting from it before adding the next
  • Pipeline thinking: Each CTE represents a stage in data transformation - raw → aggregated → analyzed → enriched
  • Debugging tip: Comment out the main query and SELECT from intermediate CTEs to inspect data at each stage
  • Performance note: SQL Server may optimize multiple CTEs differently than nested subqueries - check execution plans for large datasets

✅ Expected Result: A list of high-revenue products experiencing declining sales trends over the past 6 months. Each row shows ProductName, Category, TotalRevenue (last 6 months), AvgMonthlyChange (negative value), and TrendCategory ('Declining'). This actionable insight helps prioritize which products need marketing intervention.

🔄 Connection to Day 14: Day 14 introduced window functions like LAG for trend analysis. Multiple CTEs organize these complex calculations into logical steps - first aggregate monthly, then calculate trends with LAG, then classify performance, finally enrich with product details!

Exercise 3: Build Recursive CTE for Employee Hierarchy

Goal: Master recursive CTEs to query hierarchical data structures like organizational charts, showing all employees under a manager.

Setup: Ensure you have an Employees table with columns: EmployeeID, EmployeeName, ManagerID (references EmployeeID of their manager, NULL for CEO).

Step-by-Step Instructions:

  1. Understand the data structure: Each employee has a ManagerID pointing to their boss - this creates a tree hierarchy
  2. View the flat data:
    SELECT
        EmployeeID,
        EmployeeName,
        ManagerID
    FROM Employees
    ORDER BY ManagerID, EmployeeID;
  3. Create a basic recursive CTE - Anchor Member: Start with the top of the hierarchy (CEO):
    WITH EmployeeHierarchy AS (
        -- Anchor member: Start with the CEO (ManagerID is NULL)
        SELECT
            EmployeeID,
            EmployeeName,
            ManagerID,
            0 AS Level
        FROM Employees
        WHERE ManagerID IS NULL
    )
    SELECT * FROM EmployeeHierarchy;
  4. Add the recursive member: Find all direct reports, then their reports, and so on:
    WITH EmployeeHierarchy AS (
        -- Anchor member: Start with the CEO
        SELECT
            EmployeeID,
            EmployeeName,
            ManagerID,
            0 AS Level
        FROM Employees
        WHERE ManagerID IS NULL
    
        UNION ALL
    
        -- Recursive member: Find employees whose manager is in the hierarchy
        SELECT
            e.EmployeeID,
            e.EmployeeName,
            e.ManagerID,
            eh.Level + 1
        FROM Employees e
        INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
    )
    SELECT * FROM EmployeeHierarchy
    ORDER BY Level, EmployeeName;
  5. Execute and observe: Press F5 to see the entire organizational hierarchy with depth levels
  6. Add visual indentation: Make the hierarchy more readable with indentation:
    WITH EmployeeHierarchy AS (
        -- Anchor member
        SELECT
            EmployeeID,
            EmployeeName,
            ManagerID,
            0 AS Level,
            CAST(EmployeeName AS VARCHAR(255)) AS HierarchyPath
        FROM Employees
        WHERE ManagerID IS NULL
    
        UNION ALL
    
        -- Recursive member
        SELECT
            e.EmployeeID,
            e.EmployeeName,
            e.ManagerID,
            eh.Level + 1,
            CAST(eh.HierarchyPath + ' > ' + e.EmployeeName AS VARCHAR(255))
        FROM Employees e
        INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
    )
    SELECT
        Level,
        REPLICATE('  ', Level) + EmployeeName AS IndentedName,
        EmployeeName,
        ManagerID,
        HierarchyPath
    FROM EmployeeHierarchy
    ORDER BY HierarchyPath;
  7. Find all reports under a specific manager: Modify the anchor to start from a specific manager:
    DECLARE @ManagerID INT = 5;  -- Find all reports under EmployeeID 5
    
    WITH EmployeeHierarchy AS (
        -- Anchor: Start with the specified manager
        SELECT
            EmployeeID,
            EmployeeName,
            ManagerID,
            0 AS Level
        FROM Employees
        WHERE EmployeeID = @ManagerID
    
        UNION ALL
    
        -- Recursive: Find all subordinates
        SELECT
            e.EmployeeID,
            e.EmployeeName,
            e.ManagerID,
            eh.Level + 1
        FROM Employees e
        INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
    )
    SELECT
        Level,
        REPLICATE('  ', Level) + EmployeeName AS OrgChart,
        EmployeeID,
        ManagerID
    FROM EmployeeHierarchy
    ORDER BY Level, EmployeeName;
  8. Add aggregate information: Count total reports under each manager:
    WITH EmployeeHierarchy AS (
        SELECT
            EmployeeID,
            EmployeeName,
            ManagerID,
            0 AS Level
        FROM Employees
        WHERE ManagerID IS NULL
    
        UNION ALL
    
        SELECT
            e.EmployeeID,
            e.EmployeeName,
            e.ManagerID,
            eh.Level + 1
        FROM Employees e
        INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
    )
    SELECT
        eh.EmployeeID,
        eh.EmployeeName,
        eh.Level,
        COUNT(subordinates.EmployeeID) AS TotalSubordinates
    FROM EmployeeHierarchy eh
    LEFT JOIN EmployeeHierarchy subordinates
        ON subordinates.ManagerID = eh.EmployeeID
    GROUP BY eh.EmployeeID, eh.EmployeeName, eh.Level
    ORDER BY eh.Level, eh.EmployeeName;
  9. Add MAXRECURSION option: Prevent infinite loops (default is 100 levels):
    WITH EmployeeHierarchy AS (
        SELECT
            EmployeeID,
            EmployeeName,
            ManagerID,
            0 AS Level
        FROM Employees
        WHERE ManagerID IS NULL
    
        UNION ALL
    
        SELECT
            e.EmployeeID,
            e.EmployeeName,
            e.ManagerID,
            eh.Level + 1
        FROM Employees e
        INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
        WHERE eh.Level < 10  -- Safety check
    )
    SELECT * FROM EmployeeHierarchy
    ORDER BY Level, EmployeeName
    OPTION (MAXRECURSION 10);

💡 Tips:

  • Recursive structure: Anchor member (base case) + UNION ALL + Recursive member (references itself) - both parts are required
  • Join direction: Recursive member joins new data TO existing CTE results - think "find children of current level"
  • Termination: Recursion stops when recursive member returns no rows - ensure this happens or you'll get infinite loop
  • MAXRECURSION: Use OPTION (MAXRECURSION n) to set max depth - 0 means unlimited (use carefully!)
  • Performance consideration: Recursive CTEs can be slow on very large hierarchies - consider materialized views for frequently-queried hierarchies
  • Use cases: Organizational charts, bill of materials, folder structures, category trees, network graphs

✅ Expected Result: A complete organizational hierarchy showing every employee with their level in the organization, visual indentation, full hierarchy path from CEO down, and optionally the count of direct/indirect reports. When starting from a specific manager, shows only that subtree of the organization.

🔄 Connection to Day 6: Day 6 taught JOIN operations to combine related tables. Recursive CTEs use self-joins in a special way - each iteration joins the table to itself to traverse the hierarchy one level deeper, enabling tree traversal that's impossible with regular JOINs!

Exercise 4: Generate Date Series with Recursive CTE

Goal: Use recursive CTEs to generate sequences of dates, numbers, or other series - essential for filling gaps in data and creating reporting calendars.

Step-by-Step Instructions:

  1. Open your SQL editor ready to generate a date series
  2. Understand the use case: Generate all dates in a month to identify days with missing sales data
  3. Create a simple date series - Current Month:
    WITH DateSeries AS (
        -- Anchor: Start with first day of current month
        SELECT DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1) AS DateValue
    
        UNION ALL
    
        -- Recursive: Add one day at a time
        SELECT DATEADD(day, 1, DateValue)
        FROM DateSeries
        WHERE DATEADD(day, 1, DateValue) <= EOMONTH(GETDATE())
    )
    SELECT DateValue, DATENAME(weekday, DateValue) AS DayOfWeek
    FROM DateSeries
    ORDER BY DateValue;
  4. Execute to see all dates in current month by pressing F5
  5. Generate a full year of dates:
    DECLARE @Year INT = 2024;
    
    WITH DateSeries AS (
        -- Anchor: Start with January 1st
        SELECT DATEFROMPARTS(@Year, 1, 1) AS DateValue
    
        UNION ALL
    
        -- Recursive: Add one day
        SELECT DATEADD(day, 1, DateValue)
        FROM DateSeries
        WHERE DATEADD(day, 1, DateValue) <= DATEFROMPARTS(@Year, 12, 31)
    )
    SELECT
        DateValue,
        DATENAME(weekday, DateValue) AS DayName,
        DATEPART(week, DateValue) AS WeekNumber,
        DATEPART(quarter, DateValue) AS Quarter
    FROM DateSeries
    ORDER BY DateValue
    OPTION (MAXRECURSION 366);  -- Allow for leap years
  6. Use date series to find missing data: Left join with actual sales data to identify gaps:
    DECLARE @StartDate DATE = '2024-01-01';
    DECLARE @EndDate DATE = '2024-12-31';
    
    WITH DateSeries AS (
        SELECT @StartDate AS DateValue
    
        UNION ALL
    
        SELECT DATEADD(day, 1, DateValue)
        FROM DateSeries
        WHERE DateValue < @EndDate
    ),
    DailySales AS (
        SELECT
            CAST(OrderDate AS DATE) AS SaleDate,
            SUM(Amount) AS TotalSales,
            COUNT(*) AS OrderCount
        FROM Orders
        WHERE YEAR(OrderDate) = 2024
        GROUP BY CAST(OrderDate AS DATE)
    )
    SELECT
        ds.DateValue,
        DATENAME(weekday, ds.DateValue) AS DayOfWeek,
        COALESCE(sal.TotalSales, 0) AS TotalSales,
        COALESCE(sal.OrderCount, 0) AS OrderCount,
        CASE
            WHEN sal.SaleDate IS NULL THEN 'No Sales Data'
            WHEN sal.TotalSales = 0 THEN 'Zero Sales'
            ELSE 'Has Sales'
        END AS DataStatus
    FROM DateSeries ds
    LEFT JOIN DailySales sal ON ds.DateValue = sal.SaleDate
    ORDER BY ds.DateValue
    OPTION (MAXRECURSION 366);
  7. Generate business days only (exclude weekends):
    DECLARE @StartDate DATE = '2024-01-01';
    DECLARE @EndDate DATE = '2024-12-31';
    
    WITH DateSeries AS (
        SELECT @StartDate AS DateValue
    
        UNION ALL
    
        SELECT DATEADD(day, 1, DateValue)
        FROM DateSeries
        WHERE DateValue < @EndDate
    )
    SELECT
        DateValue,
        DATENAME(weekday, DateValue) AS DayOfWeek
    FROM DateSeries
    WHERE DATEPART(weekday, DateValue) NOT IN (1, 7)  -- Exclude Sunday (1) and Saturday (7)
    ORDER BY DateValue
    OPTION (MAXRECURSION 366);
  8. Create a fiscal calendar: Generate months with fiscal year numbering:
    DECLARE @FiscalYearStart DATE = '2023-07-01';  -- Fiscal year starts July 1st
    DECLARE @Periods INT = 12;  -- 12 fiscal months
    
    WITH FiscalPeriods AS (
        -- Anchor: First fiscal month
        SELECT
            1 AS FiscalMonth,
            @FiscalYearStart AS PeriodStart,
            EOMONTH(@FiscalYearStart) AS PeriodEnd
    
        UNION ALL
    
        -- Recursive: Add subsequent months
        SELECT
            FiscalMonth + 1,
            DATEADD(month, 1, PeriodStart),
            EOMONTH(DATEADD(month, 1, PeriodStart))
        FROM FiscalPeriods
        WHERE FiscalMonth < @Periods
    )
    SELECT
        FiscalMonth,
        PeriodStart,
        PeriodEnd,
        DATEDIFF(day, PeriodStart, PeriodEnd) + 1 AS DaysInPeriod,
        CASE
            WHEN FiscalMonth <= 3 THEN 'Q1'
            WHEN FiscalMonth <= 6 THEN 'Q2'
            WHEN FiscalMonth <= 9 THEN 'Q3'
            ELSE 'Q4'
        END AS FiscalQuarter
    FROM FiscalPeriods
    ORDER BY FiscalMonth;
  9. Generate number series for reporting: Create sequence 1 to N:
    WITH NumberSeries AS (
        SELECT 1 AS Number
    
        UNION ALL
    
        SELECT Number + 1
        FROM NumberSeries
        WHERE Number < 100
    )
    SELECT
        Number,
        Number * Number AS Squared,
        Number * Number * Number AS Cubed
    FROM NumberSeries
    ORDER BY Number
    OPTION (MAXRECURSION 100);

💡 Tips:

  • Date series pattern: Anchor with start date, recursive adds DATEADD, terminate with WHERE condition checking end date
  • MAXRECURSION planning: For date series, set to max possible days (366 for year, 31 for month) to avoid errors
  • Performance note: For large date ranges, consider creating a permanent Calendar table instead of recursive CTE
  • Gap analysis: LEFT JOIN date series to actual data to find missing dates - crucial for time-series analysis
  • Business calendar: Use recursive CTEs to generate custom calendars (fiscal years, 4-4-5 retail calendar, etc.)
  • Alternative in SQL Server 2022+: Use GENERATE_SERIES function for simpler number/date generation when available

✅ Expected Result: A complete date series for the specified range with associated metadata (day name, week number, quarter). When joined with sales data, clearly shows which dates have data and which don't. For fiscal calendars, shows custom period numbering aligned to business needs rather than calendar months.

🔄 Connection to Day 12: Day 12 used DATEDIFF and date functions for calculations. Recursive date series extends this by creating complete date ranges for analysis, ensuring every date is represented even if there's no data - essential for accurate trend analysis and gap identification!

Exercise 5: Solve Complex Business Problems with Combined CTEs

Goal: Master combining multiple CTEs (both recursive and non-recursive) to solve real-world business scenarios that require hierarchical and analytical logic.

Scenario: Generate a comprehensive sales report showing daily sales for each product category, including running totals, percentage of monthly target, and identifying which products haven't been sold in the past 30 days.

Step-by-Step Instructions:

  1. Create a date dimension CTE: Generate last 30 days:
    DECLARE @EndDate DATE = CAST(GETDATE() AS DATE);
    DECLARE @StartDate DATE = DATEADD(day, -29, @EndDate);
    
    WITH DateDimension AS (
        SELECT @StartDate AS ReportDate
    
        UNION ALL
    
        SELECT DATEADD(day, 1, ReportDate)
        FROM DateDimension
        WHERE ReportDate < @EndDate
    )
    SELECT * FROM DateDimension
    ORDER BY ReportDate
    OPTION (MAXRECURSION 30);
  2. Add product category hierarchy CTE: Get all categories and subcategories:
    DECLARE @EndDate DATE = CAST(GETDATE() AS DATE);
    DECLARE @StartDate DATE = DATEADD(day, -29, @EndDate);
    
    WITH DateDimension AS (
        SELECT @StartDate AS ReportDate
        UNION ALL
        SELECT DATEADD(day, 1, ReportDate)
        FROM DateDimension
        WHERE ReportDate < @EndDate
    ),
    CategoryHierarchy AS (
        -- Top-level categories
        SELECT
            CategoryID,
            CategoryName,
            ParentCategoryID,
            0 AS Level,
            CAST(CategoryName AS VARCHAR(255)) AS CategoryPath
        FROM Categories
        WHERE ParentCategoryID IS NULL
    
        UNION ALL
    
        -- Subcategories
        SELECT
            c.CategoryID,
            c.CategoryName,
            c.ParentCategoryID,
            ch.Level + 1,
            CAST(ch.CategoryPath + ' > ' + c.CategoryName AS VARCHAR(255))
        FROM Categories c
        INNER JOIN CategoryHierarchy ch ON c.ParentCategoryID = ch.CategoryID
    )
    SELECT * FROM CategoryHierarchy
    ORDER BY CategoryPath;
  3. Add daily sales aggregation CTE:
    -- Previous CTEs...
    ,
    DailySales AS (
        SELECT
            CAST(o.OrderDate AS DATE) AS SaleDate,
            p.CategoryID,
            SUM(o.Quantity) AS UnitsSold,
            SUM(o.Amount) AS Revenue
        FROM Orders o
        JOIN Products p ON o.ProductID = p.ProductID
        WHERE o.OrderDate >= @StartDate AND o.OrderDate <= @EndDate
        GROUP BY CAST(o.OrderDate AS DATE), p.CategoryID
    )
    SELECT * FROM DailySales;
  4. Create complete date-category matrix: Cross join to ensure all combinations:
    -- Previous CTEs...
    ,
    DateCategoryMatrix AS (
        SELECT
            dd.ReportDate,
            ch.CategoryID,
            ch.CategoryName,
            ch.CategoryPath,
            ch.Level
        FROM DateDimension dd
        CROSS JOIN CategoryHierarchy ch
        WHERE ch.Level = 0  -- Only top-level categories for this report
    )
    SELECT * FROM DateCategoryMatrix;
  5. Add running totals and calculations:
    -- Previous CTEs...
    ,
    EnrichedSales AS (
        SELECT
            dcm.ReportDate,
            dcm.CategoryID,
            dcm.CategoryName,
            COALESCE(ds.UnitsSold, 0) AS UnitsSold,
            COALESCE(ds.Revenue, 0) AS Revenue,
            SUM(COALESCE(ds.Revenue, 0)) OVER (
                PARTITION BY dcm.CategoryID, YEAR(dcm.ReportDate), MONTH(dcm.ReportDate)
                ORDER BY dcm.ReportDate
                ROWS UNBOUNDED PRECEDING
            ) AS MonthToDateRevenue,
            AVG(COALESCE(ds.Revenue, 0)) OVER (
                PARTITION BY dcm.CategoryID
                ORDER BY dcm.ReportDate
                ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
            ) AS SevenDayAvgRevenue
        FROM DateCategoryMatrix dcm
        LEFT JOIN DailySales ds
            ON dcm.ReportDate = ds.SaleDate
            AND dcm.CategoryID = ds.CategoryID
    )
    SELECT * FROM EnrichedSales;
  6. Add targets and performance metrics:
    -- Previous CTEs...
    ,
    CategoryTargets AS (
        SELECT
            CategoryID,
            MonthlyTarget
        FROM CategoryGoals
        WHERE YEAR(TargetMonth) = YEAR(@EndDate)
            AND MONTH(TargetMonth) = MONTH(@EndDate)
    ),
    FinalReport AS (
        SELECT
            es.ReportDate,
            DATENAME(weekday, es.ReportDate) AS DayName,
            es.CategoryName,
            es.UnitsSold,
            es.Revenue,
            es.MonthToDateRevenue,
            ct.MonthlyTarget,
            CASE
                WHEN ct.MonthlyTarget > 0
                THEN ROUND((es.MonthToDateRevenue / ct.MonthlyTarget) * 100, 2)
                ELSE NULL
            END AS PercentOfTarget,
            es.SevenDayAvgRevenue,
            CASE
                WHEN es.Revenue > es.SevenDayAvgRevenue THEN 'Above Average'
                WHEN es.Revenue < es.SevenDayAvgRevenue THEN 'Below Average'
                ELSE 'On Average'
            END AS DailyPerformance,
            CASE
                WHEN es.Revenue = 0 THEN 'No Sales'
                ELSE 'Active'
            END AS SalesStatus
        FROM EnrichedSales es
        LEFT JOIN CategoryTargets ct ON es.CategoryID = ct.CategoryID
    )
    SELECT
        ReportDate,
        DayName,
        CategoryName,
        UnitsSold,
        Revenue,
        MonthToDateRevenue,
        MonthlyTarget,
        PercentOfTarget,
        SevenDayAvgRevenue,
        DailyPerformance,
        SalesStatus
    FROM FinalReport
    ORDER BY CategoryName, ReportDate
    OPTION (MAXRECURSION 30);
  7. Add summary analytics: Create a final summary section:
    -- All previous CTEs remain the same...
    
    -- Main detailed report
    SELECT * FROM FinalReport
    ORDER BY CategoryName, ReportDate;
    
    -- Summary by category
    WITH CategorySummary AS (
        SELECT
            CategoryName,
            SUM(Revenue) AS TotalRevenue,
            AVG(Revenue) AS AvgDailyRevenue,
            MAX(Revenue) AS BestDay,
            MIN(Revenue) AS WorstDay,
            SUM(CASE WHEN Revenue = 0 THEN 1 ELSE 0 END) AS DaysWithNoSales,
            MAX(PercentOfTarget) AS CurrentTargetProgress
        FROM FinalReport
        GROUP BY CategoryName
    )
    SELECT
        CategoryName,
        TotalRevenue,
        AvgDailyRevenue,
        BestDay,
        WorstDay,
        DaysWithNoSales,
        CurrentTargetProgress,
        CASE
            WHEN CurrentTargetProgress >= 100 THEN 'Target Met'
            WHEN CurrentTargetProgress >= 80 THEN 'On Track'
            WHEN CurrentTargetProgress >= 50 THEN 'Behind'
            ELSE 'Significantly Behind'
        END AS TargetStatus
    FROM CategorySummary
    ORDER BY TotalRevenue DESC
    OPTION (MAXRECURSION 30);

💡 Tips:

  • CTE organization: Order CTEs logically - dimensions first (dates, hierarchies), then facts (sales), then calculations, finally output formatting
  • Mixing CTE types: You can combine recursive CTEs (for hierarchies, date series) with regular CTEs (for aggregations, joins) in one query
  • Testing strategy: Build incrementally - test each CTE by selecting from it before adding the next layer
  • Documentation: Add comments before each CTE explaining its purpose - crucial for maintaining complex queries
  • Performance monitoring: Check execution plans for complex CTE queries - consider breaking into temp tables if performance is poor
  • Reusability: Complex CTE-based reports can be wrapped in views or stored procedures for repeated use

✅ Expected Result: A comprehensive daily sales report showing 30 days of data for each category with: daily sales figures, running month-to-date totals, 7-day moving averages, percentage of monthly target achieved, performance indicators (above/below average), and sales status. Summary section shows overall category performance with target achievement status. This provides complete visibility into sales trends and target progress.

🔄 Connection to Day 15: Day 15's Power Query workflows transformed data through multiple steps. This exercise does the same in SQL using multiple CTEs - each CTE is a transformation step (like Power Query steps), building from raw data to final insights. Both approaches create reusable, documented data pipelines!

Independent Practice (60 min):

💡 Hints for Independent Practice:

  • Start simple: Begin with basic variations before attempting complex modifications
  • Break it down: Divide complex problems into smaller, manageable steps
  • Test incrementally: Verify each step works before moving to the next
  • Use documentation: Refer to official docs or tutorials when needed
  • Experiment: Try different approaches to find what works best

Apply the concepts learned to different datasets and scenarios.

Daily Assignment (60 min):

Assignment:

Expected Outcomes:


📝 Day 16 Quick Quiz - Test Your Knowledge!

Q1. What does CTE stand for in SQL?

Q2. What's the main advantage of using CTEs?

Q3. How do you define a CTE?

Q4. What is a recursive CTE used for?

Q5. Can you use multiple CTEs in one query?

Have feedback or questions?