DAY 12: SQL - Date Functions & Window Functions Part 1 (ROW_NUMBER, RANK)
Learning Objectives: - Master SQL date and time functions - Understand window functions and OVER clause - Use ROW_NUMBER for ranking and pagination - Apply RANK and DENSE_RANK for competitive ranking
Topics Covered (60 min):
1. SQL Date Functions
- GETDATE() / CURRENT_DATE - Current date/time
- DATEADD(interval, number, date) - Add/subtract dates
- DATEDIFF(interval, start_date, end_date) - Date difference
- DATEPART(part, date) - Extract year, month, day, etc.
- EOMONTH(date, months) - End of month calculation
2. Date Formatting & Extraction
- YEAR(date), MONTH(date), DAY(date)
- DATENAME(part, date) - Get month/day names
- FORMAT(date, 'format') - Custom date formatting
- Example: WHERE YEAR(OrderDate) = 2023 AND MONTH(OrderDate) = 12
3. Window Functions - ROW_NUMBER
- Syntax: ROW_NUMBER() OVER (PARTITION BY col ORDER BY col)
- Assigns unique sequential numbers within partitions
- Use for: pagination, top N per group, duplicate detection
- Example: Number orders per customer
4. Window Functions - RANK & DENSE_RANK
- RANK() - Leaves gaps after ties (1, 2, 2, 4)
- DENSE_RANK() - No gaps (1, 2, 2, 3)
- Use PARTITION BY to rank within groups
- Example: Rank salespeople by revenue per region
Video Resources:
SQL Window Functions | Clearly Explained (7:16)
https://www.youtube.com/watch?v=rIcB4zMYMasSQL Window Functions Basics (Visually Explained) (18:47)
https://www.youtube.com/watch?v=o666k19mZwESQL Window Function | RANK, ROW_NUMBER, LAG, LEAD (24:54)
https://www.youtube.com/watch?v=Ww71knvhQ-s
AI Learning Prompts (Concept Understanding Only):
Understanding SQL Date Functions:
I'm learning SQL date functions. Please explain:
1. What is the difference between DATEADD and DATEDIFF?
2. How do I calculate "days since last order" for each customer?
3. Show me how to extract year, quarter, month from dates
4. How do I filter for "last 30 days" or "current month"?
5. Provide examples for calculating customer age from birthdate
Use SQL Server syntax with clear examples.
Understanding Window Functions:
Help me understand SQL window functions:
1. What does the OVER clause do?
2. Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()
3. What is PARTITION BY and how does it work?
4. Show me how to find "top 3 products per category by sales"
5. How do I use ROW_NUMBER for pagination?
6. Provide examples using sales and customer data
Explain with simple examples suitable for beginners.
Guided Practice (60 min):
📝 How to approach Guided Practice:
- Follow step-by-step: Work through each exercise in order
- Use provided resources: Reference the video tutorials and AI prompts if you get stuck
- Verify your work: Check your results match expected outcomes
- Take notes: Document key learnings and shortcuts you discover
- Ask for help: Use AI assistants to clarify concepts you don't understand
📊 Dataset: Superstore Sales with columns: OrderID, OrderDate, CustomerID, CustomerName, Region, Category, ProductName, Sales
Exercise 1: Calculate Days Since Last Order (DATEDIFF and MAX)
Goal: Learn to use DATEDIFF function with aggregates to calculate customer recency.
Step-by-Step Instructions:
- Open your SQL editor and connect to your database
- Understand the requirement: For each customer, find their most recent order date and calculate how many days ago that was
- Start with finding the last order date per customer:
Execute this to verify it shows the most recent order for each customerSELECT CustomerID, MAX(OrderDate) AS LastOrderDate FROM Orders GROUP BY CustomerID; - Now add the days calculation using DATEDIFF:
SELECT CustomerID, MAX(OrderDate) AS LastOrderDate, DATEDIFF(day, MAX(OrderDate), GETDATE()) AS DaysSinceLastOrder FROM Orders GROUP BY CustomerID ORDER BY DaysSinceLastOrder DESC; - Analyze the results: Sort by DaysSinceLastOrder DESC to see customers who haven't ordered in the longest time
- Add customer names for readability:
SELECT CustomerID, MAX(CustomerName) AS CustomerName, MAX(OrderDate) AS LastOrderDate, DATEDIFF(day, MAX(OrderDate), GETDATE()) AS DaysSinceLastOrder FROM Orders GROUP BY CustomerID ORDER BY DaysSinceLastOrder DESC LIMIT 20;
💡 Tips:
- DATEDIFF syntax:
DATEDIFF(unit, start_date, end_date)- unit can be day, month, year, hour, etc. - GETDATE() vs CURRENT_DATE: Use GETDATE() for SQL Server, CURRENT_DATE for PostgreSQL, NOW() for MySQL
- MAX with GROUP BY: MAX(OrderDate) finds the most recent date for each CustomerID group
- Why MAX(CustomerName)?: Since CustomerName is same for all orders of a customer, MAX just picks that value
- Database differences: PostgreSQL uses
CURRENT_DATE - MAX(OrderDate), MySQL usesDATEDIFF(CURDATE(), MAX(OrderDate)) - Business use case: This query identifies "at-risk" customers who haven't ordered recently
✅ Expected Result: A list of customers with their last order date and the number of days since that order (e.g., CustomerID: 101, LastOrder: 2024-10-15, DaysSince: 45)
🔄 Connection to Excel: Similar to Excel's TODAY()-MAX(dates) calculation from Day 11, but applied across groups!
Exercise 2: Extract Year, Quarter, Month from Dates (Date Part Functions)
Goal: Learn to extract date components for time-series analysis and reporting.
Step-by-Step Instructions:
- Start with extracting the year from OrderDate:
This shows how YEAR() extracts just the year partSELECT OrderID, OrderDate, YEAR(OrderDate) AS OrderYear FROM Orders LIMIT 10; - Extract multiple date parts:
SELECT OrderID, OrderDate, YEAR(OrderDate) AS OrderYear, DATEPART(quarter, OrderDate) AS OrderQuarter, MONTH(OrderDate) AS OrderMonth, DAY(OrderDate) AS OrderDay FROM Orders LIMIT 10; - Now aggregate sales by year and quarter:
SELECT YEAR(OrderDate) AS OrderYear, DATEPART(quarter, OrderDate) AS OrderQuarter, SUM(Sales) AS TotalSales, COUNT(*) AS OrderCount FROM Orders GROUP BY YEAR(OrderDate), DATEPART(quarter, OrderDate) ORDER BY OrderYear, OrderQuarter; - Add formatted quarter labels (Q1, Q2, Q3, Q4):
This creates labels like "Q1 2023", "Q2 2023", etc.SELECT YEAR(OrderDate) AS OrderYear, DATEPART(quarter, OrderDate) AS OrderQuarter, CONCAT('Q', DATEPART(quarter, OrderDate), ' ', YEAR(OrderDate)) AS QuarterLabel, SUM(Sales) AS TotalSales, COUNT(*) AS OrderCount FROM Orders GROUP BY YEAR(OrderDate), DATEPART(quarter, OrderDate) ORDER BY OrderYear, OrderQuarter; - Create a monthly sales summary:
SELECT YEAR(OrderDate) AS OrderYear, MONTH(OrderDate) AS OrderMonth, DATENAME(month, OrderDate) AS MonthName, SUM(Sales) AS TotalSales FROM Orders GROUP BY YEAR(OrderDate), MONTH(OrderDate), DATENAME(month, OrderDate) ORDER BY OrderYear, OrderMonth;
💡 Tips:
- YEAR vs DATEPART:
YEAR(date)is shorthand forDATEPART(year, date)- both work the same - Quarter calculation: Q1 = Jan-Mar (1-3), Q2 = Apr-Jun (4-6), Q3 = Jul-Sep (7-9), Q4 = Oct-Dec (10-12)
- DATENAME vs DATEPART: DATENAME returns text ("January"), DATEPART returns numbers (1)
- GROUP BY requirement: In GROUP BY, you must repeat the date function - can't just use the alias
- Database differences: MySQL uses EXTRACT(YEAR FROM date), PostgreSQL supports both syntaxes
- Fiscal quarters: If your fiscal year starts in July, add CASE logic to adjust quarters
✅ Expected Result: Sales totals grouped by year and quarter (e.g., 2023 Q1: $45,230, 2023 Q2: $52,180, etc.)
Exercise 3: Filter for Recent Orders (Last 30/60/90 Days)
Goal: Learn to filter data using date ranges with DATEADD or INTERVAL.
Step-by-Step Instructions:
- Find orders from the last 30 days:
-- SQL Server syntax SELECT OrderID, OrderDate, CustomerName, Sales FROM Orders WHERE OrderDate >= DATEADD(day, -30, GETDATE()) ORDER BY OrderDate DESC; - Alternative PostgreSQL/MySQL syntax:
-- PostgreSQL SELECT OrderID, OrderDate, CustomerName, Sales FROM Orders WHERE OrderDate >= CURRENT_DATE - INTERVAL '30 days' ORDER BY OrderDate DESC; -- MySQL SELECT OrderID, OrderDate, CustomerName, Sales FROM Orders WHERE OrderDate >= DATE_SUB(CURDATE(), INTERVAL 30 DAY) ORDER BY OrderDate DESC; - Compare different time periods (30, 60, 90 days):
This shows sales totals for each time periodSELECT SUM(CASE WHEN OrderDate >= DATEADD(day, -30, GETDATE()) THEN Sales ELSE 0 END) AS Last30Days, SUM(CASE WHEN OrderDate >= DATEADD(day, -60, GETDATE()) THEN Sales ELSE 0 END) AS Last60Days, SUM(CASE WHEN OrderDate >= DATEADD(day, -90, GETDATE()) THEN Sales ELSE 0 END) AS Last90Days FROM Orders; - Find orders in the current month:
SELECT OrderID, OrderDate, CustomerName, Sales FROM Orders WHERE YEAR(OrderDate) = YEAR(GETDATE()) AND MONTH(OrderDate) = MONTH(GETDATE()) ORDER BY OrderDate DESC; - Filter for a specific quarter:
SELECT OrderID, OrderDate, Sales FROM Orders WHERE YEAR(OrderDate) = 2024 AND DATEPART(quarter, OrderDate) = 1 -- Q1 only ORDER BY OrderDate;
💡 Tips:
- DATEADD parameters:
DATEADD(unit, number, date)- use negative numbers to go back in time - Performance tip: Date filtering on indexed columns is fast - avoid wrapping column in functions like YEAR(OrderDate) = 2024
- Better performance: Use
WHERE OrderDate >= '2024-01-01' AND OrderDate < '2025-01-01'instead ofWHERE YEAR(OrderDate) = 2024 - Current month start: Use
DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1)for the first day of current month - Inclusive ranges: Be careful with date ranges - decide if you want >= or >, <= or <
✅ Expected Result: Orders filtered to specific date ranges, showing only recent activity
Exercise 4: Ranking Products with ROW_NUMBER (Introduction to Window Functions)
Goal: Learn your first window function - ROW_NUMBER() to rank products within categories.
Step-by-Step Instructions:
- First, calculate total sales per product (without ranking):
This shows products grouped by category, sorted by salesSELECT Category, ProductName, SUM(Sales) AS TotalSales FROM Orders GROUP BY Category, ProductName ORDER BY Category, TotalSales DESC; - Now add ROW_NUMBER to rank products within each category:
SELECT Category, ProductName, SUM(Sales) AS TotalSales, ROW_NUMBER() OVER (PARTITION BY Category ORDER BY SUM(Sales) DESC) AS ProductRank FROM Orders GROUP BY Category, ProductName ORDER BY Category, ProductRank; - Understand the OVER clause:
PARTITION BY Category- Creates separate ranking "groups" for each categoryORDER BY SUM(Sales) DESC- Within each category, rank by sales (highest first)- Result: #1, #2, #3... restarts for each category
- Filter to show only top 3 products per category:
This uses a CTE (Common Table Expression) from Day 16 previewWITH RankedProducts AS ( SELECT Category, ProductName, SUM(Sales) AS TotalSales, ROW_NUMBER() OVER (PARTITION BY Category ORDER BY SUM(Sales) DESC) AS ProductRank FROM Orders GROUP BY Category, ProductName ) SELECT Category, ProductName, TotalSales, ProductRank FROM RankedProducts WHERE ProductRank <= 3 ORDER BY Category, ProductRank; - Alternative without CTE (using subquery):
SELECT * FROM ( SELECT Category, ProductName, SUM(Sales) AS TotalSales, ROW_NUMBER() OVER (PARTITION BY Category ORDER BY SUM(Sales) DESC) AS ProductRank FROM Orders GROUP BY Category, ProductName ) AS RankedProducts WHERE ProductRank <= 3 ORDER BY Category, ProductRank;
💡 Tips:
- Window functions don't reduce rows: Unlike GROUP BY alone, window functions add calculations without collapsing rows
- PARTITION BY is optional: Without it, ranking applies to entire result set (no groups)
- ORDER BY is required: Window functions need to know how to order data for ranking/numbering
- Can't use WHERE with window functions: That's why we need CTE or subquery to filter by rank
- ROW_NUMBER vs RANK: ROW_NUMBER always gives unique ranks (1, 2, 3, 4...), RANK allows ties (1, 2, 2, 4...)
- Performance: Window functions are powerful but can be slower on very large datasets
✅ Expected Result: Products ranked 1-3 within each category (e.g., Electronics: #1 Laptop, #2 Tablet, #3 Phone; Furniture: #1 Chair, #2 Desk, #3 Table)
🔄 Connection to Excel: This is like adding a rank column in Excel Pivot Tables from Day 7!
Exercise 5: Top 3 Customers Per Region (Complex Window Function with CTE)
Goal: Apply window functions to find top performers in each group using CTE pattern.
Step-by-Step Instructions:
- First, calculate total sales per customer by region:
SELECT Region, CustomerName, SUM(Sales) AS TotalSales FROM Orders GROUP BY Region, CustomerName ORDER BY Region, TotalSales DESC; - Add ROW_NUMBER to rank customers within each region:
Now each region has customers ranked 1, 2, 3, 4...SELECT Region, CustomerName, SUM(Sales) AS TotalSales, ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SUM(Sales) DESC) AS CustomerRank FROM Orders GROUP BY Region, CustomerName ORDER BY Region, CustomerRank; - Use CTE to filter for top 3 per region:
WITH RankedCustomers AS ( SELECT Region, CustomerName, SUM(Sales) AS TotalSales, ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SUM(Sales) DESC) AS Rank FROM Orders GROUP BY Region, CustomerName ) SELECT Region, CustomerName, TotalSales, Rank FROM RankedCustomers WHERE Rank <= 3 ORDER BY Region, Rank; - Enhance with additional metrics:
This adds more context about why customers are top performersWITH RankedCustomers AS ( SELECT Region, CustomerName, SUM(Sales) AS TotalSales, COUNT(*) AS OrderCount, AVG(Sales) AS AvgOrderValue, ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SUM(Sales) DESC) AS Rank FROM Orders GROUP BY Region, CustomerName ) SELECT Region, CustomerName, TotalSales, OrderCount, AvgOrderValue, Rank FROM RankedCustomers WHERE Rank <= 3 ORDER BY Region, Rank; - Try with RANK() instead of ROW_NUMBER():
RANK() will show ties (e.g., if two customers have same sales, both get rank 2, next is rank 4)WITH RankedCustomers AS ( SELECT Region, CustomerName, SUM(Sales) AS TotalSales, RANK() OVER (PARTITION BY Region ORDER BY SUM(Sales) DESC) AS Rank FROM Orders GROUP BY Region, CustomerName ) SELECT * FROM RankedCustomers WHERE Rank <= 3 ORDER BY Region, Rank;
💡 Tips:
- CTE (Common Table Expression): WITH clause creates a temporary named result set - makes complex queries more readable
- When to use CTEs: When you need to filter or reference window function results
- ROW_NUMBER vs RANK vs DENSE_RANK: - ROW_NUMBER: 1, 2, 3, 4 (always unique) - RANK: 1, 2, 2, 4 (skips after tie) - DENSE_RANK: 1, 2, 2, 3 (no skip after tie)
- Multiple PARTITION BY columns: You can partition by multiple columns:
PARTITION BY Region, Category - Multiple ORDER BY columns: Break ties with:
ORDER BY SUM(Sales) DESC, CustomerName ASC - Real-world use case: Regional sales reports, MVP identification, performance dashboards
✅ Expected Result: Top 3 customers in each region with their ranks (e.g., East: 1. John ($50K), 2. Sarah ($48K), 3. Mike ($45K); West: 1. Lisa ($55K), 2. Tom ($52K), 3. Amy ($49K))
🔄 Connection to Previous Learning: This combines GROUP BY from Day 8, subqueries from Day 10, and introduces window functions - powerful SQL!
🎮 Interactive SQL Playground 1: Days Since Last Order
Calculate days since last order using date functions. Try changing CURRENT_DATE to see different results!
🎮 Interactive SQL Playground 2: Monthly Sales Trends
Extract month/year from dates and aggregate sales. Try different date parts like YEAR or DAY!
🎮 Interactive SQL Playground 3: ROW_NUMBER for Ranking
Rank orders by sales using ROW_NUMBER. Try changing the ORDER BY to rank differently!
🎮 Interactive SQL Playground 4: RANK with Ties
Compare RANK vs ROW_NUMBER behavior with ties. Notice how RANK handles duplicate values!
🎮 Interactive SQL Playground 5: Top 3 Per Region with CTE
Find top performers per region using window functions and CTEs. Try changing the rank limit or adding more columns!
Independent Practice (60 min):
💡 Hints for Independent Practice:
- Start simple: Begin with basic variations before attempting complex modifications
- Break it down: Divide complex problems into smaller, manageable steps
- Test incrementally: Verify each step works before moving to the next
- Use documentation: Refer to official docs or tutorials when needed
- Experiment: Try different approaches to find what works best
Challenge 1: Customer Recency Analysis - Find customers who haven't ordered in 90+ days - Calculate average days between orders per customer - Identify "at risk" customers (no order in 6 months) - Compare current month sales vs same month last year
Challenge 2: Product Performance Ranking - Rank products by sales within each category - Find the #1 product in each subcategory - Identify products in bottom 10% by sales - Number orders sequentially for each customer
Daily Assignment (60 min):
Assignment: Sales Performance Dashboard Query
Create a comprehensive SQL report with:
- Date-Based Metrics:
- Sales by year, quarter, month
- Year-over-year growth calculation
- Orders placed in last 30/60/90 days
- Ranking Analysis:
- Top 5 products by sales (overall)
- Top 3 customers per region (use ROW_NUMBER)
- Rank categories by profitability
- Customer Segmentation:
- Customers by recency (0-30, 31-60, 61-90, 91+ days)
- Use DATEDIFF to calculate days since last order
- Count customers in each recency bucket
Requirements: - Use at least 3 different date functions - Use ROW_NUMBER with PARTITION BY - Use RANK or DENSE_RANK for one analysis - Include comments explaining each query - Format dates in output (MM/DD/YYYY)
Expected Outcomes:
- Master SQL date manipulation and calculations
- Understand window functions and OVER clause
- Use ROW_NUMBER for pagination and top N queries
- Apply RANK/DENSE_RANK for competitive ranking
- Combine date functions with window functions
📝 Day 12 Quick Quiz - Test Your Knowledge!
Q1. What does ROW_NUMBER() do in SQL?
Q2. What's the difference between ROW_NUMBER() and RANK()?
Q3. What does PARTITION BY do in window functions?
Q4. Which SQL function extracts the year from a date?
Q5. What does DENSE_RANK() do differently than RANK()?