Banner

 

Additional SQL queries

Single entity

  1. Prepare a list of offices sorted by country, state, city.
  2. SELECT country, state, city
    FROM Offices
    ORDER BY country, state, city;

  3. How many employees are there in the company?
  4. SELECT COUNT(employeeNumber)
    FROM Employees;

  5. What is the total of payments received?
  6. SELECT SUM(amount)
    FROM Payments;

  7. List the product lines that contain 'Cars'.
  8. SELECT productLine
    FROM ClassicModels.ProductLines
    WHERE productLine REGEXP 'Cars';

  9. Report total payments for October 28, 2004.
  10. SELECT SUM(amount)
    FROM Payments
    WHERE paymentDate = '2004-10-28';

  11. Report those payments greater than $100,000.
  12. SELECT amount
    FROM Payments
    WHERE amount > 100000;

  13. List the products in each product line.
  14. SELECT productLine, productName
    FROM Products
    ORDER BY productLine, productName;

  15. How many products in each product line?
  16. SELECT productLine, COUNT(productCode)
    FROM Products
    GROUP BY productLine;

  17. What is the minimum payment received?
  18. SELECT MIN(amount)
    FROM Payments;

  19. List all payments greater than twice the average payment.
  20. SELECT amount
    FROM Payments
    WHERE amount > 2*(SELECT AVG(amount) FROM Payments);

  21. What is the average percentage markup of the MSRP on buyPrice?
  22. SELECT AVG(MSRP/buyPrice)*100 - 100 from Products;

  23. How many distinct products does ClassicModels sell?
  24. SELECT COUNT(DISTINCT(productCode)) FROM Products;

  25. Report the name and city of customers who don't have sales representatives?
  26. SELECT city, customerName FROM Customers WHERE salesRepEmployeeNumber IS NULL;

  27. What are the names of executives with VP or Manager in their title? Use the CONCAT function to combine the employee's first name and last name into a single field for reporting.
  28. SELECT CONCAT(firstName, " ", lastName) FROM Employees
    WHERE jobTitle REGEXP 'VP|Manager'

  29. Which orders have a value greater than $5,000?
  30. SELECT orderNumber AS `Order number`, FORMAT(SUM(quantityOrdered*priceEach),2) AS `Order value`
    FROM OrderDetails
    GROUP BY orderNumber HAVING SUM(quantityOrdered*priceEach) > 5000;

One to many relationship

  1. Report the account representative for each customer.
  2. SELECT customerName, firstName, lastName
    FROM Customers JOIN Employees
    ON Employees.employeeNumber = Customers.salesRepEmployeeNumber;

  3. Report total payments for Atelier graphique.
  4. SELECT SUM(amount)
    FROM Payments JOIN Customers
    ON Customers.customerNumber = Payments.customerNumber
    WHERE customerName = 'Atelier graphique';

  5. Report the total payments by date
  6. SELECT paymentDate, SUM(amount)
    FROM Payments
    GROUP BY paymentDate;

  7. Report the products that have not been sold.
  8. SELECT productName, Products.productCode
    FROM Products
    WHERE NOT EXISTS (SELECT * FROM OrderDetails WHERE OrderDetails.productCode = Products.productCode);

  9. List the amount paid by each customer.
  10. SELECT customerName, SUM(amount)
    FROM Payments JOIN Customers
    ON Payments.customerNumber = Customers.customerNumber
    GROUP BY customerName;

  11. How many orders have been placed by Herkku Gifts?
  12. SELECT COUNT(orderNumber)
    FROM Orders JOIN Customers
    ON Orders.customerNumber = Customers.customerNumber
    AND customerName = 'Herkku Gifts';

  13. Who are the employees in Boston?
  14. SELECT firstName, lastName
    FROM Employees JOIN Offices
    ON Employees.officeCode = Offices.officeCode
    AND city = 'Boston';

  15. Report those payments greater than $100,000. Sort the report so the customer who made the highest payment appears first.
  16. SELECT customerNumber, amount
    FROM Payments
    WHERE amount > 100000
    ORDER BY amount DESC;

  17. List the value of 'On Hold' orders.
  18. SELECT SUM(quantityOrdered * priceEach) AS Value
    FROM OrderDetails JOIN Orders
    ON Orders.orderNumber = OrderDetails.orderNumber
    WHERE status = 'On Hold';

  19. Report the number of orders 'On Hold' for each customer.
  20. SELECT customerName, COUNT(orderNumber)
    FROM Orders JOIN Customers
    ON Orders.customerNumber = Customers.customerNumber
    WHERE status = 'On Hold'
    GROUP BY customerName;

Many to many relationship

  1. List names of products sold by order date.
  2. SELECT orderDate, productName
    FROM OrderDetails JOIN Orders
    ON OrderDetails.orderNumber = Orders.orderNumber
    JOIN Products on Products.productCode = OrderDetails.productCode
    ORDER BY orderDate ASC;

  3. List all the order dates in descending order for orders for the 1940 Ford Pickup Truck.
  4. SELECT orderDate
    FROM OrderDetails JOIN Products
    ON OrderDetails.productCode = Products.productCode
    JOIN Orders ON Orders.orderNumber = OrderDetails.orderNumber
    WHERE productName = '1940 Ford Pickup Truck'
    ORDER BY orderDate DESC;

  5. List the names of customers and their corresponding order number where a particular order from that customer has a value greater than $25,000?
  6. SELECT customerName, Orders.orderNumber, FORMAT(SUM(quantityOrdered * priceEach),2) AS Total
    FROM Customers JOIN Orders
    ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails.orderNumber
    GROUP BY orderNumber HAVING SUM(quantityOrdered * priceEach) > 25000

  7. Are there any products that appear on all orders?
  8. SELECT * FROM Products
    WHERE NOT EXISTS
    (SELECT * FROM Orders
    WHERE NOT EXISTS
    (SELECT * FROM OrderDetails
    WHERE Products.productCode = OrderDetails.productCode
    AND Orders.orderNumber = OrderDetails.orderNumber));

  9. List the names of products sold at less than 80% of the MSRP.
  10. SELECT DISTINCT(productName), priceEach, MSRP
    FROM OrderDetails JOIN Products
    ON OrderDetails.productCode = Products.productCode
    AND priceEach < .8*MSRP;

  11. Reports those products that have been sold with a markup of 100% or more (i.e.,  the priceEach is at least twice the buyPrice)

    SELECT DISTINCT(productName)
    FROM Products JOIN OrderDetails
    ON Products.productCode = OrderDetails.productCode
    WHERE priceEach >= 2 * buyPrice
    ORDER BY productName;

  12. List the products ordered on a Monday.
  13. SELECT DISTINCT(productName)
    FROM Orders JOIN OrderDetails ON Orders.orderNumber = OrderDetails.orderNumber
    JOIN Products ON OrderDetails.productCode = Products.productCode
    AND DAYNAME(orderDate) = "Monday";

  14. What is the quantity on hand for products listed on 'On Hold' orders?
  15. SELECT productName, FORMAT(quantityInStock,0) AS `Quantity in stock`
    FROM OrderDetails JOIN Orders ON Orders.orderNumber = OrderDetails.orderNumber
    JOIN Products on OrderDetails.productCode = Products.productCode
    WHERE status = 'On Hold';

Regular expressions

  1. Find products containing the name 'Ford'.
  2. SELECT * FROM Products WHERE LOWER(productName) REGEXP 'Ford';

  3. List products ending in 'ship'.
  4. SELECT * FROM Products WHERE LOWER(productName0 REGEXP 'ship$';

  5. Report the number of customers in Denmark, Norway, and Sweden.
  6. SELECT country, COUNT(*) FROM Customers
    WHERE country REGEXP 'Denmark|Norway|Sweden'
    GROUP BY country;'

  7. What are the products with a product code in the range S700_1000 to S700_1499?
  8. SELECT * FROM Products WHERE productCode REGEXP 'S700_1[0-4][0-9]{2}';

  9. Which customers have a digit in their name?
  10. SELECT * FROM Customers WHERE customerName REGEXP '[0-9]';

  11. List the names of employees called Dianne or Diane.
  12. SELECT * FROM Employees WHERE firstName REGEXP 'Dian';

  13. List the products containing ship or boat in their product name.
  14. SELECT * FROM Products WHERE LOWER(productName) REGEXP 'ship|boat';

  15. List the products with a product code beginning with S700.
  16. SELECT * FROM Products WHERE productCode REGEXP '^S700' ;

  17. List the names of employees called Larry or Barry.
  18. SELECT * FROM Employees WHERE firstName REGEXP '[BL]array';

  19. List the names of employees with non-alphabetic characters in their names.
  20. SELECT * FROM Employees WHERE LOWER(firstName ) REGEXP '[^a-z]';

  21. List the vendors whose name ends in Diecast
  22. SELECT DISTINCT(productVendor) FROM Products WHERE LOWER(productVendor) REGEXP 'diecast$';

General queries

  1. Who is at the top of the organization (i.e.,  reports to no one).
  2. SELECT * FROM Employees WHERE reportsTo IS NULL;

  3. Who reports to William Patterson?
  4. SELECT reports.firstname, reports.lastname
    FROM Employees boss JOIN Employees reports
    ON boss.employeeNumber = reports.reportsTo
    WHERE boss.firstName = "William" and boss.lastName = "Patterson";

  5. List all the products purchased by Herkku Gifts.
  6. SELECT productName
    FROM OrderDetails JOIN Orders ON OrderDetails.orderNumber = Orders.orderNumber
    JOIN Customers ON Orders.customerNumber = Customers.customerNumber
    JOIN Products ON Products.productCode = OrderDetails.productCode
    WHERE customerName = 'Herkku Gifts';

  7. Compute the commission for each sales representative, assuming the commission is 5% of the value of an order. Sort by employee last name and first name.
  8. SELECT CONCAT(firstName, ' ',lastName) AS Name, FORMAT(.05 * SUM(quantityOrdered * priceEach),0) AS Commission
    FROM Employees JOIN Customers ON Employees.employeeNumber = Customers.salesRepEmployeeNumber
    JOIN Orders ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails.orderNumber
    GROUP BY employeeNumber
    ORDER BY lastName, firstName ASC;

  9. What is the difference in days between the most recent and oldest order date in the Orders file?
  10. SELECT DATEDIFF(MAX(orderDate), MIN(orderDate)) FROM Orders;

  11. Compute the average time between order date and ship date for each customer ordered by the largest difference.
  12. SELECT customerName, AVG(DATEDIFF(shippedDate,orderDate))AS diff FROM Customers JOIN Orders
    ON Customers.customerNumber = Orders.customerNumber
    GROUP BY customerName
    ORDER BY diff DESC;

  13. What is the value of orders shipped in August 2004? (Hint).
  14. SELECT FORMAT(SUM(quantityOrdered*priceEach),0) as orderValue
    FROM Orders JOIN OrderDetails
    ON Orders.orderNumber = OrderDetails. orderNumber
    AND YEAR(orderDate) = 2004
    AND MONTH(orderDate) = 8;

  15. Compute the total value ordered, total amount paid, and their difference for each customer for orders placed in 2004 and payments received in 2004. Only report those customers where the absolute value of the difference is greater than $100. (Hint: Create views for the total paid and total ordered).
  16. CREATE VIEW pay2004 (customerName, payTotal)
    AS (SELECT customerName,SUM(amount)
    FROM Payments JOIN Customers
    ON Payments.customerNumber = Customers.customerNumber
    AND YEAR(paymentDate) = 2004
    GROUP BY customerName);

    CREATE VIEW order2004 (customerName, orderTotal) AS
    (SELECT customerName, SUM(quantityOrdered*priceEach)
    FROM Customers JOIN Orders ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails. orderNumber
    WHERE YEAR(orderDate) = 2004
    GROUP BY customerName);

    SELECT pay2004.customerName AS Customer, FORMAT(payTotal,0) AS Payments, FORMAT(orderTotal,0) AS Orders, FORMAT(payTotal - orderTotal,0) as Difference
    FROM pay2004 JOIN order2004
    ON pay2004.customerName = order2004.customerName
    WHERE ABS(payTotal - orderTotal) > 100;

    Alternatively, you can write a single query using common table expression syntax which first creates two temporary tables, pay and ord, and then joins them

    WITH
    pay AS (SELECT customerName,SUM(amount) AS payTotal
    FROM Payments JOIN Customers
    ON Payments.customerNumber = Customers.customerNumber
    AND YEAR(paymentDate) = 2004
    GROUP BY customerName)
    ,
    ord AS (SELECT customerName, SUM(quantityOrdered*priceEach) AS orderTotal
    FROM Customers JOIN Orders
    ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails. orderNumber
    AND YEAR(orderDate) = 2004
    GROUP BY customerName)

    SELECT Pay.customerName AS Customer, FORMAT(payTotal,0)
    AS Payments, FORMAT(orderTotal,0) AS Orders, FORMAT(payTotal - orderTotal,0) AS Difference
    FROM pay JOIN ord
    ON Pay.customerName = Ord.customerName
    AND ABS(payTotal-orderTotal) > 100
    ;

  17. List the employees who report to those employees who report to Diane Murphy. Use the CONCAT function to combine the employee's first name and last name into a single field for reporting.
  18. SELECT CONCAT(firstName, ' ', lastName) from Employees
    WHERE reportsTo IN
    (SELECT employeeNumber FROM Employees
    WHERE reportsTo =
    (SELECT employeeNumber FROM Employees
    WHERE firstName = 'Diane' AND lastName = 'Murphy'));

  19. What is the percentage value of each product in inventory sorted by the highest percentage first.
  20. SELECT productName, format(quantityInStock*buyPrice,0) AS Stock, format(quantityInStock*buyPrice/(totalValue)*100,2)
    AS Percent FROM Products, (SELECT SUM(quantityInStock*buyPrice) AS totalValue FROM Products) AS T ORDER BY quantityInStock*buyPrice/(totalValue)*100 DESC;

  21. Write a function to convert miles per gallon to liters per 100 kilometers.
  22. CREATE FUNCTION mpg_to_lkm(mpg REAL)
    RETURNS REAL
    RETURN 235.214583/mpg;

  23. Write a procedure to increase the price of a specified product category by a given percentage. You will need to create a product table with appropriate data to test your procedure. Alternatively, load the ClassicModels database on your personal machine so you have complete access. You have to change the DELIMITER prior to creating the procedure.
  24. DELIMITER $$

    CREATE PROCEDURE increasePrice (
    IN percentage DECIMAL(5,2))
    LANGUAGE SQL
    DETERMINISTIC
    BEGIN
    UPDATE Products
    SET buyPrice = buyPrice + buyPrice*percentage;
    END $$

    DELIMITER ;

  25. What is the value of payments received in July 2004?
  26. SELECT FORMAT(SUM(amount),0) as Payments
    FROM Payments
    WHERE YEAR(paymentDate) = 2004
    AND MONTH(paymentDate) = 7;

  27. What is the ratio the value of payments made to orders received for each month of 2004. (i.e., divide the value of payments made by the orders received)?
  28. WITH
    t1 AS (SELECT MONTH(paymentDate) AS period, sum(amount) AS payments FROM Payments
    WHERE YEAR(paymentDate) = 2004
    GROUP BY MONTH(paymentDate)),
    t2 AS (SELECT MONTH(orderDate) AS period, sum(quantityOrdered*priceEach) AS orders FROM Orders, OrderDetails
    WHERE Orders.orderNumber = OrderDetails.orderNumber
    AND YEAR(orderDate) = 2004
    GROUP BY MONTH(orderDate))
    SELECT t1.period, FORMAT(payments/orders,2) AS ratio FROM
    t1 JOIN t2
    ON t1.period = t2.period;

  29. What is the difference in the amount received for each month of 2004 compared to 2003?
  30. WITH
    t2003 AS (SELECT YEAR(paymentDate) AS 'year', MONTH(paymentDate) AS 'month', sum(amount) AS amount FROM Payments
    WHERE YEAR(paymentDate) = 2003
    GROUP BY YEAR(paymentDate), MONTH(paymentDate)),
    t2004 AS (SELECT YEAR(paymentDate) AS 'year', MONTH(paymentDate) AS 'month', sum(amount) AS amount FROM Payments
    WHERE YEAR(paymentDate) = 2004
    GROUP BY YEAR(paymentDate), MONTH(paymentDate))
    SELECT t2003.month,format((t2004.amount - t2003.amount),2) AS variance
    FROM t2003 JOIN t2004
    ON t2003.month = t2004.month
    ORDER BY t2003.month;

  31. Write a procedure to report the amount ordered in a specific month and year for customers containing a specified character string in their name.
  32. DELIMITER //
    CREATE PROCEDURE reportOrderValue (
    IN inMonth INT,
    IN inYear INT,
    IN inString VARCHAR(20))
    LANGUAGE SQL
    BEGIN
    SELECT customerName AS Customer, (FORMAT(SUM(quantityOrdered*priceEach),0)) AS `Order Value`
    FROM Orders JOIN OrderDetails
    ON Orders.orderNumber = OrderDetails. orderNumber
    JOIN Customers
    ON Orders.customerNumber = Customers.customerNumber
    WHERE YEAR(orderDate) = inYear
    AND MONTH(orderDate) = inMonth
    AND customerName REGEXP inString
    GROUP BY customerName;
    END //

  33. Write a procedure to change the credit limit of all customers in a specified country by a specified percentage.
  34. DELIMITER //
    CREATE PROCEDURE changeCredit (
    IN inNation CHARACTER(20),
    IN inPercent DECIMAL(5,2))
    LANGUAGE SQL
    BEGIN
    UPDATE Customers
    SET creditLimit = creditLimit*(1 + inPercent)
    WHERE country = inNation;
    END //

  35. Basket of goods analysis: A common retail analytics task is to analyze each basket or order to learn which products are often purchased together. Report the names of products that appear in the same order ten or more times.

    First, create identical tables, p1 and p2, containing for each product the product code, product name, and order number. Second, join these tables where there is a match on order number and remove duplicates by specifying one product code must be greater than the other. Finally, compute the frequency with which a pair of items occur in the same order and limit reporting to more than 10 such occurrences.
  36. WITH
    p1 AS (SELECT Products.productCode, productName AS `Product 1`, orderNumber
    FROM Products JOIN OrderDetails
    ON Products.productCode = OrderDetails.productCode),
    p2 AS (SELECT Products.productCode, productName AS `Product 2`, orderNumber
    FROM Products JOIN OrderDetails
    ON Products.productCode = OrderDetails.productCode)
    SELECT `Product 1`, `Product 2`, count(*) as Frequency FROM p1 JOIN p2
    ON p1.orderNumber = p2.OrderNumber
    WHERE p1.productCode > p2.productCode
    GROUP BY `Product 1`, `Product 2` HAVING Frequency > 10
    ORDER BY Frequency DESC, `Product 1`, `Product 2`;

  37. ABC reporting: Compute the revenue generated by each customer based on their orders. Also, show each customer's revenue as a percentage of total revenue. Sort by customer name. This is often called ABC reporting because some companies recognize the first 10% by revenue as A, the next 20% as B, and the rest as C, or some other variation on percentage.
  38. SELECT customerName, FORMAT(SUM(quantityOrdered*priceEach),0) AS Revenue,
    FORMAT(SUM(quantityOrdered*priceEach)/(SELECT SUM(quantityOrdered*priceEach) FROM Customers JOIN Orders
    ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails. orderNumber)*100,2) AS Percent
    FROM Customers JOIN Orders
    ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails. orderNumber
    GROUP BY customerName
    ORDER BY customerName;

  39. Compute the profit generated by each customer based on their orders. Also, show each customer's profit as a percentage of total profit. Sort by profit descending.
  40. SELECT customerName, FORMAT(SUM(quantityOrdered*(priceEach-buyPrice)),0) AS Profit,
    FORMAT(SUM(quantityOrdered*(priceEach-buyPrice))/(SELECT SUM(quantityOrdered*(priceEach-buyPrice)) FROM Customers JOIN Orders
    ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails.orderNumber
    JOIN Products ON Products.productCode = OrderDetails.productCode)*100,2) AS Percent
    FROM Customers JOIN Orders
    ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails.orderNumber
    GROUP BY customerName
    ORDER BY Percent DESC;

  41. Compute the revenue generated by each sales representative based on the orders from the customers they serve.
  42. SELECT concat(firstName, " ", lastName) AS `Sales Representative`, FORMAT(SUM(quantityOrdered*priceEach),0) AS Revenue
    FROM Employees JOIN Customers
    ON Employees.employeeNumber = Customers.salesRepEmployeeNumber
    JOIN Orders ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails. orderNumber
    GROUP BY `Sales Representative`;

  43. Compute the profit generated by each sales representative based on the orders from the customers they serve. Sort by profit generated descending.
  44. SELECT concat(firstName, " ", lastName) AS `Sales Representative`,
    FORMAT(SUM(quantityOrdered*(priceEach - buyPrice)),0) AS Profit
    FROM Employees JOIN Customers
    ON Employees.employeeNumber = Customers.salesRepEmployeeNumber
    JOIN Orders ON Customers.customerNumber = Orders.customerNumber
    JOIN OrderDetails ON Orders.orderNumber = OrderDetails.orderNumber
    JOIN Products ON Products.ProductCode = OrderDetails.ProductCode
    GROUP BY `Sales Representative`
    ORDER BY SUM(quantityOrdered*(priceEach - buyPrice)) DESC;

  45. Compute the revenue generated by each product, sorted by product name.
  46. SELECT productName AS Product, FORMAT(SUM(quantityOrdered*priceEach),0) AS Revenue
    FROM Products JOIN OrderDetails
    ON Products.productCode = OrderDetails.productCode
    GROUP BY productName
    ORDER BY productName;

  47. Compute the profit generated by each product line, sorted by profit descending
  48. SELECT productLine AS `Product line`, FORMAT(SUM(quantityOrdered*(priceEach -buyPrice)),0) AS Profit
    FROM Products JOIN OrderDetails
    ON Products.productCode = OrderDetails.productCode
    GROUP BY productLine
    ORDER BY SUM(quantityOrdered*(priceEach -buyPrice)) DESC;

  49. Benford's law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. Precisely, P(d) = log10(1 + 1/d) where d is a digit in the range 1-9. Thus, if you have n observations for a column, the expected values for each digit are n*log10(1 + 1/d)

    Write SQL code to compute the observed and expected values for the first digit of amount in Payments.
  50. SELECT LEFT(amount,1) as Digit, COUNT(*) as Observed,
    ROUND((SELECT COUNT(*) FROM Payments)*LOG10(1+1/left(amount,1)),0) as Expected
    FROM Payments
    GROUP BY Digit, Expected
    ORDER BY Digit;

    You need to use the Chi-squared statistic to test whether the observed data follows Benford's law. There is an R package, benford.analysis, for computing this statistic and other measures.

    Benford's law is used in accounting, auditing, and fraud detection.

  51. Same as Last Year (SALY) analysis: Compute the ratio for each product of sales for 2003 versus 2004
  52. WITH
    t2003 AS (SELECT productName, Products.productCode, SUM(quantityOrdered*priceEach) AS OrderValue FROM Orders JOIN OrderDetails
    ON Orders.`orderNumber` = OrderDetails.`orderNumber`
    JOIN Products ON OrderDetails.productCode = Products.productCode
    WHERE YEAR(orderDate) = 2003
    GROUP BY Products.productCode),
    t2004 AS (SELECT Products.productCode, SUM(quantityOrdered*priceEach) AS OrderValue FROM Orders JOIN OrderDetails
    ON Orders.`orderNumber` = OrderDetails.`orderNumber`
    JOIN Products ON OrderDetails.productCode = Products.productCode
    WHERE YEAR(orderDate) = 2004
    GROUP BY Products.productCode)
    SELECT productName, format(t2003.OrderValue,0) AS Y2003, format(t2004.OrderValue,0) AS Y2004,
    FORMAT(t2004.OrderValue/t2003.OrderValue,2) AS Ratio FROM t2003 JOIN t2004
    ON t2003.productCode = t2004.productCode
    ORDER BY t2004.OrderValue/t2003.OrderValue DESC;

  53. Compute the ratio of payments for each customer for 2003 versus 2004
  54. WITH
    t2003 AS(SELECT customerName, Customers.customerNumber, sum(amount) as Payments FROM Customers JOIN Payments
    ON Customers.`customerNumber` = Payments.customerNumber
    WHERE YEAR(paymentDate) = 2003
    GROUP BY Customers.customerNumber),
    t2004 AS (SELECT customerName, Customers.customerNumber, sum(amount) as Payments FROM Customers JOIN Payments
    ON Customers.`customerNumber` = Payments.customerNumber
    WHERE YEAR(paymentDate) = 2004
    GROUP BY Customers.customerNumber)
    SELECT t2003.customerName as `Customer name`, format(t2003.Payments,0) as Y2003,
    format(t2004.Payments,0) as Y2004, FORMAT(t2004.Payments/t2003.Payments,2) as Ratio
    FROM t2003 JOIN t2004
    ON t2003.customerNumber = t2004.customerNumber
    ORDER BY t2004.Payments/t2003.Payments DESC;

  55. Find the products sold in 2003 but not 2004.
  56. SELECT productName FROM Products
    JOIN OrderDetails ON Products.productCode = OrderDetails.productCode
    JOIN Orders ON Orders.orderNumber = OrderDetails.orderNumber WHERE YEAR(orderDate) = 2003
    AND Products.productCode NOT IN
    (SELECT Products.productCode FROM Products
    JOIN OrderDetails ON Products.productCode = OrderDetails.productCode
    JOIN Orders ON Orders.orderNumber = OrderDetails.orderNumber WHERE YEAR(orderDate) = 2004);

  57. Find the customers without payments in 2003.
  58. SELECT DISTINCT customerName as `Customer name` FROM Customers
    JOIN Payments ON Customers.customerNumber = Payments.customerNumber
    WHERE customerName NOT IN
    (SELECT customerName FROM Customers
    JOIN Payments ON Customers.customerNumber = Payments.customerNumber
    WHERE YEAR(`paymentDate`) = 2003);

Correlated subqueries

  1. Who reports to Mary Patterson?
  2. SELECT CONCAT(firstName, ' ',lastName) FROM Employees wrk WHERE EXISTS
    (SELECT * FROM Employees WHERE
    wrk.reportsTo = Employees.employeeNumber AND lastName = 'Patterson' AND firstName = 'Mary');

  3. Which payments in any month and year are more than twice the average for that month and year (ie compare all payments in Oct 2004 with the average payment for Oct 2004)? Order the results by the date of the payment. You will need to use the date functions.
  4. SELECT amount, DATE(paymentDate) FROM Payments P1
    WHERE amount > 2*(SELECT AVG(amount) FROM Payments P2
    WHERE MONTH(P1.paymentDate) = MONTH(P2.paymentDate)
    AND YEAR(P1.paymentDate) = YEAR(P2.paymentDate))
    ORDER BY P1.paymentDate;

  5. Report for each product, the percentage value of its stock on hand as a percentage of the stock on hand for product line to which it belongs. Order the report by product line and percentage value within product line descending. Show percentages with two decimal places.
  6. SELECT productName, productLine, FORMAT(100*quantityInStock*Buyprice/(SELECT SUM(quantityInStock*Buyprice) FROM Products WHERE P.productLine = Products.productLine),2) AS Percent
    FROM Products P ORDER BY productLine, Percent DESC;

  7. For orders containing more than two products, report those products that constitute more than 50% of the value of the order.
  8. SELECT productName FROM Products WHERE productCode IN
    (SELECT productCode FROM OrderDetails O
    WHERE quantityOrdered*priceEach > .5*(SELECT sum(quantityOrdered*priceEach) FROM OrderDetails
    WHERE OrderDetails.orderNumber = O.orderNumber
    GROUP BY O.orderNumber HAVING count(*) > 2));

Spatial data

* Remember x is latitude and y is longitude

  1. Which customers are in the Southern Hemisphere?
  2. SELECT customerName, city FROM Customers WHERE x(customerLocation) < 0;

  3. Which customers are south west of the New York office?
  4. SELECT customerName, city FROM Customers WHERE x(customerLocation) < (SELECT x(officeLocation) FROM Offices WHERE city = "NYC") AND y(customerLocation) < (SELECT y(officeLocation) FROM Offices WHERE city = "NYC");

  5. Which customers are closer to the Tokyo office than any other office?
  6. SELECT customerName, Customers.city FROM Customers, Offices WHERE Offices.city = 'Tokyo' AND ST_DISTANCE(customerLocation,officeLocation) =
    (SELECT MIN(ST_DISTANCE(customerLocation,officeLocation)) FROM Offices);

  7. Which French customer is furtherest from the Paris office?
  8. SELECT customerName, Customers.city FROM Customers, Offices WHERE Customers.country = 'France'
    AND ST_DISTANCE(customerLocation,officeLocation)=
    (SELECT MAX(ST_DISTANCE(customerLocation,officeLocation)) FROM Customers, Offices WHERE Offices.city = 'Paris' AND Customers.country = 'France');

  9. Who is the northernmost customer?
  10. SELECT customerName, city FROM Customers
    WHERE x(customerLocation) = (SELECT max(x(customerLocation)) FROM Customers);

Data visualization

For each problem, use R to run an SQL query and then visualize the results.

  1. Visualize in blue the number of items for each product scale
  2. library(ggvis)
    library(DBI)
    conn <- dbConnect(RMySQL::MySQL(), "richardtwatson.com", dbname="ClassicModels", user="db1", password="student")
    d <- dbGetQuery(conn, "SELECT productScale FROM Products")
    d %>% ggvis(~productScale) %>% layer_bars(fill:='blue') %>%
    add_axis('x',title='Product scale') %>%
    add_axis('y',title='Count')

  3. Prepare a line plot with appropriate labels for total payments for each month in 2004
  4. library(ggvis)
    library(DBI)
    conn <- dbConnect(RMySQL::MySQL(), "richardtwatson.com", dbname="ClassicModels", user="db1", password="student")
    d <- dbGetQuery(conn, "SELECT MONTH(paymentDate) AS Month, sum(amount) AS Sum FROM Payments WHERE YEAR(paymentDate) = 2004 GROUP BY MONTH(paymentDate)")
    d %>% ggvis(~Month, ~Sum) %>% layer_lines(stroke:='blue') %>%
    add_axis('x',title='Month') %>%
    add_axis('y',title='Total payments',title_offset=70)

  5. Create a histogram with appropriate labels for the value of orders received from the Nordic countries (Denmark, Finland, Norway, and Sweden)
  6. library(ggvis)
    library(DBI)
    conn <- dbConnect(RMySQL::MySQL(), "richardtwatson.com", dbname="ClassicModels", user="db1", password="student")
    d <- dbGetQuery(conn,"SELECT Country, sum(quantityOrdered*priceEach) AS Orders FROM Orders JOIN OrderDetails ON Orders.orderNumber = OrderDetails.orderNumber JOIN Customers ON Customers.customerNumber = Orders.customerNumber AND country IN ('Denmark','Finland', 'Norway','Sweden') GROUP BY country")
    d %>% ggvis(~Country, ~Orders) %>% layer_bars(fill:='yellow') %>%
    add_axis('x',title='Country') %>%
    add_axis('y',title='Order value',title_offset=70)

  7. Create a heatmap with for product lines and USA cities
  8. library(ggvis)
    library(DBI)
    conn <- dbConnect(RMySQL::MySQL(), "richardtwatson.com", dbname="ClassicModels", user="db1", password="student")
    d <- dbGetQuery(conn,"SELECT count(*) as frequency, productLine, city
    FROM Products JOIN OrderDetails, Orders, Customers
    WHERE Products.productCode = OrderDetails.productCode
    AND OrderDetails.orderNumber = Orders.orderNumber
    AND Orders.customerNumber = Customers.customerNumber
    AND country = 'USA'
    GROUP BY city")
    d %>% ggvis( ~productLine, ~city, fill= ~frequency) %>%
    layer_rects(width = band(), height = band()) %>%
    layer_text(text:=~frequency, stroke:='white', align:='left', baseline:='top') %>%
    add_axis('x',title='Product line') %>%
    add_axis('y',title='City',title_offset=80)

Workbench file for the following image.


 

This page is part of the promotional and support material for Data Management (open edition) by Richard T. Watson
For questions and comments please contact the author
Date revised: 10-Dec-2021