The Data Foundry

Built by Data with Pranjal

SQLBeginnerBroken SQL FixFree

Wrong GROUP BY Grain Causing Customer Revenue Inflation

You are on the analytics engineering rota for an e-commerce company. Finance has escalated that the customer revenue mart is higher than the payment processor report after last night's model change.

Scenario context

The query groups by customer and order status, but the dashboard expects one row per customer. When downstream users sum the status-level rows again, cancelled and completed order rows are mixed into the customer metric.

Business requirement

Build a customer-level revenue result with exactly one row per customer. Include only completed orders, return customer_id, customer_name, and completed_revenue, and make sure duplicate status rows cannot inflate the dashboard.

Sample production data

Use these small tables to reason about the bug before writing the fix. The browser checker seeds this data when you click Check Answer.

customers

customer_idcustomer_name
101Asha
102Ben
103Chen

orders

order_idcustomer_idorder_statusorder_amountorder_date
1101COMPLETED1202026-05-01
2101CANCELLED802026-05-01
3101COMPLETED302026-05-02
4102COMPLETED502026-05-02
5103CANCELLED402026-05-03

Schema

orders(order_id, customer_id, order_status, order_amount, order_date)
customers(customer_id, customer_name)

Sample input

orders
1, 101, COMPLETED, 120, 2026-05-01
2, 101, CANCELLED, 80, 2026-05-01
3, 101, COMPLETED, 30, 2026-05-02
4, 102, COMPLETED, 50, 2026-05-02

Broken logic / code

SELECT
  c.customer_id,
  c.customer_name,
  o.order_status,
  SUM(o.order_amount) AS revenue
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY 1, 2, 3;

Expected output / expected logic

customer_id | customer_name | completed_revenue
101 | Asha | 150
102 | Ben | 50

Your attempt

Write the corrected SQL

Write the corrected query. The browser will run it against the sample tables and compare the result with the expected output.

Saved locally

Interview-style explanation

Now explain your solution as if you are in an interview: symptom, root cause, fix, edge cases, trade-offs, monitoring, and prevention.