Skip to main content

πŸ“˜ SQL vs Aggregations

SQL​

A SQL query statement usually starts with a SELECT where we put a list of the fields we need, then a FROM clause to state the table/s (or in this case, the subquery) that will return the rows. We can filter out using WHERE and group data using GROUP.

We read this from the inside. If there's too much nesting, it's not easy to follow.

SELECT
city,
SUM(annual_spend) Total_Spend,
AVG(annual_spend) Average_Spend,
MAX(annual_spend) Max_Spend,
COUNT(annual_spend) customers
FROM (
SELECT t1.city, customer.annual_spend
FROM customer
LEFT JOIN (
SELECT address.address_id, city.city,
address.customer_id, address.location
FROM address LEFT JOIN city
ON address.city_id = city.city_id
) AS t1
ON
(customer.customer_id = t1.customer_id AND
t1.location = "home")
) AS t2
GROUP BY city;

Equivalent MongoDB aggregation pipeline​

Here we pass three stages: one to return one document per element in the address array, and then we filter out to get only the documents that have a home address location. Finally, we do the grouping. As we'll see, this can be split and tested separately and resembles our code.

db.customers.aggregate([
{
$unwind: "$address",
},
{
$match: {"address.location": "home"}
},
{
$group: {
_id: "$address.city",
totalSpend: {$sum: "$annualSpend"},
averageSpend: {$avg: "$annualSpend"},
maximumSpend: {$max: "$annualSpend"},
customers: {$sum: 1}
}
}
])