Skip to main content

πŸ‘ $group and $unwind

In SQL, the GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country." The GROUP BY statement is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more columns.

The $group stage in MongoDB’s Aggregation Framework is equivalent to GROUP BY in SQL. It allows us to group documents by a specific field and apply aggregate functions like sum, average, count, min, and max.


🎬 How does $group work?​

The $group stage groups documents based on a field and performs calculations on grouped data.

Syntax​

{
$group: {
_id: <expression>,
<field>: { <accumulator>: <expression> }
}
}
  • _id: The field to group by (use null to aggregate all documents together)
  • <accumulator>: An aggregation operator ($sum, $avg, $min, $max, $push, $addToSet, etc.)

πŸ‘ Example 1: Count the number of books published every year​

MongoDB query​

db.books.aggregate([
{
$group: {
_id: "$year",
totalBooks: { $sum: 1 },
},
},
]);

Equivalent SQL query​

SELECT year, COUNT(*) AS totalBooks
FROM books
GROUP BY year;

Sample output​

[
{"_id": 1980,"totalBooks": 42},
{"_id": 2000,"totalBooks": 490},
{"_id": 1981,"totalBooks": 45},
...
]

πŸ‘ Example 2: Without using $group, count the number of books published every year​

MongoDB query​

db.books.aggregate([
{
$sortByCount: "$year",
},
]);

Sample output remains the same as before​

[
{"_id": 1980,"totalBooks": 42},
{"_id": 2000,"totalBooks": 490},
{"_id": 1981,"totalBooks": 45},
...
]

πŸ‘ Example 3: Find the total number of pages published every year.​

MongoDB query​

db.books.aggregate([
{
$group: {
_id: "$year",
totalPages: { $sum: "$pages" },
},
},
]);

Equivalent SQL query​

SELECT year, SUM(rating) AS totalPages
FROM books
GROUP BY year;

Sample output​

[
{ "_id": 1955, "totalPages": 664 },
{ "_id": 1952, "totalPages": 416 },
{ "_id": 1899, "totalPages": 128 }
...
]

πŸ”Ή How does $unwind work?​

The $unwind stage deconstructs an array field and creates one output document for each element in the array.

{
$unwind: "<array field>"
}
  • <array field>: The path to the array field you want to deconstruct

πŸ‘ Example 1: Return one document for each author​

MongoDB Query​

db.books.aggregate([
{
$unwind: "$authors"
}
]);

Equivalent SQL query​

SELECT
b._id,
b.title,
authors
FROM books b
CROSS JOIN UNNEST(b.authors) AS authors
WHERE b._id = '1567189644';

Sample output​

[
{ "_id": "0740714104", "title": "Who's Your Daddy?", "author": "Phyllis Wright-Herman" }
{ "_id": "0740714104", "title": "Who's Your Daddy?", "author": "Timothy Mikkelsen" }
{ "_id": "0740714104", "title": "Who's Your Daddy?", "author": "Ltd. MikWrgiht" }
]
  • This returns one document per author entry in the authors array.

πŸ‘ Challenge​

πŸ‘ 1. Find the average book rating of all books​

Answer
await reviews.aggregate([
{
$group: {
_id: "$bookId",
avgRating: { $avg: "$rating" }
}
}
]).toArray();

πŸ‘ 2. Find the top 5 users with the most number of reviews (Hint: use the name field in the reviews collection)​

Answer

There are 2 ways to solve this-

  • $group with $sort
  • $sortByCount
await reviews.aggregate([
{
$group: {
_id: "$name",
totalReviews: { $sum: 1 },
},
},
{
$sort: {
totalReviews: -1,
},
},
{
$limit: 5,
},
]).toArray();

πŸ‘ 3. Return one document per author for book id 1567189644​

Answer
await books.aggregate([            
{
$match: {
_id: "1567189644"
}
},
{
$unwind: {
path: "$authors"
}
},
{
$project: {
"_id": 0,
"title": 1,
"author": "$authors.name"
}
}
]).toArray();