π¦Έ $merge
In MongoDB, the $merge stage allows you to write the results of an aggregation pipeline into a new or existing collection. This is similar to the concept of "INSERT INTO ... SELECT" or "MERGE INTO" in SQL databases.
πΉ Understanding $mergeβ
The $merge stage enables you to store aggregation results into a different collection. If the target collection doesnβt exist, MongoDB will create it automatically.
Key features:β
- βοΈ Inserts new documents if they donβt exist
- βοΈ Updates existing documents based on
_idor a specified field - βοΈ Can replace, merge, or discard duplicate records
- βοΈ Useful for ETL workflows, reporting tables, and maintaining summary data
πΉ Syntaxβ
{ $merge: {
into: "targetCollection",
on: "_id", // Field to match existing records
whenMatched: "merge", // Action when a match is found
whenNotMatched: "insert" // Action when no match is found
} }
intoβ Specifies the target collection.onβ Defines how documents are matched (default:_id).whenMatchedβ Specifies action when a document with the same key exists.merge: Updates matching documents.replace: Replaces matching documents.keepExisting: Keeps the existing document unchanged.fail: Throws an error on duplicates.
whenNotMatchedβ Specifies action when a match is not found.insert: Inserts new documents.discard: Ignores unmatched documents.fail: Throws an error if a match is not found.
πΉ Example 1: Creating a summary collectionβ
π Suppose we want to generate a collection that contains the total number of books per genre. Using mongosh:
db.books.aggregate([
{ $unwind: "$genres" },
{ $group: { _id: "$genres", totalBooks: { $sum: 1 } } },
{
$merge: {
into: "genre_summary",
on: "_id",
whenMatched: "merge",
whenNotMatched: "insert",
},
},
]);
Equivalent SQL query
INSERT INTO genre_summary (genre, totalBooks)
SELECT genre, COUNT(*) FROM books
GROUP BY genre
ON DUPLICATE KEY UPDATE totalBooks = VALUES(totalBooks);
π Challengeβ
πΉ 1: Maintaining an Author Summary Tableβ
π We want to create an author_stats collection with the total number of books written by each author.
Hint: SQL equivalentβ
INSERT INTO author_stats (authorName, totalBooks)
SELECT authors.name, COUNT(*) FROM books
GROUP BY authors.name
ON DUPLICATE KEY UPDATE totalBooks = VALUES(totalBooks);
Answer
There are 2 ways to solve this-
- through
bookscollection - through
authorscollection
- JavaScript
- C#
- through books collection
- through authors collection
await books.aggregate([
{ $unwind: "$authors" },
{ $group: { _id: "$authors.name", totalBooks: { $sum: 1 } } },
{
$merge: {
into: "author_stats",
on: "_id",
whenMatched: "merge",
whenNotMatched: "insert",
},
},
]).toArray();
const authors = db.collection("authors");
await authors.aggregate([
{ $unwind: "$books" },
{ $group: { _id: "$name", totalBooks: { $sum: 1 } } },
{
$merge: {
into: "author_stats",
on: "_id",
whenMatched: "merge",
whenNotMatched: "insert"
}
}
]).toArray();
- through books collection
- through authors collection
var result = await booksCollection.Aggregate()
.Unwind("authors")
.Group(new BsonDocument
{
{ "_id", "$authors.name" },
{ "totalBooks", new BsonDocument("$sum", 1) }
})
.AppendStage<BsonDocument>(new BsonDocument("$merge", new BsonDocument
{
{ "into", "author_stats" },
{ "on", "_id" },
{ "whenMatched", "merge" },
{ "whenNotMatched", "insert" }
}))
.ToListAsync();
var pipeline = authorsCollection.Aggregate()
.AppendStage<BsonDocument>(new BsonDocument{"$unwind", "$books"})
.AppendStage<BsonDocument>(new BsonDocument{
"$group",
new BsonDocument{
{"_id","$name"},
{"totalBooks", new BsonDocument{"$sum", 1}}
}
}
)
.AppendStage<BsonDocument>(new BsonDocument{
"$merge",
new BsonDocument{
{"into","author_stats"},
{"on","_id"},
{"whenMatched","merge"},
{"whenNotMatched","insert"}
}
}
);
var result = pipeline.ToList();