Skip to main content

🦸 Saving to a Collection

$out

You can export the result of an aggregation pipeline to a different DB/collection. To do that, just add a last aggregation stage $out to your pipeline:

Run this from the source collection

[{ $out: { db: "<output-db>", coll: "<output-collection>" } }]

You can omit the db attribute, and the new collection will be created in the current DB.

{ $out: "<output-collection>" }
info

The $out stage must be the last stage in the pipeline.

👐 Create a copy of the books with exactly 100 pages and output as a new collection named OneHundredPagesBooks.

Answer
[
{$match: {pages: 100}},
{$out: "OneHundredPagesBooks"}
]

Refresh the database to see the new collection.


Reference: 📗 $out documentation

Merging output results

If the collection specified by the $out operation already exists, then the $out stage atomically replaces the existing collection with the new results collection upon completion of the aggregation.

To avoid overwriting the existing collection we can use $merge instead of $out.

{ $merge : { into : "newCollection" } }
  • if the collection does not exists, it will be created
  • if it exists, new data will be added
  • if a doc already exists, we can replace it, keep the existing one, merge both documents cause the stage to fail or run a pipeline.

This is perfect for creating On-Demand Materialized Views

As an example, let's say we want the authors to contain all the books they've written, with all the book information. In this case, we'll do a $lookup to get the book information into the authors collection. We can even use the name books for the resulting data we're joining, shadowing the original books array we have in authors. This way it will look like the books array changes.

[
{$lookup: {
from: "books",
localField: "books",
foreignField: "_id",
as: "books"
}
},
]

Now a book will look like this. You can see that the books array has been "overwritten" by the $lookup.

{
"name": "Richard Bruce Wright",
"sanitizedName": "richardbrucewright",
"books": [
{
"_id": "0002005018",
"title": "Clara Callan: A novel",
"authors": [
{
"_id": {
"$oid": "64cc2db4830ba29148da4c3b"
},
"name": "Richard Bruce Wright"
}
],
"genres": [
"Women Teachers",
"Young Women",
"Actresses",
"Sisters"
],
"pages": 414,
"year": 2001,
"synopsis": "Giller Prize Winner 2001. Richard B. Wright. A Phyllis Bruce Book.",
"cover": "https://images.isbndb.com/covers/50/12/9780002005012.jpg",
"attributes": [
{
"key": "edition",
"value": "1st"
},
{
"key": "dimensions",
"value": "Height: 11.11 Inches, Length: 6.11 Inches, Weight: 1 Pounds, Width: 1.11 Inches"
},
{
"key": "isbn13",
"value": "9780002005012"
},
{
"key": "msrp",
"value": "0.00"
},
{
"key": "isbn",
"value": "0002005018"
},
{
"key": "isbn10",
"value": "0002005018"
}
],
"totalInventory": 2,
"available": 3,
"binding": "Hardcover",
"language": "en",
"publisher": "HarperFlamingoCanada",
"longTitle": "Clara Callan: A novel",
"reviews": [
{
"_id": {
"$oid": "678900bc99a40f049f32d0be"
},
"text": "bbb",
"name": "Tawdry Lemur",
"rating": 5,
"timestamp": 1737031868630
},
{
"_id": {
"$oid": "672231e138e1f3e7c1c1c1cf"
},
"text": "testss",
"name": "Brash Iguana",
"rating": 4,
"timestamp": 1730294241948
},
{
"_id": {
"$oid": "6719acf4232c1a1deb2cbd7c"
},
"text": "hi",
"name": "Brash Rhino",
"rating": 5,
"timestamp": 1729735924953
},
{
"_id": {
"$oid": "6719ace0232c1a1deb2cbd7b"
},
"text": "good",
"name": "Killer Alligator",
"rating": 3,
"timestamp": 1729735904188
},
{
"_id": {
"$oid": "6719acdd232c1a1deb2cbd7a"
},
"text": "good",
"name": "Killer Alligator",
"rating": 5,
"timestamp": 1729735901895
}
]
}
],
"aliases": [
"Wright, Richard Bruce"
]
}

We can go ahead and remove the authors from the books array, as it is redundant:

[
{$lookup: {
from: "books",
localField: "books",
foreignField: "_id",
as: "books"
}
},
{$unset: 'books.authors'},
]

Now that our authors look the way we want, we can overwrite the authors collection using $merge

[
{$lookup: {
from: "books",
localField: "books",
foreignField: "_id",
as: "books"
}
},
{$unset: 'books.authors'},
{$merge: {
into: 'authors',
on: '_id',
whenMatched: 'replace',
}}
]
  • we use the _id field to match documents
  • we replace the existing ones with replace
warning

We should see a message telling us that the $merge operator will cause the pipeline to persist the results to the specified location. This stage changes data.