MongoDB Aggregation Pipelines

BPK45 - Jul 31 - Dev Community

Hi, aliens! I am Pavan. In this repository, I explain the main aggregation stages in depth with basic examples. I will also include links to resources for further learning.

The repository contains JSON files for various MongoDB aggregation pipelines. These pipelines demonstrate how to use different aggregation stages and operations to process and analyze data.

Introduction

Aggregation in MongoDB is a powerful way to process and analyze data stored in collections. It allows you to perform operations like filtering, grouping, sorting, and transforming data.
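
As a rough illustration of how these operations fit together, a pipeline is just an ordered array of stage documents, and each stage receives the output of the stage before it. The sketch below assumes the orders collection and the field names used in the examples in this post; the variable name topSpendersPipeline is only for illustration.

```javascript
// A pipeline is an ordered array of stage documents.
const topSpendersPipeline = [
  // 1. Filter: keep only orders with status "A".
  { $match: { status: "A" } },
  // 2. Group: one output document per customer with the summed amount.
  { $group: { _id: "$cust_id", totalSpent: { $sum: "$amount" } } },
  // 3. Sort: largest total first.
  { $sort: { totalSpent: -1 } },
  // 4. Limit: keep the top three customers.
  { $limit: 3 }
];

// In mongosh, assuming the orders collection exists:
// db.orders.aggregate(topSpendersPipeline);
```

The order of the stages matters: filtering before grouping means only status "A" orders are counted in each total.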

CRUD Operations

Create

db.orders.insertOne({
  "order_id": 26,
  "cust_id": 1006,
  "status": "A",
  "amount": 275,
  "items": ["apple", "banana"],
  "date": "2023-01-26"
});

Read

db.orders.find().pretty();

Update

db.orders.updateOne(
  { "order_id": 2 },
  {
    $set: { "status": "C", "amount": 500 },
    $currentDate: { "lastModified": true }
  }
);

Delete

db.orders.deleteOne({ "order_id": 1 });

Aggregation Stages

$match

Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage.

db.orders.aggregate([
  { $match: { "status": "A" } }
]);
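
A $match condition is written with largely the same query syntax as find(), so several conditions can be combined in one stage. The sketch below assumes the sample orders documents from this post; the threshold of 300 is an arbitrary value picked for illustration.

```javascript
// Keep only orders that have status "A" AND an amount of at least 300.
const matchStage = {
  $match: {
    status: "A",
    amount: { $gte: 300 } // $gte is the "greater than or equal" query operator
  }
};

// In mongosh, assuming the orders collection exists:
// db.orders.aggregate([matchStage]);
```

Placing $match as early as possible in a pipeline usually helps, because later stages then have fewer documents to process.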

$group

Groups input documents by the specified _id expression and outputs one document for each distinct group. The _id field of each output document holds the group key.

db.orders.aggregate([
  {
    $group: {
      _id: "$cust_id",
      totalSpent: { $sum: "$amount" }
    }
  }
]);
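
To show what this stage computes, here is a rough model in plain JavaScript (not MongoDB code, and not how the server implements it). The three sample documents are taken from the example dataset in this post; the function name groupTotalByCustomer is made up for illustration.

```javascript
// Plain JavaScript model of:
// { $group: { _id: "$cust_id", totalSpent: { $sum: "$amount" } } }
const sampleOrders = [
  { order_id: 1, cust_id: 1001, amount: 250 },
  { order_id: 2, cust_id: 1002, amount: 450 },
  { order_id: 3, cust_id: 1001, amount: 300 }
];

function groupTotalByCustomer(docs) {
  const totals = new Map(); // group key -> running sum
  for (const doc of docs) {
    // Like $sum, treat non-numeric values as contributing nothing.
    const amount = typeof doc.amount === "number" ? doc.amount : 0;
    totals.set(doc.cust_id, (totals.get(doc.cust_id) ?? 0) + amount);
  }
  // One output document per distinct key, with the key stored in _id.
  return [...totals].map(([_id, totalSpent]) => ({ _id, totalSpent }));
}

const grouped = groupTotalByCustomer(sampleOrders);
console.log(grouped);
// customer 1001 has totalSpent 550, customer 1002 has totalSpent 450
```

The model returns groups in first-seen order, but the real $group stage does not guarantee any output order, so add a $sort stage if the order matters.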

$project

Passes along the documents with the requested fields to the next stage in the pipeline.

db.orders.aggregate([
  { $project: { "order_id": 1, "items": 1, "_id": 0 } }
]);

$sort

Sorts all input documents and returns them to the pipeline in sorted order.

db.orders.aggregate([
  { $sort: { "amount": -1 } }
]);

$limit

Limits the number of documents passed to the next stage in the pipeline.

db.orders.aggregate([
  { $limit: 5 }
]);

$skip

Skips the first n documents and passes the remaining documents to the next stage in the pipeline.

db.orders.aggregate([
  { $skip: 5 }
]);

$lookup

Performs a left outer join to another collection in the same database. Each input document gains an array field, named by the as option, that holds the matching documents from the joined collection.

db.orders.aggregate([
  {
    $lookup: {
      from: "orderDetails",
      localField: "order_id",
      foreignField: "order_id",
      as: "details"
    }
  }
]);
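
The "left outer" part means every order is kept, even when nothing matches. The plain JavaScript model below (not MongoDB code) is a sketch of that behaviour. The contents of orderDetails are not shown in this post, so the carrier field and the function name lookupDetails are assumptions made up for illustration.

```javascript
// Plain JavaScript model of a $lookup on order_id with as: "details".
const orders = [
  { order_id: 1, cust_id: 1001 },
  { order_id: 2, cust_id: 1002 }
];

// Hypothetical orderDetails documents; "carrier" is an invented field.
const orderDetails = [
  { order_id: 1, carrier: "post" },
  { order_id: 1, carrier: "courier" }
];

function lookupDetails(localDocs, foreignDocs) {
  return localDocs.map((doc) => ({
    ...doc,
    // All foreign documents whose order_id equals this document's order_id.
    details: foreignDocs.filter((d) => d.order_id === doc.order_id)
  }));
}

const joined = lookupDetails(orders, orderDetails);
console.log(joined[0].details.length); // 2
console.log(joined[1].details.length); // 0, order 2 is kept with an empty array
```

Because details is always an array, a $lookup is often followed by $unwind when each order is expected to have exactly one match.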

$unwind

Deconstructs an array field from the input documents to output a document for each element.

db.orders.aggregate([
  { $unwind: "$items" }
]);
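
A rough plain JavaScript model (not MongoDB code) may help picture the output. The sample document comes from the example dataset in this post; the function name unwindItems is made up, and the sketch assumes items is always an array.

```javascript
// Plain JavaScript model of { $unwind: "$items" }.
const sampleOrder = { order_id: 1, cust_id: 1001, items: ["apple", "banana"] };

function unwindItems(docs) {
  // Each array element yields a copy of the document with items set to that element.
  // An empty array yields no documents, which matches $unwind's default behaviour.
  return docs.flatMap((doc) =>
    doc.items.map((item) => ({ ...doc, items: item }))
  );
}

const unwound = unwindItems([sampleOrder]);
console.log(unwound);
// two documents for order 1: one with items "apple", one with items "banana"
```

All other fields are repeated in every output document, which is what makes it possible to group by an individual item afterwards.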

$addFields

Adds new fields to documents.

db.orders.aggregate([
  { $addFields: { totalWithTax: { $multiply: ["$amount", 1.1] } } }
]);

$replaceRoot

Replaces the input document with the specified document.

db.orders.aggregate([
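  // newRoot must resolve to a document; "$items" is an array, so it would raise an error.
  { $replaceRoot: { newRoot: { "order_id": "$order_id", "items": "$items" } } }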
]);

Aggregation Operations

$sum

Calculates and returns the sum of numeric values. $sum ignores non-numeric values.

db.orders.aggregate([
  {
    $group: {
      _id: "$cust_id",
      totalSpent: { $sum: "$amount" }
    }
  }
]);

$avg

Calculates and returns the average of the numeric values. Like $sum, $avg ignores non-numeric values.

db.orders.aggregate([
  {
    $group: {
      _id: "$cust_id",
      averageSpent: { $avg: "$amount" }
    }
  }
]);

$min

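Returns the minimum value in each group. $min is not limited to numbers: it compares values using the BSON comparison order, so it also works on strings and dates.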

db.orders.aggregate([
  {
    $group: {
      _id: "$cust_id",
      minSpent: { $min: "$amount" }
    }
  }
]);

$max

Returns the maximum value in each group. Like $min, it compares values using the BSON comparison order, so it is not limited to numbers.

db.orders.aggregate([
  {
    $group: {
      _id: "$cust_id",
      maxSpent: { $max: "$amount" }
    }
  }
]);

$first

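Returns the first value from the documents in each group. The result is only well defined when the documents reach $group in a known order, so sort them first.

db.orders.aggregate([
  { $sort: { "date": 1 } },
  {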
    $group: {
      _id: "$cust_id",
      firstOrder: { $first: "$amount" }
    }
  }
]);

$last

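Returns the last value from the documents in each group. As with $first, the result is only well defined when the documents reach $group in a known order, so sort them first.

db.orders.aggregate([
  { $sort: { "date": 1 } },
  {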
    $group: {
      _id: "$cust_id",
      lastOrder: { $last: "$amount" }
    }
  }
]);
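
Because $first and $last simply pick the first and last document seen in each group, the incoming order decides the answer. The plain JavaScript model below (not MongoDB code) sketches this for three documents from the example dataset; the function name firstAndLastAmount is made up for illustration.

```javascript
// Plain JavaScript model of sorting by date, then taking $first and $last per customer.
const docs = [
  { cust_id: 1001, amount: 300, date: "2023-01-03" },
  { cust_id: 1002, amount: 450, date: "2023-01-02" },
  { cust_id: 1001, amount: 250, date: "2023-01-01" }
];

function firstAndLastAmount(input) {
  // ISO-style date strings sort correctly as plain strings.
  const sorted = [...input].sort((a, b) => a.date.localeCompare(b.date));
  const groups = new Map();
  for (const doc of sorted) {
    const group = groups.get(doc.cust_id);
    if (group === undefined) {
      // First document seen for this customer sets both values.
      groups.set(doc.cust_id, { _id: doc.cust_id, firstOrder: doc.amount, lastOrder: doc.amount });
    } else {
      // Every later document overwrites the "last" value.
      group.lastOrder = doc.amount;
    }
  }
  return [...groups.values()];
}

const result = firstAndLastAmount(docs);
console.log(result);
// customer 1001: firstOrder 250, lastOrder 300; customer 1002: firstOrder 450, lastOrder 450
```

Without the sort step, customer 1001 would report 300 as the first order here, simply because that document happens to come first in the input.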

Example Datasets

Example documents used for performing CRUD and aggregation operations:

[
  { "order_id": 1, "cust_id": 1001, "status": "A", "amount": 250, "items": ["apple", "banana"], "date": "2023-01-01" },
  { "order_id": 2, "cust_id": 1002, "status": "B", "amount": 450, "items": ["orange", "grape"], "date": "2023-01-02" },
  { "order_id": 3, "cust_id": 1001, "status": "A", "amount": 300, "items": ["apple", "orange"], "date": "2023-01-03" },
  { "order_id": 4, "cust_id": 1003, "status": "A", "amount": 150, "items": ["banana", "grape"], "date": "2023-01-04" },
  { "order_id": 5, "cust_id": 1002, "status": "C", "amount": 500, "items": ["apple", "banana"], "date": "2023-01-05" },
  { "order_id": 6, "cust_id": 1004, "status": "A", "amount": 350, "items": ["orange", "banana"], "date": "2023-01-06" },
  { "order_id": 7, "cust_id": 1005, "status": "B", "amount": 200, "items": ["grape", "banana"], "date": "2023-01-07" },
  { "order_id": 8, "cust_id": 1003, "status": "A", "amount": 100, "items": ["apple", "orange"], "date": "2023-01-08" },
  { "order_id": 9, "cust_id": 1004, "status": "C", "amount": 400, "items": ["banana", "grape"], "date": "2023-01-09" },
  { "order_id": 10, "cust_id": 1001, "status": "A", "amount": 250, "items": ["apple", "grape"], "date": "2023-01-10" },
  { "order_id": 11, "cust_id": 1002, "status": "B", "amount": 350, "items": ["orange", "banana"], "date": "2023-01-11" },
  { "order_id": 12, "cust_id": 1003, "status": "A", "amount": 450, "items": ["apple", "orange"], "date": "2023-01-12" },
  { "order_id": 13, "cust_id": 1005, "status": "A", "amount": 150, "items": ["banana", "grape"], "date": "2023-01-13" },
  { "order_id": 14, "cust_id": 1004, "status": "C", "amount": 500, "items": ["apple", "banana"], "date": "2023-01-14" },
  { "order_id": 15, "cust_id": 1002, "status": "A", "amount": 300, "items": ["orange", "grape"], "date": "2023-01-15" },
  { "order_id": 16, "cust_id": 1003, "status": "B", "amount": 200, "items": ["apple", "banana"], "date": "2023-01-16" },
  { "order_id": 17, "cust_id": 1001, "status": "A", "amount": 250, "items": ["orange", "grape"], "date": "2023-01-17" },
  { "order_id": 18, "cust_id": 1005, "status": "A", "amount": 350, "items": ["apple", "banana"], "date": "2023-01-18" },
  { "order_id": 19, "cust_id": 1004, "status": "C", "amount": 400, "items": ["orange", "grape"], "date": "2023-01-19" },
  { "order_id": 20, "cust_id": 1001, "status": "B", "amount": 150, "items": ["apple", "orange"], "date": "2023-01-20" },
  { "order_id": 21, "cust_id": 1002, "status": "A", "amount": 500, "items": ["banana", "grape"], "date": "2023-01-21" },
  { "order_id": 22, "cust_id": 1003, "status": "A", "amount": 450, "items": ["apple", "banana"], "date": "2023-01-22" },
  { "order_id": 23, "cust_id": 1004, "status": "B", "amount": 350, "items": ["orange", "banana"], "date": "2023-01-23" },
  { "order_id": 24, "cust_id": 1005, "status": "A", "amount": 200, "items": ["grape", "banana"], "date": "2023-01-24" },
  { "order_id": 25, "cust_id": 1001, "status": "A", "amount": 300, "items": ["apple", "orange"], "date": "2023-01-25" }
]

Resources for Further Learning

Feel free to clone this repository and experiment with the aggregation pipelines provided. If you have any questions or suggestions, please open an issue or submit a pull request.

$group

Groups orders by status and calculates the total amount and average amount for each status.

db.orders.aggregate([
  {
    $group: {
      _id: "$status",
      totalAmount: { $sum: "$amount" },
      averageAmount: { $avg: "$amount" }
    }
  }
]);

$project

Projects the order ID, customer ID, and a calculated field for the total amount with tax (assuming 10% tax).

db.orders.aggregate([
  {
    $project: {
      "order_id": 1,
      "cust_id": 1,
      "totalWithTax": { $multiply: ["$amount", 1.1] }
    }
  }
]);

$sort

Sorts orders first by status in ascending order and then by amount in descending order.

db.orders.aggregate([
  { $sort: { "status": 1, "amount": -1 } }
]);

$limit

Limits the result to the top 3 orders with the highest amount.

db.orders.aggregate([
  { $sort: { "amount": -1 } },
  { $limit: 3 }
]);

$skip

Skips the first 5 orders and returns the rest.

db.orders.aggregate([
  { $skip: 5 }
]);
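
A common use of $skip together with $limit is paging through results. The helper below is a minimal sketch of that idea; the function name pagePipeline and its parameters are hypothetical, and it assumes page numbers start at 1 and that order_id gives a stable sort order.

```javascript
// Hypothetical helper: builds a pipeline that returns one page of orders.
function pagePipeline(page, pageSize) {
  return [
    { $sort: { order_id: 1 } },       // fixed order so pages do not overlap
    { $skip: (page - 1) * pageSize }, // skip the documents on earlier pages
    { $limit: pageSize }              // keep a single page
  ];
}

const secondPage = pagePipeline(2, 5);
console.log(secondPage[1]); // the $skip stage, skipping 5 documents

// In mongosh, assuming the orders collection exists:
// db.orders.aggregate(pagePipeline(2, 5));
```

The $sort stage comes first on purpose: without a defined order, $skip has no guarantee about which documents it skips.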

$lookup

Joins the orders collection with an orderDetails collection to add order details.

db.orders.aggregate([
  {
    $lookup: {
      from: "orderDetails",
      localField: "order_id",
      foreignField: "order_id",
      as: "details"
    }
  }
]);

$unwind

Deconstructs the items array in each order to output a document for each item.

db.orders.aggregate([
  { $unwind: "$items" }
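
$unwind is most useful when followed by another stage. As a sketch, the pipeline below counts how often each item appears across all orders, assuming the sample orders documents from this post; the names itemCountsPipeline and orderCount are chosen only for illustration.

```javascript
const itemCountsPipeline = [
  // One document per (order, item) pair.
  { $unwind: "$items" },
  // Group by the item name and count one per unwound document.
  { $group: { _id: "$items", orderCount: { $sum: 1 } } },
  // Most frequent items first.
  { $sort: { orderCount: -1 } }
];

// In mongosh, assuming the orders collection exists:
// db.orders.aggregate(itemCountsPipeline);
```

The count equals the number of orders containing each item only as long as no order lists the same item twice.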
]);

$addFields

Adds a new field discountedAmount which is 90% of the original amount.

db.orders.aggregate([
  { $addFields: { discountedAmount: { $multiply: ["$amount", 0.9] } } }
]);

$replaceRoot

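Replaces each root document with a new document that holds only the order ID and the items array. newRoot must resolve to a document, so pointing it directly at the items array would raise an error.

db.orders.aggregate([
  { $replaceRoot: { newRoot: { "order_id": "$order_id", "items": "$items" } } }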
]);

$sum

Calculates the total amount for all orders.

db.orders.aggregate([
  {
    $group: {
      _id: null,
      totalAmount: { $sum: "$amount" }
    }
  }
]);

$avg

Calculates the average amount spent per order.

db.orders.aggregate([
  {
    $group: {
      _id: null,
      averageAmount: { $avg: "$amount" }
    }
  }
]);

$min

Finds the minimum amount spent on an order.

db.orders.aggregate([
  {
    $group: {
      _id: null,
      minAmount: { $min: "$amount" }
    }
  }
]);

$max

Finds the maximum amount spent on an order.

db.orders.aggregate([
  {
    $group: {
      _id: null,
      maxAmount: { $max: "$amount" }
    }
  }
]);

$first

Gets the first order placed (by date).

db.orders.aggregate([
  { $sort: { "date": 1 } },
  {
    $group: {
      _id: null,
      firstOrder: { $first: "$$ROOT" }
    }
  }
]);

$last

Gets the last order placed (by date).

db.orders.aggregate([
  { $sort: { "date": 1 } }, // ascending, so $last picks the most recent order
  {
    $group: {
      _id: null,
      lastOrder: { $last: "$$ROOT" }
    }
  }
]);

So, we have covered basic CRUD operations, the major aggregation stages, and the common accumulator operations used with $group.
