Skip to content

Commit

Permalink
Document aggregation operations (#2672)
Browse files Browse the repository at this point in the history
Closes #2603.

Co-authored-by: Chi Fujii <chi.fujii@ferretdb.io>
Co-authored-by: Alexey Palazhchenko <alexey.palazhchenko@ferretdb.io>
  • Loading branch information
3 people authored Jun 2, 2023
1 parent 325c19c commit 4fae741
Show file tree
Hide file tree
Showing 12 changed files with 111 additions and 8 deletions.
6 changes: 6 additions & 0 deletions website/docs/aggregation-operations/_category_.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
---
label: Aggregation Operations
position: 6
link:
description: >
This section details aggregation operations in FerretDB, including aggregation commands, stages, and operators
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
---
sidebar_position: 1
---

# Aggregation pipeline and commands

Aggregation operations involve performing various operations on a large number of data records, such as data grouping, sorting, restructuring, or modifying.
These operations pass through one or more stages, which make up a pipeline.

![aggregation stages](/img/docs/aggregation-stages.jpg)

Each stage acts upon the returned documents of the previous stage, starting with the input documents.
As shown above, the documents pass through the pipeline with the result of the previous stage acting as input for the next stage, going from `$match` => `$group` => `$sort` stage.

For example, insert the following documents in a `sales` collection:

```js
db.sales.insertMany([
{ _id: 1, category: 'Electronics', price: 1000 },
{ _id: 2, category: 'Electronics', price: 800 },
{ _id: 3, category: 'Clothing', price: 30 },
{ _id: 4, category: 'Clothing', price: 50 },
{ _id: 5, category: 'Home', price: 1500 },
{ _id: 6, category: 'Home', price: 1200 },
{ _id: 7, category: 'Books', price: 20 },
{ _id: 8, category: 'Books', price: 40 }
])
```

A typical aggregation pipeline would look like this:

```js
db.sales.aggregate([
{ $match: { category: { $ne: 'Electronics' } } },
{
$group: {
_id: '$category',
totalPrice: { $sum: '$price' },
productCount: { $sum: 1 }
}
},
{ $sort: { totalPrice: -1 } }
])
```

In the pipeline, the complex query is broken down into separate stages where the record goes through a series of transformations until it finally produces the desired result.
First, the `$match` stage filters out all documents where the `category` field is not `Electronics`.
Then, the `$group` stage groups the documents by their `category` and calculates the total price and product count for each of those category.
Finally, the `$sort` stage sorts the documents by the `totalPrice` field in descending order.

So the above aggregation pipeline operation would return the following result:

```json5
[
{ _id: 'Home', totalPrice: 2700, productCount: 2 },
{ _id: 'Clothing', totalPrice: 80, productCount: 2 },
{ _id: 'Books', totalPrice: 60, productCount: 2 }
]
```

This section of the documentation will focus on [`aggregate` command](#aggregate-command), [aggregation stages](./aggregation-stages.md), and aggregation operators.

## `aggregate` command

The aggregation command `aggregate` is a top-level command used for aggregating data across various pipeline stages.

The command is used for performing aggregation operations on a collection and lets you specify aggregation operations in a pipeline consisting of one or more stages and operators for transforming and analyzing data, such as grouping, filtering, sorting, projecting, and calculating aggregates.

```js
// Aggregation pipeline to perform aggregation operations on a collection
db.collection.aggregate([
// Stage 1: Matching documents based on a specific field and value
{ $match: { field: value } },
// Stage 2: Grouping documents by the "category" field and calculating the sum of the "quantity" field
{ $group: { _id: '$category', total: { $sum: '$quantity' } } }
])
```
17 changes: 17 additions & 0 deletions website/docs/aggregation-operations/aggregation-stages.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
---
sidebar_position: 2
---

# Aggregation stages

Aggregation stages are a series of one or more processes in a pipeline that acts upon the returned result of the previous stage, starting with the input documents.

| Supported aggregation stages | Description |
| ---------------------------- | ----------------------------------------------------------------------------------------------------- |
| `$count` | Returns the count of all matched documents in a specified query |
| `$group` | Groups documents based on specific value or expression and returns a single document for each group |
| `$limit` | Limits specific documents and passes the rest to the next stage |
| `$match` | Acts as a `find` operation by only returning documents that match a specified query to the next stage |
| `$skip` | Skips a specified `n` number of documents and passes the rest to the next stage |
| `$sort` | Sorts and returns all the documents based on a specified order |
| `$unwind` | Deconstructs and returns a document for every element in an array field |
2 changes: 1 addition & 1 deletion website/docs/configuration/_category_.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
label: Configuration
position: 8
position: 9
link:
type: generated-index
description: >
Expand Down
2 changes: 1 addition & 1 deletion website/docs/contributing/_category_.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
label: Contributing to FerretDB
position: 13
position: 14
link:
type: generated-index
description: >
Expand Down
2 changes: 1 addition & 1 deletion website/docs/diff.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
sidebar_position: 9
sidebar_position: 10
slug: /diff/ # referenced in README.md and beacon
---

Expand Down
2 changes: 1 addition & 1 deletion website/docs/indexes.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
sidebar_position: 7
sidebar_position: 8
---

# Indexes
Expand Down
2 changes: 1 addition & 1 deletion website/docs/pushdown.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
sidebar_position: 6
sidebar_position: 7
hide_table_of_contents: true
---

Expand Down
2 changes: 1 addition & 1 deletion website/docs/reference/_category_.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
label: Reference
position: 12
position: 13
link:
type: generated-index
description: >
Expand Down
2 changes: 1 addition & 1 deletion website/docs/security.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
sidebar_position: 11
sidebar_position: 12
slug: /security/ # referenced in README.md
description: TLS and authentication
---
Expand Down
2 changes: 1 addition & 1 deletion website/docs/telemetry.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
sidebar_position: 10
sidebar_position: 11
slug: /telemetry/ # referenced in many places; must not change
---

Expand Down
3 changes: 3 additions & 0 deletions website/static/img/docs/aggregation-stages.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit 4fae741

Please sign in to comment.