Accumulators
Accumulators are used within Pipeline.group() to compute aggregate values for each group. All accumulator classes live in gault.accumulators and are dataclasses that extend Accumulator.
Usage in group()
Accumulators are passed to Pipeline.group() as named expressions. Each accumulator must be aliased (given an output field name):
from gault import Pipeline, Field
from gault.accumulators import Sum, Avg, Count

# Dict form
Pipeline().group(
    {"total_sales": Sum("$amount"), "avg_price": Avg("$price"), "doc_count": Count()},
    by="$category",
)

# Spread Aliased form (using .alias())
Pipeline().group(
    Sum("$amount").alias("total_sales"),
    Avg("$price").alias("avg_price"),
    Count().alias("doc_count"),
    by="$category",
)

# Group all documents (by=None)
Pipeline().group({"grand_total": Sum("$amount")}, by=None)
All accumulators support the .alias(name) method inherited from AsAlias, which wraps the accumulator in an Aliased container for use with the spread form.
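To make the two call forms concrete, here is a minimal toy sketch of how a dict of accumulators and a sequence of aliased accumulators could both reduce to the same `$group` stage. `ToySum`, `ToyAliased`, and `toy_group` are hypothetical stand-ins written for this illustration. They are not gault's classes, and gault's internals may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyAliased:
    """Hypothetical stand-in for an Aliased container: a name plus an accumulator."""
    name: str
    value: "ToySum"


@dataclass
class ToySum:
    """Hypothetical stand-in for Sum; not gault's implementation."""
    input: str

    def compile(self) -> dict:
        # Render the accumulator as a MongoDB operator document.
        return {"$sum": self.input}

    def alias(self, name: str) -> ToyAliased:
        # Attach the output field name, as .alias() is described to do.
        return ToyAliased(name, self)


def toy_group(*accumulators, by):
    """Build a $group stage from a dict or from aliased accumulators."""
    fields = {}
    for item in accumulators:
        if isinstance(item, dict):
            pairs = item.items()  # dict form: {output_name: accumulator}
        else:
            pairs = [(item.name, item.value)]  # spread form: one aliased accumulator
        for name, accumulator in pairs:
            fields[name] = accumulator.compile()
    return {"$group": {"_id": by, **fields}}


dict_form = toy_group({"total_sales": ToySum("$amount")}, by="$category")
spread_form = toy_group(ToySum("$amount").alias("total_sales"), by="$category")
```

In this sketch both calls yield the same stage, which is the property the two forms are meant to share: the alias only carries the output field name that the dict form supplies as a key.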
Accumulator (base class)
Abstract base class for all accumulators. Subclasses must implement:
Sum
Returns the sum of numeric values. Ignores non-numeric values.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a number. Use 1 to count documents. |
Example:
Pipeline().group({"total": Sum("$price")}, by="$category")
# {"$group": {"_id": "$category", "total": {"$sum": "$price"}}}
Avg
Returns the average of numeric values.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a number. |
Example:
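A minimal sketch of what this accumulator computes, written in plain Python so it runs without gault or a database. The `expected_stage` dict is an assumption made by analogy with the Sum example (Avg mapping to `$avg`), and the sample documents are made up for illustration.

```python
from collections import defaultdict

# Assumed stage for Pipeline().group({"avg_price": Avg("$price")}, by="$category"),
# by analogy with the Sum example; not verified against gault.
expected_stage = {"$group": {"_id": "$category", "avg_price": {"$avg": "$price"}}}

# Made-up sample documents.
docs = [
    {"category": "a", "price": 10},
    {"category": "a", "price": 20},
    {"category": "b", "price": 5},
]

# Collect the prices of each category, then average them.
prices = defaultdict(list)
for doc in docs:
    prices[doc["category"]].append(doc["price"])

avg_price = {category: sum(values) / len(values) for category, values in prices.items()}
```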
Count
Returns the number of documents in a group. Takes no parameters.
Example:
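A minimal sketch of the counting behaviour in plain Python. The `expected_stage` dict assumes that Count renders as MongoDB's `$count` accumulator with an empty document, which inside `$group` behaves like `{"$sum": 1}`. The sample documents are made up.

```python
from collections import Counter

# Assumed stage for Pipeline().group({"doc_count": Count()}, by="$category");
# not verified against gault.
expected_stage = {"$group": {"_id": "$category", "doc_count": {"$count": {}}}}

# Made-up sample documents.
docs = [
    {"category": "a", "price": 10},
    {"category": "a", "price": 20},
    {"category": "b", "price": 5},
]

# One increment per document, keyed by the grouping value.
doc_count = Counter(doc["category"] for doc in docs)
```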
Min
Returns the minimum value.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a comparable value. |
Max
Returns the maximum value.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a comparable value. |
First
Returns the value from the first document in each group. Order depends on the preceding $sort stage.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression to evaluate. |
Example:
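A minimal sketch of why the result depends on document order, in plain Python. The helper `first_per_group` and the sample documents are hypothetical and only emulate the "first document seen per group" behaviour. `sorted` stands in for a preceding `$sort` stage.

```python
def first_per_group(docs, key, field):
    """Return {group: field value of the first document seen in that group}."""
    result = {}
    for doc in docs:
        # setdefault keeps the value from the first document of each group.
        result.setdefault(doc[key], doc[field])
    return result


# Made-up sample documents, in arrival order.
docs = [
    {"category": "a", "name": "x", "price": 30},
    {"category": "b", "name": "y", "price": 5},
    {"category": "a", "name": "z", "price": 10},
]

unsorted_first = first_per_group(docs, "category", "name")

# Emulate a preceding sort on price, ascending.
by_price = sorted(docs, key=lambda doc: doc["price"])
cheapest_first = first_per_group(by_price, "category", "name")
```

Group "a" yields a different name in the two runs, which is why First is usually paired with an explicit sort.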
Last
Returns the value from the last document in each group.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression to evaluate. |
FirstN
Returns the first n values in each group.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression to evaluate. |
| n | int | Number of values to return. |
LastN
Returns the last n values in each group.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression to evaluate. |
| n | int | Number of values to return. |
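FirstN and LastN can be pictured as slices over each group's values in arrival order. The sketch below is plain Python with made-up data. It emulates the semantics only and does not call gault.

```python
# Made-up values per group, listed in the order the documents arrive.
scores = {"a": [3, 1, 4, 1, 5], "b": [9, 2]}
n = 3

# First n and last n values of each group; a group with fewer than n
# values simply contributes everything it has.
first_n = {group: values[:n] for group, values in scores.items()}
last_n = {group: values[-n:] for group, values in scores.items()}
```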
Push
Returns an array of all values for each group (including duplicates).
| Parameter | Type | Description |
|---|---|---|
| input | AnyExpression | Expression to evaluate. |
Example:
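A minimal sketch of the collecting behaviour in plain Python. The `expected_stage` dict is an assumption (Push mapping to `$push`, in the same shape as the Sum example), and the sample documents are made up.

```python
from collections import defaultdict

# Assumed stage for Pipeline().group({"tags": Push("$tag")}, by="$category");
# not verified against gault.
expected_stage = {"$group": {"_id": "$category", "tags": {"$push": "$tag"}}}

# Made-up sample documents.
docs = [
    {"category": "a", "tag": "red"},
    {"category": "a", "tag": "red"},
    {"category": "a", "tag": "blue"},
    {"category": "b", "tag": "green"},
]

# Append every value, duplicates included, in arrival order.
pushed = defaultdict(list)
for doc in docs:
    pushed[doc["category"]].append(doc["tag"])
```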
AddToSet
Returns an array of unique values for each group.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression to evaluate. |
Example:
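A minimal sketch of the deduplicating behaviour in plain Python. The `expected_stage` dict is an assumption (AddToSet mapping to `$addToSet`), and the sample documents are made up. Python sets are used because MongoDB does not specify the order of elements in the resulting array.

```python
from collections import defaultdict

# Assumed stage for Pipeline().group({"tags": AddToSet("$tag")}, by="$category");
# not verified against gault.
expected_stage = {"$group": {"_id": "$category", "tags": {"$addToSet": "$tag"}}}

# Made-up sample documents.
docs = [
    {"category": "a", "tag": "red"},
    {"category": "a", "tag": "red"},
    {"category": "a", "tag": "blue"},
    {"category": "b", "tag": "green"},
]

# Keep each distinct value once per group.
unique_tags = defaultdict(set)
for doc in docs:
    unique_tags[doc["category"]].add(doc["tag"])
```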
Top
Returns the top element within a group according to a sort order.
| Parameter | Type | Description |
|---|---|---|
| sort_by | SortPayload | Sort specification. |
| output | AnyExpression \| list[AnyExpression] | Expression(s) to return. |
TopN
Returns the top n elements within a group.
| Parameter | Type | Description |
|---|---|---|
| n | int | Number of elements. |
| sort_by | SortPayload | Sort specification. |
| output | AnyExpression \| list[AnyExpression] | Expression(s) to return. |
Example:
TopN(n=5, sort_by={"score": -1}, output="$name")
# {"$topN": {"n": 5, "sortBy": {"score": -1}, "output": "$name"}}
Bottom
Returns the bottom element within a group according to a sort order.
| Parameter | Type | Description |
|---|---|---|
| sort_by | SortPayload | Sort specification. |
| output | AnyExpression \| list[AnyExpression] | Expression(s) to return. |
Example:
Bottom(sort_by={"score": 1}, output="$name")
# {"$bottom": {"sortBy": {"score": 1}, "output": "$name"}}
BottomN
Returns the bottom n elements within a group.
| Parameter | Type | Description |
|---|---|---|
| n | int | Number of elements. |
| sort_by | SortPayload | Sort specification. |
| output | AnyExpression \| list[AnyExpression] | Expression(s) to return. |
Example:
BottomN(n=3, sort_by={"score": 1}, output=["$name", "$score"])
# {"$bottomN": {"n": 3, "sortBy": {"score": 1}, "output": ["$name", "$score"]}}
MinN
Returns the n minimum valued elements within a group.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression to evaluate. |
| n | int | Number of minimum values. |
MaxN
Returns the n maximum valued elements within a group.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression to evaluate. |
| n | int | Number of maximum values. |
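MinN and MaxN select values by magnitude rather than by document order. A minimal sketch of that selection for a single group, using Python's standard `heapq` module and made-up scores (this emulates the semantics and does not call gault):

```python
import heapq

# Made-up values of one group.
scores = [70, 95, 40, 88, 61]

# The three smallest and the three largest values.
min_3 = heapq.nsmallest(3, scores)
max_3 = heapq.nlargest(3, scores)
```

Unlike FirstN and LastN, the result here does not change if the documents arrive in a different order.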
Median
Returns an approximation of the median value. Uses the "approximate" method.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a number. |
Percentile
Returns an approximation of percentile values. Uses the "approximate" method.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a number. |
| p | list[float] | Percentile values between 0.0 and 1.0 inclusive. |
Example:
Percentile("$score", p=[0.25, 0.5, 0.75])
# {"$percentile": {"input": "$score", "p": [0.25, 0.5, 0.75], "method": "approximate"}}
StdDevPop
Returns the population standard deviation of the input values.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a number. |
StdDevSamp
Returns the sample standard deviation of the input values.
| Parameter | Type | Description |
|---|---|---|
| input | NumberExpression | Expression that resolves to a number. |
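The difference between StdDevPop and StdDevSamp is the divisor: the population form divides the sum of squared deviations by the number of values, the sample form by one less. A minimal sketch with Python's standard `statistics` module and made-up values (it illustrates the two formulas and does not call gault):

```python
import statistics

# Made-up values of one group; their mean is 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population form: treats the values as the entire population (divide by N).
population_std = statistics.pstdev(values)

# Sample form: treats the values as a sample of a larger population (divide by N - 1).
sample_std = statistics.stdev(values)
```

The sample form is always the larger of the two for the same data, and it needs at least two values to be defined.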
MergeObjects
Combines multiple documents into a single document. When used as a group accumulator, merges all documents in the group.
| Parameter | Type | Description |
|---|---|---|
| input | ObjectExpression | Expression that resolves to a document. |
Example:
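A minimal sketch of the merge behaviour in plain Python. The `expected_stage` dict is an assumption (MergeObjects mapping to `$mergeObjects`), and the sample documents are made up. As with MongoDB's operator, when two documents share a field name the value from the later document wins.

```python
# Assumed stage for Pipeline().group({"details": MergeObjects("$details")}, by="$order");
# not verified against gault.
expected_stage = {"$group": {"_id": "$order", "details": {"$mergeObjects": "$details"}}}

# Made-up sample documents belonging to the same group.
docs = [
    {"order": 1, "details": {"color": "red", "size": "M"}},
    {"order": 1, "details": {"size": "L", "stock": 4}},
]

# Fold the sub-documents together; later fields overwrite earlier ones.
merged = {}
for doc in docs:
    merged.update(doc["details"])
```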
Quick reference table
| Accumulator | MongoDB operator | Parameters | Description |
|---|---|---|---|
| Sum | $sum | input | Sum of numeric values |
| Avg | $avg | input | Average of numeric values |
| Count | $count | (none) | Document count |
| Min | $min | input | Minimum value |
| Max | $max | input | Maximum value |
| First | $first | input | First value in group |
| Last | $last | input | Last value in group |
| FirstN | $firstN | input, n | First N values |
| LastN | $lastN | input, n | Last N values |
| Push | $push | input | Array of all values |
| AddToSet | $addToSet | input | Array of unique values |
| Top | $top | sort_by, output | Top element by sort |
| TopN | $topN | n, sort_by, output | Top N elements by sort |
| Bottom | $bottom | sort_by, output | Bottom element by sort |
| BottomN | $bottomN | n, sort_by, output | Bottom N elements by sort |
| MinN | $minN | input, n | N minimum values |
| MaxN | $maxN | input, n | N maximum values |
| Median | $median | input | Approximate median |
| Percentile | $percentile | input, p | Approximate percentiles |
| StdDevPop | $stdDevPop | input | Population std deviation |
| StdDevSamp | $stdDevSamp | input | Sample std deviation |
| MergeObjects | $mergeObjects | input | Merge documents |