aggregate

aggregate turns row-level data into named summaries. It emits one row per group, or a single global summary row when groupBy is the empty array [].

This is the node to reach for when the output columns are the story: counts by site, mean score by arm, maximum date per account, or a compact table for a report.

| Field | Required | Notes |
| --- | --- | --- |
| inputs | yes | Exactly one upstream table. |
| groupBy | yes | Array of expressions. Empty array means one global summary row. |
| metrics | yes | One or more alias expressions like "[mean_age] = [age].mean()". |

  • Each metric should be an alias expression, for example "[mean_score] = [score].mean()".
  • Keep metric names report-ready. Anonymous or machine-looking aliases make the resulting table harder to review.
  • Common reducers include .sum(), .mean(), .count(), .min(), .max(), .n_unique(), and .distinct().
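As a rough illustration of the naming advice above, a metrics list with report-ready aliases might look like the fragment below. The column names participant_id and score are hypothetical, and the sketch assumes every listed reducer is called in the same [column].reducer() form as .mean().

```yaml
# Sketch only: column names are assumed, not taken from a real table.
metrics:
  - "[n_participants] = [participant_id].n_unique()"  # distinct participants per group
  - "[min_score] = [score].min()"
  - "[max_score] = [score].max()"
  - "[total_score] = [score].sum()"
```

Each alias says what the column holds, so the resulting table can be reviewed without looking back at the node definition.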

The default output contains the group keys plus the metric columns. Because the row count usually collapses to one row per group, output shape is the first thing to inspect.

For more complex windowed reductions or custom statistics, move to a Python, R, JavaScript, or SQL node.

- id: by_site
  kind: aggregate
  inputs: [data] # length 1
  groupBy: ["[site]"]
  metrics:
    - "[mean_score] = [score].mean()"
    - "[n] = [score].count()"
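
The global-summary case can be sketched the same way. This variant reuses the upstream table data and the score column from the by_site example; the node id overall is a hypothetical name chosen for illustration.

```yaml
- id: overall            # hypothetical node id
  kind: aggregate
  inputs: [data]         # still exactly one upstream table
  groupBy: []            # empty array: one global summary row
  metrics:
    - "[mean_score] = [score].mean()"
    - "[n] = [score].count()"
```

With no group keys, the output should be a single row holding only the metric columns mean_score and n.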