R language nodes
An R language node uses kind: r. You write a top-level run <- function(...)
or transform <- function(...), and Rime calls it with named arguments from the
YAML in: map.
Minimum Example

    - id: efficiency
      kind: r
      source: scripts/efficiency.R
      in:
        cohort: features
        threshold: params.threshold

    run <- function(cohort, threshold) {
      cohort$flag <- cohort$score > as.numeric(threshold)
      cohort
    }

There is no Rime-specific registration call. The function name and YAML slots are the contract.
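Because the contract is only the function name plus its argument names, the entrypoint can be exercised in a plain R session before it is wired into a DAG. The sketch below is one way to do that. The `inputs` list is a hypothetical stand-in for whatever Rime resolves from the `in:` map, and `do.call` is used only to mimic a named-argument call; it is not a description of how Rime dispatches internally.

```r
# The entrypoint from the minimum example.
run <- function(cohort, threshold) {
  cohort$flag <- cohort$score > as.numeric(threshold)
  cohort
}

# Hypothetical stand-ins for the values the YAML slots would resolve to.
inputs <- list(
  cohort = data.frame(id = 1:3, score = c(0.2, 0.7, 0.9)),
  threshold = 0.5
)

# Slot names must line up with the function's formal arguments.
stopifnot(setequal(names(inputs), names(formals(run))))

# Call the entrypoint with named arguments, as the contract describes.
result <- do.call(run, inputs)
print(result)
```

A mismatch between a slot name and an argument name fails at the `stopifnot` line, which is usually easier to read than an "unused argument" error raised from inside a pipeline run.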
Function Signature
| YAML in: slot | R value |
|---|---|
| Upstream node ref, for example cohort: features | data.frame/tibble-like table |
| Param ref, for example threshold: params.threshold | native scalar/list |
The default entrypoint is run. transform is accepted for older scripts. To
use another function, set entrypoint: on the node.

    run <- function(cohort, lookup, threshold) {
      # cohort and lookup are tabular values.
      # threshold came from params.threshold.
    }

If your function declares params, Rime passes the resolved params object.
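As an illustration of the `params` argument, here is a minimal sketch. It assumes the resolved params object arrives as a named R list, which the source does not spell out, and `threshold` is a hypothetical key chosen for the example.

```r
run <- function(cohort, params) {
  # Assumption: params is a named list; `threshold` is a hypothetical key.
  # Fall back to 0 when the key is absent.
  threshold <- if (is.null(params$threshold)) 0 else as.numeric(params$threshold)
  cohort[cohort$score > threshold, , drop = FALSE]
}

# Local check with hand-built inputs.
cohort <- data.frame(id = 1:4, score = c(0.1, 0.4, 0.6, 0.8))
kept <- run(cohort, params = list(threshold = 0.5))
kept_default <- run(cohort, params = list())
```

Reading from `params` inside the function keeps the YAML `in:` map short when a script needs many settings, at the cost of a less explicit signature.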
Outputs
Single Output
Return a data.frame or tibble:

    run <- function(orders) {
      orders[orders$total > 0, ]
    }

Downstream nodes reference the default output by the node ID.
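Two base R behaviors are worth keeping in mind when filtering rows this way: a comparison against `NA` yields `NA`, which produces all-`NA` rows in the result, and subsetting a one-column data.frame without `drop = FALSE` returns a bare vector. The source does not say how Rime treats either case, so the sketch below simply avoids both.

```r
run <- function(orders) {
  # Treat missing totals as "not positive" instead of letting NA rows through.
  keep <- !is.na(orders$total) & orders$total > 0
  # drop = FALSE keeps the result a data.frame even with a single column.
  orders[keep, , drop = FALSE]
}

orders <- data.frame(total = c(10, NA, -5, 3))
out <- run(orders)
```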
Multiple Outputs
Declare named outputs in YAML and return a named list with matching values:

    - id: split
      kind: r
      source: scripts/split.R
      in: { cohort: features }
      out: { train: table, test: table }

    run <- function(cohort) {
      set.seed(42)
      train_idx <- sample(seq_len(nrow(cohort)), size = floor(nrow(cohort) * 0.8))
      list(
        train = cohort[train_idx, ],
        test = cohort[-train_idx, ]
      )
    }

Downstream refs are split.train and split.test.
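The list names returned by the function have to match the names declared under `out:`, so a quick local check can catch a typo before a pipeline run. The sketch below is a variant of the split function, not the documented one: it computes the test rows with `setdiff` because negative indexing with an empty `train_idx` (for example, a one-row input) selects zero rows in R. The `declared` vector is a hand-written mirror of the YAML, not something Rime provides.

```r
run <- function(cohort) {
  set.seed(42)
  n <- nrow(cohort)
  train_idx <- sample(seq_len(n), size = floor(n * 0.8))
  # setdiff stays correct when train_idx is empty; cohort[-integer(0), ] would not.
  test_idx <- setdiff(seq_len(n), train_idx)
  list(
    train = cohort[train_idx, , drop = FALSE],
    test = cohort[test_idx, , drop = FALSE]
  )
}

# Hand-written mirror of `out: { train: table, test: table }`.
declared <- c("train", "test")

cohort <- data.frame(id = 1:10, score = seq(0.1, 1, by = 0.1))
result <- run(cohort)

# Output names match the declaration, and the two parts partition the input.
stopifnot(setequal(names(result), declared))
stopifnot(nrow(result$train) + nrow(result$test) == nrow(cohort))
stopifnot(length(intersect(result$train$id, result$test$id)) == 0)

# Edge case: a single row ends up in test rather than disappearing.
tiny <- run(data.frame(id = 1L, score = 0.5))
```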
Non-Tabular Output
For model summaries, fitted parameters, or compact JSON-like values, declare an
any output:

    out: { result: any }

    run <- function(cohort) {
      fit <- lm(score ~ age + treatment, data = cohort)
      list(
        coefficients = as.list(coef(fit)),
        r_squared = summary(fit)$r.squared
      )
    }

Plot Capture
Rime captures R plot candidates returned by the entrypoint, including ggplot objects, recorded plots, grobs, and gTrees.
For a plot-only diagnostic node:

    - id: score_plot
      kind: r
      source: scripts/score_plot.R
      in: { cohort: features }
      out: { result: any }

    run <- function(cohort) {
      library(ggplot2)
      ggplot(cohort, aes(x = age, y = score)) + geom_point()
    }

For table-producing analysis, keep the returned value tabular. If you need a publishable plot and a table, use separate nodes so each output has a clear type.
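Recorded plots are listed among the captured plot candidates, so a base-graphics node is possible without ggplot2. The following is a sketch under assumptions: the source does not say whether the runner provides a graphics device, so the function opens its own off-screen `pdf(NULL)` device and enables the display list so that `recordPlot()` has something to record.

```r
run <- function(cohort) {
  # Assumption: no device is supplied by the runner, so open an off-screen one.
  pdf(NULL)
  on.exit(dev.off(), add = TRUE)
  # The display list is off by default on file devices; recordPlot() needs it.
  dev.control(displaylist = "enable")
  plot(cohort$age, cohort$score, xlab = "age", ylab = "score")
  # Return the recorded plot object as the node's result.
  recordPlot()
}

cohort <- data.frame(age = c(30, 42, 55), score = c(0.4, 0.6, 0.9))
p <- run(cohort)
```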
Runtime Model
R nodes run in a warm R runner session for the selected Rscript during a
CLI/editor run. The runner is isolated from the host Node process, but startup
is amortized across R nodes that share the same interpreter.
Inputs and outputs move through Rime’s Arrow-backed artifact path. The runner reads upstream tables into R tabular values, calls your function, and returns tables or JSON-like objects to the runtime.
Environment
Required: R 4.0+ with arrow, jsonlite, and tibble.

    install.packages(c("arrow", "jsonlite", "tibble"))

Point Rime at a specific Rscript:

    rime run pipeline.dag.yaml --rscript-bin "$(which Rscript)"

Or inline in the DAG:

    interpreters:
      r: /usr/local/bin/Rscript

See Also
- Python language nodes - same slot protocol, pandas native
- JavaScript language nodes - defineNode and row arrays
- SQL language nodes - DuckDB temp tables
- Language node reference - full field list
- Polyglot runtime overview - cross-language design