Skip to content

R language nodes

An R language node uses kind: r. You write a top-level run <- function(...) or transform <- function(...), and Rime calls it with named arguments from the YAML in: map.

- id: efficiency
kind: r
source: scripts/efficiency.R
in:
cohort: features
threshold: params.threshold
scripts/efficiency.R
run <- function(cohort, threshold) {
cohort$flag <- cohort$score > as.numeric(threshold)
cohort
}

There is no Rime-specific registration call. The function name and YAML slots are the contract.

YAML in: slotR value
Upstream node ref, for example cohort: featuresdata.frame/tibble-like table
Param ref, for example threshold: params.thresholdnative scalar/list

The default entrypoint is run. transform is accepted for older scripts. To use another function, set entrypoint: on the node.

run <- function(cohort, lookup, threshold) {
# cohort and lookup are tabular values.
# threshold came from params.threshold.
}

If your function declares params, Rime passes the resolved params object.

Return a data.frame or tibble:

run <- function(orders) {
orders[orders$total > 0, ]
}

Downstream nodes reference the default output by the node ID.

Declare named outputs in YAML and return a named list with matching values:

- id: split
kind: r
source: scripts/split.R
in: { cohort: features }
out: { train: table, test: table }
run <- function(cohort) {
set.seed(42)
train_idx <- sample(seq_len(nrow(cohort)), size = floor(nrow(cohort) * 0.8))
list(
train = cohort[train_idx, ],
test = cohort[-train_idx, ]
)
}

Downstream refs are split.train and split.test.

For model summaries, fitted parameters, or compact JSON-like values, declare an any output:

out: { result: any }
run <- function(cohort) {
fit <- lm(score ~ age + treatment, data = cohort)
list(
coefficients = as.list(coef(fit)),
r_squared = summary(fit)$r.squared
)
}

Rime captures R plot candidates returned by the entrypoint, including ggplot objects, recorded plots, grobs, and gTrees.

For a plot-only diagnostic node:

- id: score_plot
kind: r
source: scripts/score_plot.R
in: { cohort: features }
out: { result: any }
run <- function(cohort) {
library(ggplot2)
ggplot(cohort, aes(x = age, y = score)) + geom_point()
}

For table-producing analysis, keep the returned value tabular. If you need a publishable plot and a table, use separate nodes so each output has a clear type.

R nodes run in a warm R runner session for the selected Rscript during a CLI/editor run. The runner is isolated from the host Node process, but startup is amortized across R nodes that share the same interpreter.

Inputs and outputs move through Rime’s Arrow-backed artifact path. The runner reads upstream tables into R tabular values, calls your function, and returns tables or JSON-like objects to the runtime.

Required: R 4.0+ with arrow, jsonlite, and tibble.

install.packages(c("arrow", "jsonlite", "tibble"))

Point Rime at a specific Rscript:

Terminal window
rime run pipeline.dag.yaml --rscript-bin "$(which Rscript)"

Or inline in the DAG:

interpreters:
r: /usr/local/bin/Rscript