Rime

dbt-style pipelines across Python, R, JavaScript, and SQL. Declare your data retrievals and transforms in one file; Rime handles caching, logs, validation, outputs, and reports.


What is Rime?

Rime is a runtime for reproducible data work. You declare the pipeline once in pipeline.dag.yaml. Rime runs the graph, caches each node, captures logs, validates outputs, and writes artifacts.

specification_version: "2.1"

nodes:
  - id: raw_orders
    kind: sql
    source: queries/load_orders.sql

  - id: order_metrics
    kind: derive
    inputs: [raw_orders]
    as: revenue
    expr: "[unit_price] * [quantity]"

  - id: sales_chart
    kind: python
    source: scripts/plot_sales.py
    in:
      orders: order_metrics

Here, SQL imports the data, the derive node computes one reviewable feature with Rime’s expression language, and Python plots the result. Rime captures intermediate data and script side effects, then produces a report with a runtime overview like this.

[Figure: A Rime DAG where a SQL node feeds a derive node, then a Python node.]
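
To make the derive step concrete, here is a rough Python sketch of the computation that `expr: "[unit_price] * [quantity]"` with `as: revenue` appears to describe, assuming bracketed names refer to columns of the input table and the result is added as a new column. The sample rows and the `derive_revenue` helper are made up for illustration; this is not Rime's implementation.

```python
# Hypothetical sample rows standing in for the raw_orders table.
raw_orders = [
    {"order_id": 1, "unit_price": 4.50, "quantity": 2},
    {"order_id": 2, "unit_price": 10.00, "quantity": 3},
]


def derive_revenue(rows):
    """Return new rows with a revenue column; the input rows are not mutated."""
    return [{**row, "revenue": row["unit_price"] * row["quantity"]} for row in rows]


order_metrics = derive_revenue(raw_orders)
for row in order_metrics:
    print(row["order_id"], row["revenue"])
```

The function takes a table and returns a table, with no file reads or writes, which is the shape the declarative node is meant to capture.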

Why Rime

⚡ Functions, not jobs

A node is a function over dataframes, not a task that wires I/O. You write what each step computes; the runtime owns reading, writing, serialization, and language boundaries. The dbt mental model, extended past SQL.

🧰 One DAG, four languages

SQL for joins, Python for ML, R for stats, JavaScript for everything else. Same pipeline, named slots, typed boundaries. Dataframes cross language borders through Arrow-backed payloads instead of ad hoc CSV handoffs.
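
One reason typed boundaries matter: a CSV handoff drops column types, so the receiving step has to guess them. The standard-library sketch below shows the loss on a made-up table. It says nothing about how Rime's Arrow-backed payloads are implemented.

```python
import csv
import io

# Hypothetical typed rows produced by an upstream step.
rows = [{"order_id": 1, "quantity": 3, "unit_price": 4.5}]

# Write the rows to CSV text, as an ad hoc handoff would.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["order_id", "quantity", "unit_price"])
writer.writeheader()
writer.writerows(rows)

# Read them back: every value is now a string.
buffer.seek(0)
round_tripped = list(csv.DictReader(buffer))

print(type(rows[0]["quantity"]).__name__)           # int
print(type(round_tripped[0]["quantity"]).__name__)  # str
```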

🔒 Reproducible by default

Content-addressed caching, deterministic outputs, freeze-able snapshots. Same script plus same inputs means the same artifact, every time. No “works on my machine.”
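
As an illustration of content-addressed caching in general (Rime's actual key scheme is not described here), a cache key can be derived from the bytes of the script and of its inputs, so that identical content yields the same key. The `cache_key` helper and the sample bytes are hypothetical.

```python
import hashlib


def cache_key(script: bytes, inputs: dict[str, bytes]) -> str:
    """Hash the script and each named input into one hex digest."""
    digest = hashlib.sha256()
    digest.update(hashlib.sha256(script).digest())
    # Sort by slot name so the key does not depend on dict ordering.
    for name in sorted(inputs):
        digest.update(hashlib.sha256(name.encode("utf-8")).digest())
        digest.update(hashlib.sha256(inputs[name]).digest())
    return digest.hexdigest()


script = b"plot(orders)"
key_a = cache_key(script, {"orders": b"id,qty\n1,3\n"})
key_b = cache_key(script, {"orders": b"id,qty\n1,3\n"})
key_c = cache_key(script, {"orders": b"id,qty\n1,4\n"})

print(key_a == key_b)  # True: same script and same inputs
print(key_a == key_c)  # False: one input changed
```

A runtime that stores artifacts under such a key can skip a node whenever the key already exists, and must rerun it whenever the script or any input changes.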

📄 Publishable narratives

Render a publishable HTML report directly from your DAG. Tables, stats, stdout, figures, and node status: one render step, one document, one source of truth.

How Rime is different

Airflow and Prefect orchestrate recurring jobs; Rime runs locally, one run at a time.
In those tools, reads, writes, retries, error handling, and persistence are usually coded inside tasks.
Rime owns dataframe handoff, execution order, caching, logs, validation, and outputs.

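# Prefect-style flow: each task does its own reads and writes.
# read_sql and plot stand in for project helpers that are not shown.
import pandas as pd
from prefect import flow, task

@task
def load_orders():
    orders = read_sql("SELECT * FROM orders")
    orders.to_parquet("outputs/raw_orders.parquet")

@task
def plot_sales():
    orders = pd.read_parquet("outputs/raw_orders.parquet")
    plot(orders)

@flow
def nightly_sales():
    load_orders()
    plot_sales()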

nodes:
  - id: raw_orders
    kind: sql
    source: queries/load_orders.sql

  - id: sales_chart
    kind: python
    source: scripts/plot_sales.py
    in:
      orders: raw_orders
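
The `in:` mappings in a node list like the one above are enough to derive an execution order. Here is a minimal sketch using Python's standard `graphlib`, with the dependencies written as a plain dict. The `depends_on` structure is an assumption for illustration, not Rime's internal representation.

```python
from graphlib import TopologicalSorter

# Each node id maps to the ids it reads from (taken from the `in:` slots).
depends_on = {
    "raw_orders": set(),
    "sales_chart": {"raw_orders"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order)  # ['raw_orders', 'sales_chart']
```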

Hex is a proprietary notebook-style workspace; Rime is open source and file-backed.
Rime projects are portable pipeline.dag.yaml files.
Rime’s Directed Acyclic Graph (DAG) is made of functions, not inline notebook modifications.

# SQL cell: raw_orders
SELECT *
FROM warehouse.orders

# Python cell: order_metrics
order_metrics = raw_orders.copy()
order_metrics["revenue"] = (
    order_metrics["unit_price"] * order_metrics["quantity"]
)

# Python cell: sales_chart
plot_sales(order_metrics)

nodes:
  - id: raw_orders
    kind: sql
    source: queries/load_orders.sql

  - id: order_metrics
    kind: derive
    inputs: [raw_orders]
    as: revenue
    expr: "[unit_price] * [quantity]"

  - id: sales_chart
    kind: python
    source: scripts/plot_sales.py
    in:
      orders: order_metrics

Snakemake dependencies are based on files; Rime dependencies are based on nodes.
In Snakemake, reads, writes, error handling, and data persistence are usually coded manually in rules or scripts.
Snakemake freshness is based mainly on file timestamps and metadata, not runtime-managed node identity.

rule load_orders:
    output: "outputs/raw_orders.parquet"
    shell: "python scripts/load_orders.py {output}"

rule plot_sales:
    input: "outputs/raw_orders.parquet"
    output: "outputs/sales.png"
    shell: "python scripts/plot_sales.py {input} {output}"

nodes:
  - id: raw_orders
    kind: python
    source: scripts/load_orders.py

  - id: sales_chart
    kind: python
    source: scripts/plot_sales.py
    in:
      orders: raw_orders
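
The difference between timestamp-based and content-based freshness can be seen with a file whose modification time changes while its bytes do not. This is a generic sketch, not a description of Snakemake's or Rime's internals, and the file name is made up.

```python
import hashlib
import os
import tempfile


def content_hash(path: str) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "raw_orders.csv")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("id,qty\n1,3\n")

    mtime_before = os.stat(path).st_mtime
    hash_before = content_hash(path)

    # Bump the modification time by an hour without touching the content.
    os.utime(path, (mtime_before + 3600, mtime_before + 3600))

    mtime_changed = os.stat(path).st_mtime != mtime_before
    hash_changed = content_hash(path) != hash_before

print(mtime_changed)  # True: a timestamp check would treat the file as newer
print(hash_changed)   # False: a content check sees the same bytes
```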

Choose your surface

The Editor and the CLI both consume the same pipeline.dag.yaml. Start visually, drop to YAML, or run the same project in CI.

Rime Editor: Visual DAG authoring, live table previews, data diffs, and no-code core-node workflows.

What is Rime? The two-minute pitch: core nodes, script functions, reports, and where Rime fits.

Two ways to use Rime: Learn when to use core nodes, when to use script nodes, and what the YAML file is doing.

Quick start: Author your first DAG in 10 minutes. Core nodes, script nodes, and an HTML report, end to end.

Real pipelines you can clone

Cars × CO₂ emissions: SQL source + JS API fetch + Python UMAP + R regression → HTML narrative. The canonical multi-language example.

Penguin classifier: Single-file teaching pipeline. Pivot, derive, t-test, and plot: the smallest interesting DAG.

Embed in Node: Use @rimekit/runtime programmatically from any Node script. For headless contexts and CI plugins.

DuckDB single source: Minimal SQL-only pipeline. Good for ad-hoc warehouse reports without leaving the SQL world.