Expression language
Rime’s expression language is the shared formula syntax behind core transform nodes. It is intentionally smaller than Python or SQL: enough for row filters, feature columns, grouping keys, aggregate metrics, sort keys, and expression join keys, while staying readable in YAML and inspectable in the editor.
Where Expressions Appear
| Node | Fields | Meaning |
|---|---|---|
| `filter` | `expr` | Keep rows where the expression is truthy |
| `derive` | `expr` | Compute one new column named by `as` |
| `aggregate` | `groupBy`, `metrics` | Define grouping keys and named reductions |
| `select` | `columns` | Runtime projection expressions; schema currently restricts these to identifiers |
| `sort` | `by[].expr` | Compute sort keys |
| `join` | `leftKey`, `rightKey` | Bare column names, or expressions when the key is not a bare identifier |
Column References
Column names go in square brackets.

```yaml
expr: "[age] >= 18"
expr: "[Cost of Goods Sold] / [revenue]"
```

Use brackets even when a column name looks like an identifier. That keeps expressions visually distinct from YAML field names and string literals.
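Because every column reference is bracketed, the columns an expression depends on can be listed with a simple pattern match. The sketch below is a hypothetical helper, not part of Rime and not its parser; `column_refs` and `COLUMN_REF` are names invented for illustration, and the pattern assumes that string literals in the expression never contain square brackets.

```python
import re

# Hypothetical helper (not a Rime API): matches one bracketed column name.
# Assumes brackets never appear inside string literals.
COLUMN_REF = re.compile(r"\[([^\[\]]+)\]")


def column_refs(expr: str) -> list[str]:
    """Return referenced column names in order of first appearance."""
    seen: list[str] = []
    for name in COLUMN_REF.findall(expr):
        if name not in seen:
            seen.append(name)
    return seen


print(column_refs("[Cost of Goods Sold] / [revenue]"))
# ['Cost of Goods Sold', 'revenue']
```

Names containing spaces come back intact, which is the main reason the bracket syntax is convenient for tooling.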
Literals And Operators
Expressions support numbers, strings, booleans, and null:

```yaml
expr: "[status] == 'active' and [score] >= 0.8"
expr: "[site] in ('north', 'south')"
expr: "not ([deleted] == true)"
```

Supported operator groups:
| Group | Operators |
|---|---|
| Arithmetic | +, -, *, /, unary - |
| Comparison | ==, !=, >, >=, <, <= |
| Boolean | and, or, not |
| Membership | in (...) with a parenthesized literal list |
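To make the comparison, boolean, and membership operators concrete, here is a rough Python equivalent of a filter such as `[age] >= 18 and [site] in ('north', 'south')`. It is only a sketch of the intended reading: the sample rows and the `cohort_filter` name are made up for illustration, and it does not model how null values are handled.

```python
# Illustrative rows; values are invented for this sketch.
rows = [
    {"age": 34, "site": "north"},
    {"age": 16, "site": "south"},
    {"age": 52, "site": "east"},
]


def cohort_filter(row: dict) -> bool:
    """Python reading of: [age] >= 18 and [site] in ('north', 'south')."""
    return row["age"] >= 18 and row["site"] in ("north", "south")


# A filter keeps only the rows where the expression is truthy.
kept = [row for row in rows if cohort_filter(row)]
print(kept)
# [{'age': 34, 'site': 'north'}]
```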
Parentheses work for grouping.
```yaml
expr: "([crp_mean] * 2.0 + [ldl_max] * 0.05) / [n_visits]"
```

Function Calls
Function calls take one or more expressions as arguments and combine them within each row.
| Function | Use |
|---|---|
| `coalesce(a, b, ...)` | Fill null values from the next expression |
| `concat(a, b, ...)` | Concatenate expressions as strings |
| `max(a, b, ...)` | Horizontal maximum across expressions |
| `min(a, b, ...)` | Horizontal minimum across expressions |
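The row-wise behavior of these functions can be sketched in plain Python. This is an illustration under stated assumptions, not Rime's implementation: `None` stands in for null, the function names `horizontal_max` and `horizontal_min` are invented here, and skipping nulls in the horizontal maximum and minimum is an assumption rather than documented behavior.

```python
from typing import Any


def coalesce(*values: Any) -> Any:
    """Return the first argument that is not None, else None."""
    for value in values:
        if value is not None:
            return value
    return None


def horizontal_max(*values: Any) -> Any:
    """Largest argument within one row. Skipping None is an assumption."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def horizontal_min(*values: Any) -> Any:
    """Smallest argument within one row. Skipping None is an assumption."""
    present = [v for v in values if v is not None]
    return min(present) if present else None


# One illustrative row with a missing baseline score.
row = {"baseline_score": None, "followup_score": 7, "crp": 3.5, "ldl": 2.0}
total = coalesce(row["baseline_score"], 0) + coalesce(row["followup_score"], 0)
print(total)                                   # 7
print(horizontal_max(row["crp"], row["ldl"]))  # 3.5
```

The key point is that each call works across the arguments of a single row, unlike the aggregate methods, which reduce a column over a group.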
Example:
```yaml
- id: risk_index
  kind: derive
  inputs: [lab_load]
  as: risk_index
  expr: "coalesce([crp_mean], 0) * 2.0 + coalesce([ldl_max], 0) * 0.05"
```

Column Methods
Methods hang off a column or expression.
| Method | Common place | Use |
|---|---|---|
| `.uppercase()`, `.lowercase()` | derive, sort, join keys | Normalize strings |
| `.to_date()`, `.to_int()`, `.to_float()`, `.to_string()` | derive | Cast values |
| `.sum()`, `.mean()`, `.count()`, `.min()`, `.max()` | aggregate metrics | Reduce a group |
| `.n_unique()`, `.distinct()` | aggregate metrics | Count distinct values |
| `.lag(n)`, `.lead(n)` | sort/derive patterns | Shift values |
| `.rolling_mean(n)` | feature engineering | Rolling average |
| `.first_value()`, `.rank()` | grouped/window-like features | First value or rank |
| `.sort_by(expr)`, `.over(expr)` | advanced Polars-backed expressions | Sort/window context |
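The shifting and rolling methods are the least obvious entries in the table, so a plain-Python sketch of one plausible reading may help. The exact semantics are assumptions, not confirmed Rime behavior: rows are taken to be already sorted, `None` stands in for null, `lag` and `lead` fill the vacated positions with null, and `rolling_mean(n)` averages the current row with the `n - 1` rows before it, yielding null until a full window is available.

```python
from typing import Optional, Sequence


def lag(values: Sequence[float], n: int) -> list[Optional[float]]:
    """Value from n rows earlier; None where no earlier row exists (n >= 0)."""
    return [values[i - n] if i - n >= 0 else None for i in range(len(values))]


def lead(values: Sequence[float], n: int) -> list[Optional[float]]:
    """Value from n rows later; None where no later row exists (n >= 0)."""
    size = len(values)
    return [values[i + n] if i + n < size else None for i in range(size)]


def rolling_mean(values: Sequence[float], n: int) -> list[Optional[float]]:
    """Mean of the current row and the n - 1 rows before it (assumed window)."""
    out: list[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < n:
            out.append(None)  # window not yet full
        else:
            window = values[i + 1 - n : i + 1]
            out.append(sum(window) / n)
    return out


print(lag([1, 2, 3], 1))                        # [None, 1, 2]
print(lead([1, 2, 3], 1))                       # [2, 3, None]
print(rolling_mean([2.0, 4.0, 6.0, 8.0], 2))    # [None, 3.0, 5.0, 7.0]
```

Since all three depend on row order, they are normally meaningful only after a sort.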
Aggregate metrics should name their output with an alias expression:
```yaml
metrics:
  - "[mean_crp] = [crp].mean()"
  - "[n_visits] = [crp].count()"
```

Alias Expressions
Alias expressions use a bracketed output name on the left side:

```yaml
"[mean_score] = [score].mean()"
```

Use aliases in `aggregate.metrics`. For `derive`, prefer `as:` instead:

```yaml
- id: lab_load
  kind: derive
  as: lab_load
  expr: "[crp_mean] * [ldl_max] / 1000.0"
```

Practical Patterns
Null-safe Score

```yaml
expr: "coalesce([baseline_score], 0) + coalesce([followup_score], 0)"
```

Cohort Filter

```yaml
expr: "[age] >= 18 and [site] in ('north', 'south')"
```

Grouped Rollup

```yaml
groupBy:
  - "[site]"
metrics:
  - "[mean_risk] = [risk_index].mean()"
  - "[n] = [patient_id].count()"
```

Computed Join Key

```yaml
- id: joined
  kind: join
  inputs: [left_table, right_table]
  leftKey: "[site].lowercase()"
  rightKey: "[site_code].lowercase()"
```

For important computed keys, a separate derive node is often easier to review than hiding the key logic inside the join.
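To see why both sides of a computed key need the same normalization, the join above can be approximated in plain Python. This is only a sketch: the function name `join_on_normalized_key` and the sample tables are invented, and inner-join behavior is an assumption, since this page does not state which join type the node uses.

```python
def join_on_normalized_key(left, right, left_key, right_key):
    """Inner-join two lists of dicts on lowercased string keys (assumed join type)."""
    # Index the right table by its normalized key.
    index = {}
    for right_row in right:
        index.setdefault(str(right_row[right_key]).lower(), []).append(right_row)

    joined = []
    for left_row in left:
        # Normalize the left key the same way before looking it up.
        for right_row in index.get(str(left_row[left_key]).lower(), []):
            joined.append({**left_row, **right_row})
    return joined


left_table = [{"site": "North", "n": 3}, {"site": "West", "n": 1}]
right_table = [{"site_code": "north", "region": "N"}]

print(join_on_normalized_key(left_table, right_table, "site", "site_code"))
# [{'site': 'North', 'n': 3, 'site_code': 'north', 'region': 'N'}]
```

Without lowercasing, `North` and `north` would not match and the row would silently drop out, which is the kind of logic that is easier to review in its own derive step.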
Limits
The expression language is not a general scripting language. Use a Python, R, JavaScript, or SQL node when you need multi-step control flow, external libraries, custom statistical routines, or transformations that are clearer as code.