[docs] Explain each LIR operator #31054

Merged
merged 6 commits into from
Jan 24, 2025
23 changes: 20 additions & 3 deletions doc/user/content/sql/explain-plan.md
@@ -59,11 +59,11 @@ This stage determines the query optimization stage at which the plan snapshot wi

Plan Stage | Description
------|-----
-**RAW PLAN** | Display the raw plan.
-**DECORRELATED PLAN** | Display the decorrelated plan.
+**RAW PLAN** | Display the raw plan; this is closest to the original SQL.
+**DECORRELATED PLAN** | Display the decorrelated but not-yet-optimized plan.
**LOCALLY OPTIMIZED** | Display the locally optimized plan (before view inlining and access path selection). This is the final stage for regular `CREATE VIEW` optimization.
**OPTIMIZED PLAN** | _(Default)_ Display the optimized plan.
-**PHYSICAL PLAN** | Display the physical plan.
+**PHYSICAL PLAN** | Display the physical plan; this is close but not identical to the operators shown in [`mz_introspection.mz_lir_mapping`](../../sql/system-catalog/mz_introspection/#mz_lir_mapping).

### Output modifiers

@@ -240,7 +240,24 @@ Below the plan, a "Used indexes" section indicates which indexes will be used by

### Reference: Plan operators

Materialize offers several output formats for `EXPLAIN` and debugging.
LIR plans as rendered in
[`mz_introspection.mz_lir_mapping`](../../sql/system-catalog/mz_introspection/#mz_lir_mapping)
are deliberately succinct, while the plans in other formats give more
detail.

The decorrelated and optimized plans from `EXPLAIN DECORRELATED PLAN
FOR ...`, `EXPLAIN LOCALLY OPTIMIZED PLAN FOR ...`, and `EXPLAIN
OPTIMIZED PLAN FOR ...` are in a mid-level representation that is
closer to LIR than SQL. The raw plans from `EXPLAIN RAW PLAN FOR ...`
are closer to SQL (and therefore less indicative of how the query will
actually run).
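
As a rough illustration of the stages described above, the same query can be explained at each stage and the outputs compared. This is only a sketch: the table `orders` and its column `customer_id` are hypothetical names chosen for the example, and the resulting plans are not shown because they depend on the objects and indexes in your environment.

```mzsql
-- `orders` and `customer_id` are hypothetical; substitute your own objects.

-- Raw plan: closest to the original SQL.
EXPLAIN RAW PLAN FOR
SELECT customer_id, count(*) FROM orders GROUP BY customer_id;

-- Decorrelated but not-yet-optimized plan.
EXPLAIN DECORRELATED PLAN FOR
SELECT customer_id, count(*) FROM orders GROUP BY customer_id;

-- Locally optimized plan (before view inlining and access path selection).
EXPLAIN LOCALLY OPTIMIZED PLAN FOR
SELECT customer_id, count(*) FROM orders GROUP BY customer_id;

-- Optimized plan (the default stage).
EXPLAIN OPTIMIZED PLAN FOR
SELECT customer_id, count(*) FROM orders GROUP BY customer_id;

-- Physical plan: close, but not identical, to the operators in mz_lir_mapping.
EXPLAIN PHYSICAL PLAN FOR
SELECT customer_id, count(*) FROM orders GROUP BY customer_id;
```

Reading the outputs from the raw plan down to the physical plan is one way to see how an operator such as `Reduce` in the mid-level plans relates to a `Reduce::~` variant in the LIR plan.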

{{< tabs >}}
{{< tab "In fully optimized physical (LIR) plans" >}}
{{< explain-plans/operator-table data="explain_plan_operators" planType="LIR" >}}
{{< /tab >}}

{{< tab "In decorrelated and optimized plans" >}}
{{< explain-plans/operator-table data="explain_plan_operators" planType="optimized" >}}
{{< /tab >}}
174 changes: 171 additions & 3 deletions doc/user/data/explain_plan_operators.yml
@@ -14,6 +14,19 @@ operators:
- (3, 4)
```

- operator: Constant
plan_types: "LIR"
description: |
Always produces the same collection of rows.
uses_memory: False
memory_details: ""
expansive: False

example: |
```mzsql
Constant 2 rows
```

- operator: Get
plan_types: "optimized,raw"
description: |
@@ -22,11 +35,32 @@ operators:
uses_memory: False
memory_details: ""
expansive: False
-expansive_details: |
-Each row has _less_ data (i.e., shorter rows, but same number of rows).

example: "`Get materialize.public.ordered`"

- operator: Get::~
plan_types: "LIR"
description: |
Produces rows from either an existing source/view or
from a previous operator in the same plan. There may be a
`MapFilterProject` included in the lookup.

There are three types of `Get`.

1. `Get::PassArrangements`, which means the plan will use an
existing arrangement.

2. `Get::Arrangement`, which means that the results will be
_looked up_ in an existing arrangement.

3. `Get::Collection`, which means that the results are
unarranged, and will be processed as they arrive.

uses_memory: False
memory_details: ""
expansive: False
example: "`Get::PassArrangements materialize.public.ordered`"

- operator: Project
plan_types: "optimized,raw"
description: |
@@ -35,7 +69,8 @@ operators:
uses_memory: False
memory_details: ""
expansive: False

expansive_details: |
Each row has _less_ data (i.e., shorter rows, but same number of rows).
example: "`Project (#2, #3)`"

- operator: Map
@@ -49,6 +84,22 @@ operators:
Each row has more data (i.e., longer rows but same number of rows).
example: "`Map (((#1 * 10000000dec) / #2) * 1000dec)`"

- operator: MapFilterProject
plan_types: "LIR"
description: |
The number after the operator is the input operator's `lir_id`.

Computes new columns, filters rows, and projects away columns. Works row-by-row.
uses_memory: False
memory_details: ""
expansive: True
expansive_details: |
Each row may have more data, from the `Map`.
Each row may also have less data, from the `Project`.
There may be fewer rows, from the `Filter`.

example: "`MapFilterProject 5`"

- operator: FlatMap
plan_types: "optimized"
description: |
@@ -60,6 +111,19 @@ operators:
Depends on the [table function](/sql/functions/#table-functions) used.
example: "`FlatMap jsonb_foreach(#3)`"

- operator: FlatMap
plan_types: "LIR"
description: |
The number after the operator is the input operator's `lir_id`.

Appends the result of some (one-to-many) [table function](/sql/functions/#table-functions) to each row in the input.
uses_memory: False
memory_details: ""
expansive: True
expansive_details: |
Depends on the [table function](/sql/functions/#table-functions) used.
example: "`FlatMap 3 (jsonb_foreach)`"

- operator: CallTable
plan_types: "raw"
description: |
@@ -107,6 +171,23 @@ operators:
Depends on the join order and facts about the joined collections.
example: "`Join on=(#1 = #2) type=delta`"

- operator: Join::~
plan_types: "LIR"
description: |
The input operators are listed in the order performed by the join.

Returns combinations of rows from each input whenever some equality predicates are `true`.

There are two types of `Join`: `Join::Differential` and `Join::Delta`, with [documented differences](/transform-data/optimization/#join).
uses_memory: True
memory_details: |
Uses memory for 3-way or more differential joins.
expansive: True
expansive_details: |
Depends on the join order and facts about the joined collections.
example: "`Join::Differential 6 » 7`"


- operator: CrossJoin
plan_types: "optimized"
description: |
@@ -131,6 +212,33 @@ operators:
expansive: False
example: "`Reduce group_by=[#0] aggregates=[max((#0 * #1))]`"

- operator: Reduce::~
plan_types: "LIR"
description: |
The number after the operator is the input operator's `lir_id`.

Groups the input rows by some scalar expressions, reduces each group using some aggregate functions, and produces rows containing the group key and aggregate outputs.

There are five types of `Reduce`, ordered by increasing complexity:

1. `Reduce::Distinct` corresponds to the SQL `DISTINCT` operator.

2. `Reduce::Accumulable` corresponds to several easy-to-implement aggregations that can be done simultaneously.

3. `Reduce::Hierarchical` corresponds to an aggregation requiring a tower of arrangements. These can be either monotonic (more efficient) or bucketed. These may benefit from a hint; [see `mz_introspection.mz_expected_group_size_advice`](/sql/system-catalog/mz_introspection/#mz_expected_group_size_advice).

4. `Reduce::Collation` corresponds to an arbitrary mix of reductions, which will be performed separately and then joined together.

5. `Reduce::Basic` corresponds to a single hard-to-incrementalize aggregation.

uses_memory: True
memory_details: |
Can use a significant amount of memory, as the operator can significantly
overestimate the group size. For `MIN` and `MAX` aggregates, consult
[`mz_introspection.mz_expected_group_size_advice`](/sql/system-catalog/mz_introspection/#mz_expected_group_size_advice).
expansive: False
example: "`Reduce::Accumulable 8`"

- operator: Reduce
plan_types: "raw"
description: |
@@ -179,6 +287,27 @@ operators:
expansive: False
example: "`TopK order_by=[#1 asc nulls_last, #0 desc nulls_first] limit=5`"

- operator: TopK::~
plan_types: "LIR"
description: |
The number after the operator is the input operator's `lir_id`.

Groups the input rows, sorts them according to some ordering, and returns at most `K` rows at some offset from the top of the list, where `K` is some (possibly computed) limit.

There are three types of `TopK`. Two are special-cased for monotonic inputs (i.e., inputs which never retract data).

1. `TopK::MonotonicTop1`.
2. `TopK::MonotonicTopK`, which may give an expression indicating the limit.
3. `TopK::Basic`, a generic `TopK` plan.
uses_memory: True
memory_details: |
Can use a significant amount of memory, as the operator can significantly
overestimate the group size. Consult
[`mz_introspection.mz_expected_group_size_advice`](/sql/system-catalog/mz_introspection/#mz_expected_group_size_advice).
expansive: False
example: "`TopK::Basic 10`"


- operator: Negate
plan_types: "optimized,raw"
description: |
@@ -188,6 +317,15 @@ operators:
expansive: False
example: "`Negate`"

- operator: Negate
plan_types: "LIR"
description: |
Negates the row counts of the input. This is usually used in combination with union to remove rows from the other union input.
uses_memory: False
memory_details: ""
expansive: False
example: "`Negate 17`"

- operator: Threshold
plan_types: "optimized,raw"
description: |
@@ -198,6 +336,16 @@ operators:
expansive: False
example: "`Threshold`"

- operator: Threshold
plan_types: "LIR"
description: |
Removes any rows with negative counts.
uses_memory: True
memory_details: |
Uses memory proportional to twice the number of input updates.
expansive: False
example: "`Threshold 47`"

- operator: Union
plan_types: "optimized,raw"
description: |
@@ -208,6 +356,16 @@ operators:
expansive: False
example: "`Union`"

- operator: Union
plan_types: "LIR"
description: |
Combines its inputs into a unified output, emitting one row for each row on any input.
uses_memory: True
memory_details: |
If the union "consolidates output", it will make moderate use of memory, particularly at hydration time. If the union is not marked with "consolidates output", it will not consume memory.
expansive: False
example: "`Union 7 10 11 14 (consolidates output)`"

- operator: ArrangeBy
plan_types: "optimized"
Review comment (Contributor): Should this be "lir"?

Reply (Contributor Author): No---we render them slightly differently in LIR. I had this entry duplicated for some reason, though. 🤔

description: |
@@ -218,6 +376,16 @@ operators:
expansive: False
example: "`ArrangeBy keys=[[#0]]`"

- operator: Arrange
Review comment (Contributor): Was this supposed to be `ArrangeBy`? (See the comment on line https://github.com/MaterializeInc/materialize/pull/31054/files#diff-8a8cb8a7b9f7777ed5d1f88369aa9a4c9ef159e56cd49f4f1a2e309435d707deR370.) Although the example has `Arrange 12`, the description talks about keys.

Reply (Contributor Author): No---I had it just output `Arrange`, in part because `mz_lir_mapping` won't show any info about the keys. (Corrected the text on this.)

plan_types: "LIR"
description: |
Indicates a point that will become an arrangement in the dataflow engine, i.e., it will consume memory to cache results.
uses_memory: True
memory_details: |
Depends. When it does use memory, usage is proportional to the number of input updates.
expansive: False
example: "`Arrange 12`"

- operator: Return ... With ...
plan_types: "optimized,raw"
description: |