By the time 1.4 ends, the planner has produced one PlannedStmt. Inside it is an execution tree built from Plan nodes, frozen into a form you can follow step by step, something like "go into the primary key index on users, fetch the one matching row, then output that whole row." But that is still only a blueprint. Reading actual pages off disk, picking out the rows that match the condition, handing results back to the caller: none of that has happened yet. The stage that takes that blueprint and produces actual rows is the executor.
The difference between the planner and the executor is the difference between deciding and doing. The planner was the stage that weighed "which index, in what order, with what join method" by cost and chose. The executor takes the chosen approach and carries it out as is. There is nothing left to choose. It just runs the nodes baked into the plan tree and pulls rows out of them.
To run it, the executor takes the Plan tree it received and turns it into a PlanState tree. The Plan tree is the static blueprint the planner made, and it does not change during execution. But to actually run, each node needs state that changes as execution proceeds: which row it is reading now, whether the hash table is fully built, what tuple it has buffered from a child. So when execution begins, a PlanState tree with the exact same shape as the Plan tree is created. The blueprint Plan tree is left untouched, and the running state lives in that PlanState tree instead.
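The split between a read-only blueprint and a mutable mirror tree can be sketched in a few lines. This is a minimal illustration, not PostgreSQL's actual C structures: the `Plan` and `PlanState` classes, their fields (`node_type`, `rows_returned`, `finished`), and the `init_state` helper are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Plan:
    """Static blueprint node (hypothetical): frozen, so it cannot change."""
    node_type: str
    children: Tuple["Plan", ...] = ()


@dataclass
class PlanState:
    """Mutable per-execution node (hypothetical) that mirrors one Plan node."""
    plan: Plan                                  # pointer back to the blueprint
    children: List["PlanState"] = field(default_factory=list)
    rows_returned: int = 0                      # example of running state
    finished: bool = False


def init_state(plan: Plan) -> PlanState:
    """Build a PlanState tree with the same shape as the Plan tree."""
    return PlanState(plan=plan,
                     children=[init_state(child) for child in plan.children])


def shape(node) -> tuple:
    """Return the tree shape as nested tuples of node type names."""
    name = node.node_type if isinstance(node, Plan) else node.plan.node_type
    return (name, tuple(shape(child) for child in node.children))


# A small blueprint: a hash join over two sequential scans.
plan = Plan("HashJoin", (Plan("SeqScan"), Plan("Hash", (Plan("SeqScan"),))))
state = init_state(plan)

# Execution mutates only the state tree; the blueprint stays as it was.
state.children[0].rows_returned += 1
assert shape(plan) == shape(state)
```

Because the blueprint is never written to, the same `plan` could be handed to `init_state` again to get a fresh, independent state tree.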
How the executor produces result rows is the heart of the stage. The executor does not build the entire result set at once and stack it up. Instead, it asks the topmost node of the tree for "the next row," and that request travels down the tree to the leaves. When a leaf scan node reads one row from a page and passes it up to its parent, that row climbs up one level at a time through joins and filters until it reaches the top. The top sends that single row to the caller (the client, or the target table of an INSERT). When the next row is needed, the top is asked for "the next row" again. Rows are pulled from above whenever they are needed, and that pull propagates downward. This structure is called the pull model, or the Volcano iterator model.
Why pull one row at a time? Building everything at once costs memory proportional to the size of the result, and the whole result gets computed even when the client needs only the first 10 rows, as with LIMIT 10. Pulling one row at a time lets the top stop asking once it has received ten rows, so the nodes beneath it do exactly that much work and no more. This is possible because requests and rows travel in opposite directions: requests flow top to bottom, and rows flow bottom to top.
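The pull skeleton and the LIMIT case can be tried out with a toy model. This is a sketch under stated assumptions, not the executor's real code: the `SeqScan`, `Filter`, and `Limit` classes, the `next()` method, the `rows_read` counter, and the `run` driver are hypothetical names, and plain integers stand in for table rows.

```python
class SeqScan:
    """Leaf node (hypothetical): hands out one stored row per call."""
    def __init__(self, rows):
        self.rows = rows
        self.pos = 0
        self.rows_read = 0          # how many rows were actually fetched

    def next(self):
        if self.pos >= len(self.rows):
            return None             # None signals "no more rows"
        row = self.rows[self.pos]
        self.pos += 1
        self.rows_read += 1
        return row


class Filter:
    """Pulls from its child until a row passes the predicate."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


class Limit:
    """Stops asking its child once `count` rows have been returned."""
    def __init__(self, child, count):
        self.child = child
        self.count = count
        self.returned = 0

    def next(self):
        if self.returned >= self.count:
            return None             # the request never reaches the child
        row = self.child.next()
        if row is not None:
            self.returned += 1
        return row


def run(root):
    """Caller side: keep asking the top node for the next row."""
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


scan = SeqScan(list(range(1, 1001)))                    # 1000 "rows"
root = Limit(Filter(scan, lambda r: r % 2 == 0), 10)    # even rows, LIMIT 10
result = run(root)
print(result)           # the first ten even numbers, 2 through 20
print(scan.rows_read)   # 20: the scan stopped long before row 1000
```

Each call to `root.next()` travels down through `Limit` and `Filter` to the scan, and each row climbs back up the same path. Once `Limit` has returned ten rows it stops calling its child, so the scan never touches the remaining 980 rows.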
In the 1.4 planner chapter, the planner decided which scan, join, or aggregation method to use and whether to go parallel, by cost. This chapter looks at how the chosen nodes actually run on top of the pull skeleton.
This chapter splits into five sections.
- 1.5.1 The Volcano iterator model: a closer look at the "pull one row at a time" skeleton we just saw. Every node returns one next tuple through the same interface, and we explain how that single call ends up driving the whole tree.
- 1.5.2 Scan nodes: SeqScan, IndexScan, BitmapScan: the nodes at the leaves that actually read rows off disk. The conditions under which each of the three scan methods is used.
- 1.5.3 Join nodes: NestLoop, HashJoin, MergeJoin: how the three nodes that bring rows from two children together behave on top of the pull skeleton.
- 1.5.4 Aggregation: HashAgg, GroupAgg, sort-based: the nodes that execute GROUP BY and aggregate functions. How aggregation, which needs to see everything before it can answer, is handled inside a skeleton that pulls one row at a time.
- 1.5.5 Parallel query: how worker processes cooperate: until now, execution has been one backend process driving the tree. Here we look at parallel execution, where several worker processes split a large scan among themselves and run it concurrently. The shared memory and background worker infrastructure that supports this cooperation is covered in detail in Chapter 6.
The executor is the stage that runs the plan as is, not the stage that makes new decisions. The node tree shown in EXPLAIN output is exactly this execution tree, and the actual rows and loops that EXPLAIN ANALYZE attaches to each node measure how many rows the node actually produced and how many times it ran. Exactly what those two numbers count is explained in 1.5.1 along with the pull model. Once you get through this chapter, those numbers in EXPLAIN read not as abstract statistics but as traces of real execution.