Proposal: Decompose `plan_node` into Graph-Level Stages

Status: proposal (fable-5)

Refers to: docs/planner-graph-ref/current-graph.md, src/venturescope/planner/agent.py

Problem

plan_node() (agent.py:846-1192) is a ~350-line mega-node that hides most of the planner's actual control flow from the graph. The Mermaid diagram in current-graph.md shows 8 nodes, but the real decision tree lives inside one node as a chain of early returns. Responsibilities currently fused into plan:

| # | Responsibility | Lines (agent.py) | Kind |
|---|---|---|---|
| 1 | Iteration tick, abort/max-iters caps | 846-862 | deterministic |
| 2 | Region/currency bootstrap questions | 864-887 | deterministic |
| 3 | Proactive decomposition generation | 889-904 | LLM call |
| 4 | Component composition (`compose_ready_fields`) | 905-917 | deterministic |
| 5 | Calculator gates (cap → abort, success → finish) | 919-943 | deterministic |
| 6 | Blocked-calculation acquisition (incl. on-demand decomposition) | 945-988 | deterministic + LLM call |
| 7 | Auto-finish scan + acquisition fast path | 993-1035 | deterministic |
| 8 | Planner LLM structured decision | 1037-1068 | LLM call |
| 9 | Guardrail rewrites (derived-field redirect, web-first redirect, search/ask caps, calc adjustment) | 1070-1192 | deterministic + LLM call (on-demand decomposition) |

Consequences:

Invisible control flow. Routing documented in current-graph.md ("plan → ask_user when…") is prose because the graph cannot express it; every consumer must read the function body.
Coarse checkpointing. A single tick performs up to 3 LLM calls (decomposition, planner decision, possibly a second decomposition). If the process dies mid-tick, all of them are redone on resume — LangGraph can only replay from node boundaries.
Coarse observability. LangGraph stream/trace shows one opaque plan step; the planner compensates with the ad-hoc _emit_event side channel.
Hard-to-target tests. tests/planner/test_planner_agent.py must drive the whole function to test a single early-return branch, with all preceding branches defused via state fixtures.

Proposal

Split plan into five graph nodes, each with one job, and let conditional edges express the routing that is today buried in early returns. Action nodes (search, observe, calculate, ask_user, observe_user, reflect, finish) are unchanged except their return edge now targets tick instead of plan.

Target graph

```mermaid
flowchart TD
    planner_start([START]) --> tick[tick]

    tick -->|aborted / max_iters| finish[finish]
    tick -->|region or currency missing| ask_user[ask_user / interrupt]
    tick -->|otherwise| prepare[prepare]

    prepare --> select[select]

    select -->|deterministic decision found| guard[guard]
    select -->|no decision| decide[decide / LLM]

    decide --> guard

    guard -->|search| search[search]
    guard -->|ask_user| ask_user
    guard -->|reflect| reflect[reflect]
    guard -->|calculate| calculate[calculate]
    guard -->|finish| finish

    search -->|last_observation present| observe[observe]
    search -->|no hits or backend failure| tick

    observe --> tick
    calculate --> tick
    ask_user --> observe_user[observe_user]
    observe_user --> tick
    reflect --> tick
    finish --> planner_end([END])
```

Node responsibilities

`tick` (deterministic, no LLM)

Increments iterations, emits the "planning step N" event.
Clears decision and decision_origin for the new cycle.
Terminal caps: status == "aborted" or iterations > max_iters → writes a finish decision (same reasoning strings as today).
Bootstrap questions: _needs_region_question / _needs_currency_question → writes the region/currency ask_user decision exactly as lines 864-887.
route_after_tick: finish decision → finish; ask_user decision → ask_user; otherwise → prepare.

Region/currency decisions intentionally bypass guard, matching today's early returns before any guardrail runs.
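The `tick` contract above can be sketched as a pure function that returns a partial state update, plus its router. This is a minimal illustration, not the code in agent.py: the dict-based state, the `region`/`currency` keys, the decision dict shape, the reasoning strings, and the two `needs_*` helpers are hypothetical stand-ins for the real state schema and for `_needs_region_question` / `_needs_currency_question`.

```python
from typing import Any, Dict

# Stand-in for the planner state; the real type lives in planner/schema.py.
State = Dict[str, Any]


def needs_region_question(state: State) -> bool:
    """Hypothetical stand-in for _needs_region_question."""
    return state.get("region") is None


def needs_currency_question(state: State) -> bool:
    """Hypothetical stand-in for _needs_currency_question."""
    return state.get("currency") is None


def tick(state: State) -> State:
    """Return the partial state update for a new planning cycle."""
    iterations = state.get("iterations", 0) + 1
    # Per-cycle fields are reset before any decision is written.
    update: State = {
        "iterations": iterations,
        "decision": None,
        "decision_origin": None,
        "llm_failed": False,
    }
    # Assumption: the cap is checked against the already incremented counter.
    if state.get("status") == "aborted" or iterations > state.get("max_iters", 10):
        update["decision"] = {"action": "finish", "reasoning": "terminal cap reached"}
    elif needs_region_question(state):
        update["decision"] = {"action": "ask_user", "field": "region"}
    elif needs_currency_question(state):
        update["decision"] = {"action": "ask_user", "field": "currency"}
    return update


def route_after_tick(state: State) -> str:
    """Dispatch on the decision tick wrote, if any."""
    decision = state.get("decision")
    if decision is None:
        return "prepare"
    return decision["action"]  # "finish" or "ask_user"
```

Because both functions only read and return plain dicts, each branch can be exercised with a three-line state fixture instead of driving the whole planner.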

`prepare` (one optional LLM call)

_proactive_decompositions (at most one fresh generate_decomposition per tick, as today) + build_dynamic_recipes + compose_ready_fields.
Persists schema (when composition changed it) and dynamic_decompositions into state — same persistence rule as the current schema_changed / dynamic_decomps != … checks.
Unconditional edge → select.

Isolating this stage makes the decomposition LLM call individually resumable: a crash after prepare no longer re-runs decomposition on replay.

`select` (deterministic, no LLM except blocked-path decomposition)

The deterministic decision ladder, in current order:

1. Calculator cap reached with BLOCKED/ERROR → finish decision + status="aborted" (lines 919-931).
2. Successful and current calculation → finish decision (lines 933-943).
3. BLOCKED calculation → next_acquisition_task, falling back to dynamic decomposition of the first uncovered blocked field (lines 945-988).
4. Acquisition fast path when no actionable missing fields but open tasks exist (lines 1009-1024).
5. Auto-finish when all raw inputs are collected (lines 993-1007, 1026-1035).

If a decision is produced, set decision_origin = "deterministic". route_after_select: decision present → guard; otherwise → decide.

The one remaining LLM call here (decomposition for a blocked field without a recipe, lines 960-968) can stay in step 3 initially; see "Follow-up" for the option of extracting a decompose loop node.
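One way to keep the ladder readable after extraction is an ordered list of rule functions where the first rule that produces a decision wins. The sketch below covers only the first three rungs; the state keys (`calc_status`, `calc_attempts`, `calc_cap`, `calc_is_current`, `next_task`) and the `"OK"` status value are hypothetical placeholders, and the decomposition fallback of step 3 is left out.

```python
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]
# A rule returns a partial state update containing "decision", or None to pass.
Rule = Callable[[State], Optional[State]]


def calc_cap_rule(state: State) -> Optional[State]:
    """Rung 1: calculator cap reached while still BLOCKED/ERROR -> abort."""
    capped = state.get("calc_attempts", 0) >= state.get("calc_cap", 3)
    if capped and state.get("calc_status") in {"BLOCKED", "ERROR"}:
        return {"decision": {"action": "finish"}, "status": "aborted"}
    return None


def calc_success_rule(state: State) -> Optional[State]:
    """Rung 2: a successful, still-current calculation -> finish."""
    if state.get("calc_status") == "OK" and state.get("calc_is_current"):
        return {"decision": {"action": "finish"}}
    return None


def blocked_calc_rule(state: State) -> Optional[State]:
    """Rung 3: BLOCKED calculation -> take the next acquisition task, if any."""
    if state.get("calc_status") == "BLOCKED" and state.get("next_task"):
        return {"decision": dict(state["next_task"])}
    return None


# Order matters: it mirrors the order of the early returns being replaced.
LADDER: List[Rule] = [calc_cap_rule, calc_success_rule, blocked_calc_rule]


def select(state: State) -> State:
    """Run the ladder; tag the first decision found as deterministic."""
    for rule in LADDER:
        update = rule(state)
        if update is not None:
            return {**update, "decision_origin": "deterministic"}
    return {}  # no decision: the router sends the cycle to decide


def route_after_select(state: State) -> str:
    return "guard" if state.get("decision") is not None else "decide"
```

Keeping the rules in a list makes the ordering explicit and lets a test call a single rung directly.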

`decide` (the planner LLM call, nothing else)

Builds planner_prompt from the prepared state and calls _llm().structured(..., PlannerDecision).
On structured-output failure: finish decision + llm_failed=True flag in state (so guard skips _adjust_calculation_decision, preserving the current infinite-loop protection at lines 1175-1178).
Sets decision_origin = "llm". Unconditional edge → guard.

This is the node that finally matches what the diagram calls "plan": produce a structured PlannerDecision — and nothing more.
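The failure handling described for `decide` might look like the sketch below. The LLM call is injected as a plain callable so the example runs without a model; `StructuredOutputError`, the `call_llm` parameter, and the reasoning string are hypothetical, and the real node would build planner_prompt and call `_llm().structured(..., PlannerDecision)` instead.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Decision = Dict[str, Any]


class StructuredOutputError(Exception):
    """Hypothetical error raised when the LLM reply cannot be parsed."""


def decide(state: State, call_llm: Callable[[State], Decision]) -> State:
    """Produce a decision from the LLM, or a flagged finish on failure."""
    try:
        decision = call_llm(state)
        llm_failed = False
    except StructuredOutputError:
        # Fall back to finish and record the failure so guard can skip
        # the calculation adjustment for this cycle.
        decision = {
            "action": "finish",
            "reasoning": "planner LLM returned no valid decision",
        }
        llm_failed = True
    return {
        "decision": decision,
        "decision_origin": "llm",
        "llm_failed": llm_failed,
    }
```

Writing `llm_failed` into the returned update, rather than keeping it in a local variable, is what allows the flag to cross the node boundary to `guard`.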

`guard` (deterministic decision-rewrite pipeline)

Applies the existing rewrites, keyed on decision_origin:

origin = "llm" (full pipeline, current order): on-demand decomposition for a targeted requires-components field (1072-1088) → _redirect_derived_direct_decision → _redirect_premature_ask_for_web_field → search duplicate/cap fallback (1093-1137) → ask_user cap → finish/abort (1139-1173) → _adjust_calculation_decision (unless llm_failed).
origin = "deterministic": only _adjust_calculation_decision, mirroring the acquisition fast-path and auto-finish branches today (1019-1020, 1031). Calc-gate finish decisions pass through untouched, as they do now.

Ends with the decision log line + _emit_decision_event. route_after_guard = today's route_after_plan (dispatch on decision.action).
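The origin-keyed pipeline can be sketched as a function that picks the list of rewrites to apply and then folds the decision through them. Only two rewrites are shown, and both are hypothetical stand-ins: the real search fallback and `_adjust_calculation_decision` have their own conditions, and the `searches`, `search_cap`, and `calc_pending` keys are invented for the example. The sketch also assumes the calculation adjustment is a no-op for calc-gate finish decisions, which is how they pass through untouched here.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Decision = Dict[str, Any]
Rewrite = Callable[[State, Decision], Decision]


def search_cap_fallback(state: State, decision: Decision) -> Decision:
    """Hypothetical stand-in for the search duplicate/cap fallback."""
    at_cap = state.get("searches", 0) >= state.get("search_cap", 3)
    if decision["action"] == "search" and at_cap:
        return {"action": "ask_user", "reasoning": "search cap reached"}
    return decision


def adjust_calculation(state: State, decision: Decision) -> Decision:
    """Hypothetical stand-in for _adjust_calculation_decision."""
    if decision["action"] == "finish" and state.get("calc_pending", False):
        return {"action": "calculate", "reasoning": "calculation still pending"}
    return decision


def rewrites_for(state: State) -> List[Rewrite]:
    """Pick the rewrite subset based on who produced the decision."""
    if state.get("decision_origin") == "deterministic":
        return [adjust_calculation]
    pipeline: List[Rewrite] = [search_cap_fallback]
    if not state.get("llm_failed", False):
        pipeline.append(adjust_calculation)
    return pipeline


def guard(state: State) -> State:
    """Apply the selected rewrites in order and return the final decision."""
    decision = state["decision"]
    for rewrite in rewrites_for(state):
        decision = rewrite(state, decision)
    return {"decision": decision}


def route_after_guard(state: State) -> str:
    return state["decision"]["action"]
```

With this shape, the difference between the two origins is a single list rather than two code paths.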

State changes (`planner/schema.py`)

| Field | Type | Purpose |
|---|---|---|
| `decision_origin` | `Literal["deterministic", "llm"] \| None` | Tells `guard` which rewrite subset applies. |
| `llm_failed` | `bool` | Replaces the local `llm_failed` variable; lets `guard` skip calc-adjustment after a structured-output failure. |

Both are runtime-only, reset by tick, and serializer-compatible (plain str/bool). PlannerState (the serializer mirror) gains the same two fields with defaults, so old checkpoints deserialize cleanly.
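The backward-compatibility claim rests on the new fields having defaults. A minimal sketch with a standard-library dataclass shows the effect; `PlannerStateSketch` and its `iterations` field are hypothetical, and the real PlannerState may be built on a different base class.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class PlannerStateSketch:
    """Hypothetical, heavily reduced mirror of the serialized planner state."""

    iterations: int = 0
    # New runtime-only fields; the defaults cover checkpoints that predate them.
    decision_origin: Optional[Literal["deterministic", "llm"]] = None
    llm_failed: bool = False


# A payload written before the refactor has neither of the new keys.
old_checkpoint = {"iterations": 4}
restored = PlannerStateSketch(**old_checkpoint)
```

Here `restored` carries `decision_origin=None` and `llm_failed=False`, the same values `tick` would write at the start of the next cycle.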

What does NOT change

Outer contract: build_planner_graph, run_planner_step, initial_state, interrupt/resume via Command(resume=...), the {conversation_id}:planner thread namespace.
Action nodes and their helpers (search_node, observe_node, _merge_evidence_into_state, etc.).
Decision semantics: every branch above maps 1:1 to an existing early return. This is a topology refactor, not a behavior change.

Trade-offs

More checkpoint writes per tick. One plan superstep becomes four on the deterministic path (tick, prepare, select, guard) and five when decide also runs. With PostgresSaver that is roughly four to five small writes per planner iteration. This is acceptable for a chat-paced agent and is exactly what buys the resumability and observability; if it ever matters, tick and prepare can be fused.
In-flight checkpoint compatibility. Renaming/removing the plan node breaks resume for planner threads checkpointed mid-graph (pending tasks reference node names). Mitigation: bump the planner thread namespace (e.g. planner_thread_id() → f"{conversation_id}:planner:v2"). The runner already supports re-bootstrap from prior_schema + prior_dynamic_decompositions, so an in-flight conversation degrades to a clean re-bootstrap with all collected values preserved.
Test churn. test_planner_agent.py tests that call plan_node directly must target the new stage nodes. This is a net win — each branch becomes testable through a small node instead of a fixture obstacle course — but it is the bulk of the diff.
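The namespace bump mentioned under in-flight checkpoint compatibility could be as small as the sketch below. The signature is an assumption (the proposal writes `planner_thread_id()` without arguments), and `PLANNER_NAMESPACE_VERSION` is a hypothetical constant.

```python
# Hypothetical version tag; bump it whenever node names in the graph change.
PLANNER_NAMESPACE_VERSION = "v2"


def planner_thread_id(conversation_id: str) -> str:
    """Build the checkpoint thread id for a conversation's planner graph."""
    return f"{conversation_id}:planner:{PLANNER_NAMESPACE_VERSION}"
```

Threads saved under the old `{conversation_id}:planner` id are then never looked up by the new graph, so a stale pending task cannot reference the removed plan node.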

Migration plan

Extract without rewiring. Pull the bodies of tick/prepare/select/decide/guard out of plan_node as module-level functions; plan_node becomes a thin sequential composition of them. This is a pure refactor, so existing tests stay green. Add decision_origin/llm_failed to State.
Rewire the graph. Register the five nodes in _build_state_graph, add the conditional edges above, point action-node return edges at tick, delete plan_node and route_after_plan. Bump the planner thread namespace. Update stage-level tests.
Docs. Regenerate current-graph.md from the new topology; the "Routing details" prose section mostly disappears because the graph now says it.
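For the first migration step, the interim plan_node can be sketched as a composition of the five stage functions that reproduces the routing the graph will later express with edges. The `make_plan_node` factory and the `apply` helper are hypothetical; they exist only so the example is self-contained and the stages can be swapped for stubs.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Stage = Callable[[State], State]  # each stage returns a partial state update


def apply(state: State, stage: Stage) -> State:
    """Merge a stage's partial update into a copy of the state."""
    return {**state, **stage(state)}


def make_plan_node(
    tick: Stage, prepare: Stage, select: Stage, decide: Stage, guard: Stage
) -> Stage:
    """Build an interim plan_node that runs the stages in graph order."""

    def plan_node(state: State) -> State:
        state = apply(state, tick)
        if state.get("decision") is not None:
            # Terminal caps and bootstrap questions bypass guard.
            return state
        state = apply(state, prepare)
        state = apply(state, select)
        if state.get("decision") is None:
            # Only fall through to the LLM when the ladder found nothing.
            state = apply(state, decide)
        return apply(state, guard)

    return plan_node
```

The two `if` statements correspond to route_after_tick and route_after_select, so the second migration step replaces them with conditional edges without touching the stage bodies.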

Follow-up (out of scope)

decompose as a loop node. Both remaining in-stage LLM decomposition calls (select step 3, guard on-demand) could route to a dedicated decompose node and back, making every LLM call a graph step. Deferred: it adds two more edges and a re-entry flag for marginal benefit.
Command(goto=...) returns instead of router functions would remove the three route_after_* helpers. Deferred to keep the diff mechanical and the topology declaratively visible in _build_state_graph.
