Proposal: Decompose `plan` Node into Graph-Level Routing

Author: GLM-5.1 analysis
Date: 2026-06-07
Scope: src/venturescope/planner/agent.py, plan_node() (lines 846–1192, ~350 LOC)

1. Problem Statement

The plan node is the central router of the planner subgraph, but it conflates six distinct concerns, only one of which needs the LLM:

| Concern | What it does | Lines | LLM needed? |
|---|---|---|---|
| Guard checks | Early-stop on aborted, max_iters, calculator caps, all-fields-collected | 848–1035 | No |
| Prerequisite questions | Ask region/currency before any real work | 864–887 | No |
| Schema preprocessing | Proactive decomposition, composition of component fields | 889–917 | No |
| Deterministic acquisition routing | `next_acquisition_task()` picks next field/action without LLM | 945–1025 | No |
| LLM planning | Call the planner LLM with full prompt, parse `PlannerDecision` | 1037–1060 | Yes |
| Post-LLM decision rewriting | Redirect derived fields, web-preferred forcing, search/ask caps, calculation adjustments | 1070–1192 | No |

All six produce PlannerDecision objects and return state updates, but they are sequentially interleaved with early returns — making the logic hard to follow, test in isolation, or extend without breaking distant branches.

The current route_after_plan function (line 2070) is trivial: return state["decision"].action. This means the graph has zero routing intelligence — all routing decisions are computed inside the plan node before it returns, and the graph merely dispatches on the pre-computed string.

2. Proposed Architecture

2.1 Core Idea: Separate decisions from routing

Move guard logic and deterministic routing into new graph nodes connected by conditional edges, and reserve the plan node exclusively for LLM-based decision-making. The graph itself becomes the orchestrator.

START
  │
  ▼
┌──────────────────┐
│   pre_check      │  ← guards + prerequisites + preprocessing
└───────┬──────────┘
        │
  (conditional edge)
  ├── abort/finish → finish
  ├── ask_region → ask_user (region)
  ├── ask_currency → ask_user (currency)
  └── continue ──────────────┐
                              ▼
                    ┌──────────────────┐
                    │ acquire_or_plan  │  ← deterministic acquisition OR LLM call
                    └───────┬──────────┘
                            │
                      (conditional edge)
                      ├── search → search
                      ├── ask_user → ask_user
                      ├── calculate → calculate
                      ├── reflect → reflect
                      └── finish → finish

2.2 New Node Definitions

`pre_check` (new)

Purpose: Pure state inspection — no LLM calls.

Responsibilities extracted from current plan_node:

- Increment iterations
- Check status == "aborted" → route to finish
- Check max_iters exceeded → route to finish
- Check _needs_region_question() → produce region PlannerDecision, route to ask_user
- Check _needs_currency_question() → produce currency PlannerDecision, route to ask_user
- Run _proactive_decompositions() to update dynamic_decompositions and schema
- Check calculation-cap exhaustion → route to finish
- Check successful calculation already done → route to finish

Returns: Updated state (iterations, schema, dynamic_decompositions) + a routing tag.

Routing tag is a Literal["continue", "finish_aborted", "finish_iter_cap", "finish_calc_cap", "finish_calc_done", "ask_region", "ask_currency"].

from dataclasses import dataclass
from typing import Any, Literal

@dataclass
class PreCheckResult:
    state_updates: dict[str, Any]   # iterations, schema, dynamic_decompositions
    route: Literal[
        "continue", "finish_aborted", "finish_iter_cap",
        "finish_calc_cap", "finish_calc_done",
        "ask_region", "ask_currency",
    ]
    decision: PlannerDecision | None  # set for ask_region / ask_currency

Why separate: These are all guard/prerequisite checks that must happen before any real planning. Isolating them means:

- They can be unit-tested without any LLM mocking
- They can short-circuit without the LLM cost
- The "continue" path is the only one that needs further processing
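The guard ordering listed above can be sketched as a pure function that maps a state snapshot to a routing tag. This is a minimal sketch under assumptions, not the actual implementation: the state keys (`status`, `iterations`, `max_iters`, `region`, `currency`, `calc_attempts`, `max_calc_attempts`, `calc_succeeded`) and the default caps are hypothetical stand-ins, plain `None` checks replace `_needs_region_question()` and `_needs_currency_question()`, and the proactive decomposition step is omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Literal

Route = Literal[
    "continue", "finish_aborted", "finish_iter_cap",
    "finish_calc_cap", "finish_calc_done",
    "ask_region", "ask_currency",
]


@dataclass
class PreCheckSketch:
    route: Route
    state_updates: dict[str, Any] = field(default_factory=dict)


def pre_check_sketch(state: dict[str, Any]) -> PreCheckSketch:
    """Classify a state snapshot into a routing tag without any LLM call."""
    # Hypothetical keys and default caps; the real State fields may differ.
    iterations = state.get("iterations", 0) + 1
    updates = {"iterations": iterations}

    if state.get("status") == "aborted":
        return PreCheckSketch("finish_aborted", updates)
    if iterations > state.get("max_iters", 10):
        return PreCheckSketch("finish_iter_cap", updates)
    if state.get("region") is None:
        return PreCheckSketch("ask_region", updates)
    if state.get("currency") is None:
        return PreCheckSketch("ask_currency", updates)
    if state.get("calc_attempts", 0) >= state.get("max_calc_attempts", 3):
        return PreCheckSketch("finish_calc_cap", updates)
    if state.get("calc_succeeded", False):
        return PreCheckSketch("finish_calc_done", updates)
    return PreCheckSketch("continue", updates)
```

Because the function only inspects a dictionary, each guard can be covered by a one-line test that needs no LLM mock.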

`acquire_or_plan` (replaces most of current `plan_node`)

Purpose: Either produce a deterministic acquisition decision or call the LLM.

Responsibilities:

1. Check if calculator is blocked → compute next acquisition task from blocking errors
2. If acquisition_task found → produce PlannerDecision from it, skip LLM
3. If no acquisition task → check auto_finish_ok (all fields collected, no open tasks)
4. If not auto-finish → call LLM via planner_prompt() → parse PlannerDecision

Post-LLM rewriting stays here (for now; see section 2.4):

- _redirect_derived_direct_decision()
- _redirect_premature_ask_for_web_field()
- Search/ask per-field cap enforcement
- _adjust_calculation_decision()

Returns: Updated state + PlannerDecision.
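The "deterministic first, LLM as fallback" ordering can be illustrated as follows. This is a sketch under assumptions: `Decision` is a hypothetical stand-in for `PlannerDecision`, the `missing_fields` and `open_tasks` keys are invented for the auto-finish check, and `next_task` and `call_llm` are injected callables standing in for `next_acquisition_task()` and the planner LLM call. Post-LLM rewriting is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Decision:
    """Hypothetical stand-in for PlannerDecision."""
    action: str
    field_name: Optional[str] = None
    source: str = "llm"


def acquire_or_plan_sketch(
    state: dict[str, Any],
    next_task: Callable[[dict[str, Any]], Optional[Decision]],
    call_llm: Callable[[dict[str, Any]], Decision],
) -> Decision:
    # Steps 1-2: a deterministic acquisition task, if any, skips the LLM.
    task = next_task(state)
    if task is not None:
        return task
    # Step 3: auto-finish when nothing is missing and no tasks are open
    # (assumed keys, not the real auto_finish_ok logic).
    if not state.get("missing_fields") and not state.get("open_tasks"):
        return Decision("finish", source="auto_finish")
    # Step 4: otherwise ask the LLM; a failure degrades to a finish decision.
    try:
        return call_llm(state)
    except Exception:
        return Decision("finish", source="llm_error")
```

Injecting the two callables keeps the ordering testable with plain lambdas, so only the final branch ever needs an LLM fake.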

`plan` node (simplified)

After extraction, plan becomes a thin wrapper or is removed entirely. The node can be renamed acquire_or_plan to reflect its actual role.

2.3 Reworked Graph Topology

from typing import Any

from langgraph.graph import END, START, StateGraph


def _build_state_graph() -> Any:
    builder = StateGraph(State)
    # ── Nodes ──
    builder.add_node("pre_check", pre_check_node)
    builder.add_node("acquire_or_plan", acquire_or_plan_node)
    builder.add_node("search", search_node)
    builder.add_node("observe", observe_node)
    builder.add_node("calculate", calculate_node)
    builder.add_node("ask_user", ask_user_node)
    builder.add_node("observe_user", observe_user_node)
    builder.add_node("reflect", reflect_node)
    builder.add_node("finish", finish_node)

    # ── Edges ──
    builder.add_edge(START, "pre_check")

    builder.add_conditional_edges(
        "pre_check",
        route_after_pre_check,
        {
            "continue": "acquire_or_plan",
            "finish_aborted": "finish",
            "finish_iter_cap": "finish",
            "finish_calc_cap": "finish",
            "finish_calc_done": "finish",
            "ask_region": "ask_user",
            "ask_currency": "ask_user",
        },
    )

    builder.add_conditional_edges(
        "acquire_or_plan",
        route_after_plan,  # same as current — reads decision.action
        {
            "search": "search",
            "ask_user": "ask_user",
            "calculate": "calculate",
            "reflect": "reflect",
            "finish": "finish",
        },
    )

    builder.add_conditional_edges(
        "search",
        route_after_search,
        {"observe": "observe", "plan": "pre_check"},
    )
    builder.add_edge("observe", "pre_check")
    builder.add_edge("calculate", "pre_check")
    builder.add_edge("ask_user", "observe_user")
    builder.add_edge("observe_user", "pre_check")
    builder.add_edge("reflect", "pre_check")
    builder.add_edge("finish", END)

    return builder

Key change: All "loop back" edges now point to pre_check instead of plan, ensuring guards run on every iteration.
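The effect of the rewiring can be checked without LangGraph by walking a plain edge table. The sketch below is illustrative only: `STATIC_EDGES` copies the unconditional edges from the wiring above, while `choose` is a hypothetical callback standing in for the conditional routers (`route_after_pre_check`, `route_after_plan`, `route_after_search`).

```python
from typing import Callable

# Unconditional edges from the proposed wiring. Conditional edges are
# resolved by the hypothetical `choose` callback instead.
STATIC_EDGES = {
    "observe": "pre_check",
    "calculate": "pre_check",
    "ask_user": "observe_user",
    "observe_user": "pre_check",
    "reflect": "pre_check",
}


def walk(choose: Callable[[str, int], str], max_steps: int = 20) -> list[str]:
    """Follow edges from pre_check until finish, returning visited nodes."""
    trace: list[str] = []
    node = "pre_check"
    for step in range(max_steps):
        trace.append(node)
        if node == "finish":
            break
        # Use the static edge if one exists, otherwise ask the router.
        node = STATIC_EDGES.get(node) or choose(node, step)
    return trace


def demo_router(node: str, step: int) -> str:
    """Toy router: pre_check always continues; plan calculates once, then finishes."""
    if node == "pre_check":
        return "acquire_or_plan"
    return "calculate" if step < 3 else "finish"


trace = walk(demo_router)
# Every visit to acquire_or_plan is immediately preceded by pre_check.
guarded = all(
    trace[i - 1] == "pre_check"
    for i, name in enumerate(trace)
    if name == "acquire_or_plan"
)
```

A property like `guarded` is the kind of invariant a topology test could assert after the loop-back edges are rewired.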

2.4 Post-LLM Rewriting: Options

The post-LLM rewriting logic in plan_node (lines 1070–1192) currently performs:

| Rewrite | Lines | Could be graph-level? |
|---|---|---|
| `_redirect_derived_direct_decision()` | 1090 | Partially; could be a separate validation node |
| `_redirect_premature_ask_for_web_field()` | 1091 | Could be a validation node |
| Search cap → force ask_user or reflect | 1093–1137 | Could be a validation node |
| Ask-user cap → force finish | 1139–1173 | Could be a validation node |
| `_adjust_calculation_decision()` | 1178 | Currently OK inline |
| Completion logging + state assembly | 1179–1192 | Stays inline |

Option A: Keep post-LLM rewriting in `acquire_or_plan` (Recommended)

- Simplest migration: the rewriting is closely coupled to the LLM decision
- Avoids 2–3 extra state round-trips per iteration
- The rewriting functions are pure transformations on PlannerDecision; they don't need separate graph nodes
- Recommend this for the first pass.

Option B: Extract a `validate_decision` node

- Insert between acquire_or_plan and action nodes
- Handles all decision rewriting as a separate graph step
- Adds a graph hop per iteration but makes each node trivially testable
- Better long-term separation of concerns, but higher migration risk

Recommendation: Start with Option A. If tests reveal the node is still too complex, extract validate_decision later.
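Under Option A, the rewrites can still be kept as small pure functions and applied in order inside the node. The following is a sketch under assumptions, not the existing code: `Decision` is a hypothetical stand-in for `PlannerDecision`, and the `search_counts`, `ask_counts`, `max_searches`, and `max_asks` keys and their defaults are invented for illustration. Only the two cap rewrites are shown, and the search cap always forces `ask_user` (the real code may choose `reflect` instead).

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class Decision:
    """Hypothetical stand-in for PlannerDecision."""
    action: str
    field_name: str = ""


Rewriter = Callable[[Decision, dict[str, Any]], Decision]


def cap_searches(decision: Decision, state: dict[str, Any]) -> Decision:
    """Force ask_user once a field has hit its (assumed) search cap."""
    count = state.get("search_counts", {}).get(decision.field_name, 0)
    if decision.action == "search" and count >= state.get("max_searches", 3):
        return replace(decision, action="ask_user")
    return decision


def cap_asks(decision: Decision, state: dict[str, Any]) -> Decision:
    """Force finish once a field has hit its (assumed) ask cap."""
    count = state.get("ask_counts", {}).get(decision.field_name, 0)
    if decision.action == "ask_user" and count >= state.get("max_asks", 2):
        return replace(decision, action="finish")
    return decision


def apply_rewrites(
    decision: Decision, state: dict[str, Any], rewriters: Sequence[Rewriter]
) -> Decision:
    """Apply each rewriter in order, feeding the output of one into the next."""
    return reduce(lambda d, rewrite: rewrite(d, state), rewriters, decision)
```

If Option B is adopted later, a list of rewriters like this could move into a `validate_decision` node unchanged, since each function depends only on its two arguments.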

3. Implementation Plan

Phase 1: Extract `pre_check_node`

1. Create pre_check_node(state: State) -> dict[str, Any] that performs:
   - Iteration bump
   - Aborted / max_iters / calculation-cap / calculation-success guards
   - Region / currency prerequisite checks
   - Proactive decomposition + composition
   - Route tag assignment, stored in a new state field _pre_check_route (or a dedicated route_tag field)
2. Create route_after_pre_check(state: State) -> str that reads the route tag.
3. Update _build_state_graph() to insert pre_check before acquire_or_plan and rewire loop-back edges.
4. Remove the same logic from plan_node, keeping only the acquisition + LLM + rewriting logic.
5. Update tests: pre_check logic is now independently testable without LLM.

Phase 2: Rename `plan` → `acquire_or_plan`

1. Rename plan_node → acquire_or_plan_node.
2. Rename graph node "plan" → "acquire_or_plan".
3. Update all references in runner.py, tests, and AGENTS.md.

Phase 3: Extract state updates into composable helpers

The current code assembles out = {"decision": ..., "iterations": ..., "status": ...} with conditional schema and dynamic_decompositions merges in 9 different return paths. Extract a helper:

def _build_plan_output(
    decision: PlannerDecision,
    iterations: int,
    status: Status,
    *,
    schema_changed: bool = False,
    schema_dict: dict[str, Any] | None = None,
    decomps_changed: bool = False,
    dynamic_decomps: dict[str, list[dict[str, Any]]] | None = None,
    original_decomps: dict[str, list[dict[str, Any]]] | None = None,
) -> dict[str, Any]:
    ...

This eliminates the 9-way conditional-return pattern without changing graph topology.
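One possible body for such a helper is sketched below. The keyword signature follows the proposal, but everything inside is an assumption: `PlannerDecision` and `Status` are replaced by `Any` and `str` to keep the sketch self-contained, the output keys mirror the `out = {...}` dict described above, and the use of `original_decomps` as a fallback change detector is a guess at its intended role.

```python
from typing import Any, Optional

Decomps = dict[str, list[dict[str, Any]]]


def build_plan_output_sketch(
    decision: Any,            # stand-in for PlannerDecision
    iterations: int,
    status: str,              # stand-in for Status
    *,
    schema_changed: bool = False,
    schema_dict: Optional[dict[str, Any]] = None,
    decomps_changed: bool = False,
    dynamic_decomps: Optional[Decomps] = None,
    original_decomps: Optional[Decomps] = None,
) -> dict[str, Any]:
    """Assemble the state update once instead of in every return path."""
    out: dict[str, Any] = {
        "decision": decision,
        "iterations": iterations,
        "status": status,
    }
    # Only emit the schema when the caller says it changed.
    if schema_changed and schema_dict is not None:
        out["schema"] = schema_dict
    # Assumption: original_decomps lets the helper detect a change even
    # when the caller did not set decomps_changed explicitly.
    differs = original_decomps is not None and dynamic_decomps != original_decomps
    if dynamic_decomps is not None and (decomps_changed or differs):
        out["dynamic_decompositions"] = dynamic_decomps
    return out
```

Each return path in the node then reduces to a single call such as `return build_plan_output_sketch(decision, iterations, status, schema_changed=True, schema_dict=schema)`.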

Phase 4 (optional): Extract `validate_decision` node

If acquire_or_plan is still too complex after Phases 1–3, extract post-LLM rewriting into a validate_decision node between acquire_or_plan and action nodes.

4. What Stays in `acquire_or_plan`

After Phase 1 extraction, the remaining acquire_or_plan_node contains:

1. Check calculator blocked → compute acquisition task from blocking errors
2. If acquisition_task → produce decision, adjust for calculator, return
3. Check auto_finish conditions → produce finish decision
4. Build planner prompt → call LLM → parse PlannerDecision
5. If LLM fails → produce finish decision
6. Post-LLM rewriting:
   a. Generate dynamic decomposition if LLM targets a composite field
   b. _redirect_derived_direct_decision()
   c. _redirect_premature_ask_for_web_field()
   d. Per-field search cap check
   e. Per-field ask cap check
   f. _adjust_calculation_decision()
7. Log + emit event + return state updates

This is still substantial (~200 LOC), but the steps run strictly in sequence with no guard interleaving. The function reads top-to-bottom without early returns for guard conditions.

5. State Schema Changes

The State TypedDict needs one new field for pre_check routing:

class State(TypedDict):
    # ... existing fields ...
    _pre_check_route: str  # set by pre_check_node, consumed by route_after_pre_check

Alternatively, encode the route in the PlannerDecision.action enum by extending Action:

Action = Literal["search", "reflect", "ask_user", "calculate", "finish",
                 "ask_region", "ask_currency",  # new
                 "finish_aborted", "finish_iter_cap", "finish_calc_cap", "finish_calc_done"]  # new

Recommendation: Use a separate _pre_check_route field. It keeps Action clean (action nodes only see action-relevant decisions) and avoids polluting PlannerDecision with guard-only variants.
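With a separate field, the router stays a one-line lookup plus validation. The sketch below assumes the field is typed as a `Literal` rather than a bare `str`, and that a missing or unknown tag should fail loudly; both choices are suggestions rather than part of the existing code, and `StateSketch` is a hypothetical, trimmed-down stand-in for `State`.

```python
from typing import Literal, TypedDict, get_args

PreCheckRoute = Literal[
    "continue", "finish_aborted", "finish_iter_cap",
    "finish_calc_cap", "finish_calc_done",
    "ask_region", "ask_currency",
]


class StateSketch(TypedDict, total=False):
    # Only the new field is shown; the real State has many more.
    _pre_check_route: PreCheckRoute


def route_after_pre_check(state: StateSketch) -> str:
    """Return the tag written by pre_check_node, rejecting unknown values."""
    route = state.get("_pre_check_route")
    if route not in get_args(PreCheckRoute):
        # A missing or misspelled tag means pre_check did not run correctly.
        raise ValueError(f"unexpected pre_check route: {route!r}")
    return route
```

Validating against `get_args(PreCheckRoute)` keeps the allowed tags in one place, so the edge mapping in `_build_state_graph()` and the router cannot silently drift apart.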

6. Risks and Mitigations

| Risk | Mitigation |
|---|---|
| Loop-back edge change (`plan` → `pre_check`) changes checkpointer state | LangGraph checkpoints persist state values but also record which nodes are scheduled next, so a thread interrupted under the old topology may still reference `plan`. Test that resume-after-interrupt still works, and let in-flight threads finish before the Phase 2 rename. |
| Region/currency questions produce decisions before `pre_check` existed | `pre_check_node` produces the same `PlannerDecision` objects that `plan_node` currently does for these cases, so there is no behavioral change. |
| `_build_plan_output` helper hides conditional logic | Unit-test it separately; it's pure data assembly. |
| Dynamic decomposition logic in `pre_check` depends on schema changes from the same function | Split `pre_check` into two steps if needed: `pre_check_guards` (pure guards) → `pre_check_preprocess` (schema mutation). |
| Redirect functions (`_redirect_derived_direct_decision`, etc.) read state that may have changed between `pre_check` and `acquire_or_plan` | They don't: each runs inside `acquire_or_plan` and reads the single `state` snapshot passed to that node. State only changes between node executions, when LangGraph applies each node's returned updates through its reducers. |

7. Comparison: Before vs After

Before (current)

START → plan(guards + prerequisites + preprocessing + acquisition + LLM + rewriting) → route → action nodes → loop back to plan

1 node, ~350 LOC, 9 return paths, 6 interleaved concerns, 0 graph-level routing intelligence.

After (proposed)

START → pre_check(guards + prerequisites + preprocessing) → route → acquire_or_plan(acquisition + LLM + rewriting) → route → action nodes → loop back to pre_check

2 nodes, ~150 + ~200 LOC, 7 return paths in pre_check (all early-return guards), 1 sequential path in acquire_or_plan. Graph encodes which checks happen before the LLM call.

8. Summary

The plan node mixes guard checks, prerequisite questions, schema preprocessing, deterministic routing, LLM calls, and post-LLM rewriting into a single 350-line function with 9 return paths. The proposal:

1. Extract pre_check node: all no-LLM guard checks and prerequisites run first and short-circuit to finish or ask_user as needed.
2. Rename plan → acquire_or_plan: it retains the LLM call, acquisition logic, and post-LLM rewriting.
3. Rewire loop-back edges to pre_check instead of plan, ensuring guards run every iteration.
4. Add route_after_pre_check conditional edge: it gives the graph actual routing intelligence instead of a trivial string dispatch.
5. Keep post-LLM rewriting in acquire_or_plan for Phase 1; extract to validate_decision node later if needed.
6. Extract _build_plan_output helper to collapse the 9-way conditional-return pattern into a single return with composable state assembly.

This makes the graph topology reflect the actual control flow, makes pre_check independently testable without LLM, and reduces cognitive load per node from ~350 LOC to ~150 LOC + ~200 LOC with clear separation of concerns.
