Proposal: Decompose `plan_node` into Graph-Level Logic

Status: Proposal
Date: 2026-06-07
Author: Sisyphus (deepseek-v4-pro)

1. Problem Statement

The current plan_node() function in src/venturescope/planner/agent.py is a 347-line monolith that handles all of the following in a single function:

Iteration counting and early abort checks
Region/currency bootstrap questions
Dynamic decomposition generation and field composition
Calculator lifecycle checks (cap, success, blocked)
Blocked calculator → acquisition task routing
Auto-completeness detection
LLM prompt construction and structured-output call
Post-LLM decomposition for requires_components fields
Decision redirection (derived-field, web-preferred)
Search/ask-user attempt cap enforcement
Calculation decision adjustment (finish→calculate, blocked→reflect)
Logging, event emission, and output assembly

The consequence is that route_after_plan() (L2070-L2073) is trivially 4 lines — it just reads decision.action from state. All routing intelligence lives inside plan_node as 15 early-return paths scattered across the function body.

This makes the graph's "plan" node a black box where:

- Routing is invisible at the graph level: reading the _build_state_graph() edges gives no insight into actual flow
- Testing is coarse-grained: you test plan_node as one unit or not at all
- Debugging is hard: a wrong decision at line 1139 requires tracing through 11 preceding paths to understand the state
- New contributors struggle: the function mixes gatekeeping, preparation, LLM, and correction concerns

2. Proposed Architecture

2.1 Design Principle

Move logic that determines "where to go next" from inside a node to the graph's conditional edges. The plan_node should produce a raw LLM-level decision. A dedicated adjustment node, connected via conditional edges, should refine it. Pre-LLM gatekeeping should be separate nodes that can short-circuit the flow.

2.2 New Graph Structure

flowchart TD
    planner_start([START]) --> guard[guard]

    guard -->|action=ask_user, region/currency missing| ask_user
    guard -->|aborted or max_iters| finish
    guard -->|continue| prepare[prepare]

    prepare -->|calculator_cap_exhausted or calculator_success| finish
    prepare -->|blocked_calculator has acquisition task| adjust
    prepare -->|auto_complete| adjust
    prepare -->|continue| plan[plan]

    plan -->|inferred decomposition needed| plan
    plan -->|continue| adjust[adjust]

    adjust -->|action=search| search[search]
    adjust -->|action=ask_user| ask_user[ask_user / interrupt]
    adjust -->|action=calculate| calculate[calculate]
    adjust -->|action=reflect| reflect[reflect]
    adjust -->|action=finish| finish[finish]

    search -->|last_observation present| observe[observe]
    search -->|no hits or backend failure| guard

    observe --> guard
    calculate --> guard
    ask_user --> observe_user[observe_user]
    observe_user --> guard
    reflect --> guard
    finish --> planner_end([END])

2.3 Node Responsibilities

`guard`: Early-Exit Gatekeeper (~20 lines)

Replaces plan_node lines 846-887 (early portion).

Pure gatekeeping. Does not prepare decomposition or compose fields. Short-circuits to ask_user or finish when appropriate; otherwise passes control to prepare.

Condition	Action	Destination
`status == "aborted"`	set decision=finish	`finish`
`iterations > max_iters`	set decision=finish	`finish`
Region missing, retries < 3	set decision=ask_user(core.region)	`ask_user`
Currency missing, retries < 3	set decision=ask_user(core.currency)	`ask_user`
None of the above	increment iterations, pass through	`prepare`

State output: iterations, decision (if exiting), status
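The table above can be read as an ordered list of checks. The following is a minimal sketch, assuming a plain-dict state in place of the real State and PlannerDecision types. The bootstrap_retries counter and the target key are hypothetical names introduced for illustration. Clearing decision on pass-through is also an assumption, made so that an action left over from the previous iteration cannot steer route_after_guard.

```python
from __future__ import annotations

from typing import Any

MAX_BOOTSTRAP_RETRIES = 3  # mirrors the "retries < 3" rows of the table


def guard_node(state: dict[str, Any]) -> dict[str, Any]:
    """Return a partial state update; no field composition, no LLM call."""
    if state.get("status") == "aborted" or state["iterations"] > state["max_iters"]:
        return {"decision": {"action": "finish"}}

    core = state["schema"].get("core", {})
    retries = state.get("bootstrap_retries", {})  # hypothetical per-field counter
    for field in ("region", "currency"):  # region is checked before currency
        if core.get(field) is None and retries.get(field, 0) < MAX_BOOTSTRAP_RETRIES:
            return {"decision": {"action": "ask_user", "target": f"core.{field}"}}

    # Pass-through: bump the counter and drop any decision from the last iteration.
    return {"iterations": state["iterations"] + 1, "decision": None}


def route_after_guard(state: dict[str, Any]) -> str:
    """Map the guard outcome to prepare / ask_user / finish."""
    decision = state.get("decision")
    return decision["action"] if decision else "prepare"
```

Because guard_node returns only the keys it changes, a test can merge its output into the input state and assert on the route, without building the graph.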

`prepare`: State Preparation + Calculator Gating (~50 lines)

Replaces plan_node lines 889-1035.

Handles everything that needs a fully composed schema before making decisions:

- Dynamic decomposition generation (_proactive_decompositions)
- Recipe building (build_dynamic_recipes)
- Field composition (compose_ready_fields)
- Calculator lifecycle checks (_profile_has_calculator, _successful_calculation_current, cap exhaustion)
- Blocked calculator → next_acquisition_task
- Auto-completeness check (iter_schema_leaves, missing_leaves, acquisition_task_summary)

Condition	Action	Destination
Calculator cap exhausted	decision=finish	`finish`
Calculator completed successfully	decision=finish	`finish`
Blocked calculator → acquisition task found	decision from task	`adjust`
Auto-complete OK, no open tasks	decision=finish	`adjust` (so the calculation adjustment still applies)
Auto-complete not OK, acquisition task found	decision from task	`adjust`
None of the above	pass through	`plan`

State output: schema (composed), dynamic_decompositions, decision (if exiting)

Key design choice: Auto-complete and acquisition task decisions route through adjust rather than directly to their final destination. This ensures calculator decision adjustments (_adjust_calculation_decision) and cap checks still apply. The alternative (routing directly to ask_user/search/finish) would duplicate _adjust_calculation_decision logic.
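One detail the table leaves open: decision=finish appears both in rows that exit to finish and in a row that exits to adjust, so decision.action alone may not be enough for route_after_prepare to tell them apart. The sketch below is one possible way to handle this, not the proposal's stated design: prepare records an explicit exit label. The prepare_exit key and the classify_prepare_exit helper are hypothetical names, and the boolean inputs stand in for the real lifecycle checks.

```python
from __future__ import annotations

from typing import Any, Literal, Optional

PrepareExit = Literal["plan", "adjust", "finish"]


def classify_prepare_exit(
    *,
    calculator_cap_exhausted: bool,
    calculator_succeeded: bool,
    blocked_task: Optional[Any],
    auto_complete_ok: bool,
    open_task: Optional[Any],
) -> PrepareExit:
    """Apply the table rows top to bottom; the first match wins."""
    if calculator_cap_exhausted or calculator_succeeded:
        return "finish"
    if blocked_task is not None:
        return "adjust"
    if auto_complete_ok and open_task is None:
        return "adjust"  # decision=finish, but adjust may still rewrite it
    if not auto_complete_ok and open_task is not None:
        return "adjust"
    return "plan"


def route_after_prepare(state: dict[str, Any]) -> str:
    # prepare_exit is a hypothetical state key written by prepare_node.
    return state.get("prepare_exit", "plan")
```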

`plan`: LLM Decision (~40 lines)

Replaces plan_node lines 1037-1088.

Pure LLM interaction. No pre-LLM gatekeeping, no post-LLM correction:

- Build planner_prompt with full context
- Call _llm().structured() with PlannerDecision schema
- Handle LLM errors (fallback to finish)
- If the LLM targets a requires_components field without an existing decomposition, generate one and loop back to plan (via the conditional edge) to re-prompt with the new recipes

State output: decision (raw LLM output or error fallback), dynamic_decompositions (if augmented)

Conditional edge after plan:

- If decomposition was generated and decision.action is still search/ask_user targeting the same field → loop back to plan (re-prompt with updated recipes)
- Otherwise → adjust
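A self-loop on an LLM node can spin if the model keeps targeting the same field, so a bounded version of this edge may be worth considering. The sketch below assumes a plain-dict state. The keys decomposition_generated, decomposed_field, and replan_count, as well as the MAX_REPLANS bound, are hypothetical and not part of the proposal.

```python
from __future__ import annotations

from typing import Any

MAX_REPLANS = 1  # assumed bound on re-prompts per iteration


def route_after_plan(state: dict[str, Any]) -> str:
    """Loop back to plan only when a fresh decomposition warrants a re-prompt."""
    decision = state["decision"]
    wants_replan = (
        state.get("decomposition_generated", False)
        and decision["action"] in ("search", "ask_user")
        and decision.get("target") == state.get("decomposed_field")
    )
    # replan_count is assumed to be incremented by plan_node each time it is
    # re-entered through this edge, which keeps the loop finite.
    if wants_replan and state.get("replan_count", 0) < MAX_REPLANS:
        return "plan"
    return "adjust"
```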

`adjust`: Decision Correction (~80 lines)

Replaces plan_node lines 1090-1192.

Post-LLM decision refinement. This is a pure transformation node (no LLM calls, only state inspection and decision rewriting):

- _redirect_derived_direct_decision: component-derived fields → acquisition tasks or reflect
- _redirect_premature_ask_for_web_field: web-preferred fields → force search first
- Search cap / duplicate query detection → force ask_user
- Ask-user cap detection → finish
- _adjust_calculation_decision: finish→calculate, blocked→reflect
- Logging and event emission

State output: decision (final, corrected), status, schema, dynamic_decompositions

After adjust: route_after_adjust reads decision.action and routes to search/ask_user/calculate/reflect/finish. This replaces the current route_after_plan.
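Since adjust makes no LLM calls, it can be modeled as an ordered list of pure rewrite rules. The sketch below illustrates that shape only. The three rule functions are simplified stand-ins, not the real _redirect_* or _adjust_calculation_decision helpers, and the state keys web_preferred, search_attempts, search_cap, and calculator_ready are hypothetical.

```python
from __future__ import annotations

from typing import Any, Callable

Decision = dict[str, Any]
Rule = Callable[[dict[str, Any], Decision], Decision]
ACTIONS = frozenset({"search", "ask_user", "calculate", "reflect", "finish"})


def force_search_for_web_fields(state: dict[str, Any], decision: Decision) -> Decision:
    if decision["action"] == "ask_user" and decision.get("target") in state.get("web_preferred", ()):
        return {**decision, "action": "search"}
    return decision


def enforce_search_cap(state: dict[str, Any], decision: Decision) -> Decision:
    if decision["action"] == "search" and state.get("search_attempts", 0) >= state.get("search_cap", 3):
        return {**decision, "action": "ask_user"}
    return decision


def finish_to_calculate(state: dict[str, Any], decision: Decision) -> Decision:
    if decision["action"] == "finish" and state.get("calculator_ready", False):
        return {**decision, "action": "calculate"}
    return decision


# Order matters: later rules see the output of earlier ones.
ADJUST_RULES: tuple[Rule, ...] = (force_search_for_web_fields, enforce_search_cap, finish_to_calculate)


def adjust_node(state: dict[str, Any]) -> dict[str, Any]:
    decision = state["decision"]
    for rule in ADJUST_RULES:
        decision = rule(state, decision)
    return {"decision": decision}


def route_after_adjust(state: dict[str, Any]) -> str:
    action = state["decision"]["action"]
    if action not in ACTIONS:
        raise ValueError(f"unroutable action: {action!r}")
    return action
```

Keeping the rules in one tuple makes the ordering explicit and lets each rule be tested on its own. For example, a web-preferred ask_user is first turned into search and then, if the search cap is already reached, back into ask_user.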

2.4 Edge Changes

from typing import Any

from langgraph.graph import END, START, StateGraph


def _build_state_graph() -> Any:
    builder = StateGraph(State)
    # Node registrations
    builder.add_node("guard", guard_node)
    builder.add_node("prepare", prepare_node)
    builder.add_node("plan", plan_node)
    builder.add_node("adjust", adjust_node)
    builder.add_node("search", search_node)
    builder.add_node("observe", observe_node)
    builder.add_node("calculate", calculate_node)
    builder.add_node("ask_user", ask_user_node)
    builder.add_node("observe_user", observe_user_node)
    builder.add_node("reflect", reflect_node)
    builder.add_node("finish", finish_node)

    builder.add_edge(START, "guard")

    # guard routing
    builder.add_conditional_edges(
        "guard",
        route_after_guard,
        {"prepare": "prepare", "ask_user": "ask_user", "finish": "finish"},
    )

    # prepare routing: two short-circuit destinations (adjust, finish) + pass-through to plan
    builder.add_conditional_edges(
        "prepare",
        route_after_prepare,
        {"plan": "plan", "adjust": "adjust", "finish": "finish"},
    )

    # plan → may loop to itself for decomposition, else adjust
    builder.add_conditional_edges(
        "plan",
        route_after_plan,
        {"plan": "plan", "adjust": "adjust"},
    )

    # adjust → final routing to action nodes
    builder.add_conditional_edges(
        "adjust",
        route_after_adjust,
        {
            "search": "search",
            "reflect": "reflect",
            "ask_user": "ask_user",
            "calculate": "calculate",
            "finish": "finish",
        },
    )

    # Remaining edges keep their current shape, but loop back to guard instead of plan
    builder.add_conditional_edges(
        "search",
        route_after_search,
        {"observe": "observe", "guard": "guard"},
    )
    builder.add_edge("observe", "guard")
    builder.add_edge("calculate", "guard")
    builder.add_edge("ask_user", "observe_user")
    builder.add_edge("observe_user", "guard")
    builder.add_edge("reflect", "guard")
    builder.add_edge("finish", END)
    return builder

Key routing change: Nodes that previously returned to plan now return to guard. This is semantically correct because guard checks the abort/max_iters conditions at the top of every iteration — currently plan_node does this manually. With the decomposed graph, guard becomes the canonical entry point for every iteration.

2.5 New Routing Functions

# Current
def route_after_plan(state: State) -> str:  # 4 lines, trivial
    return state["decision"].action

# Proposed (signatures only; bodies elided with `...`)
def route_after_guard(state: State) -> str: ...    # reads decision.action, maps to prepare/ask_user/finish
def route_after_prepare(state: State) -> str: ...  # reads decision.action, maps to plan/adjust/finish
def route_after_plan(state: State) -> str: ...     # checks if decomposition was needed, maps to plan/adjust
def route_after_adjust(state: State) -> str: ...   # reads decision.action, maps to all action nodes

Each routing function is small (3-8 lines) and does one specific mapping. The logic that determines which action to take lives in the nodes, not the routing functions — this is the key architectural improvement over the current state where plan_node both decides AND routes.

3. State Changes

3.1 New State Field

class State(TypedDict):
    # ... existing fields ...
    recipes: dict[str, FieldAcquisition]  # NEW: cached recipes from build_dynamic_recipes

Currently, recipes (built from build_dynamic_recipes) is a local variable in plan_node that gets recomputed in observe_user_node and other nodes. Adding it to state avoids redundant recomputation and makes it available to all nodes.
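Caching recipes in state only helps if stale entries are detected. One possible approach, sketched below under stated assumptions, stores a fingerprint of dynamic_decompositions next to the cached recipes and rebuilds only when it changes. The recipes_fingerprint key and both helper names are hypothetical, and build_recipes is a stand-in callable because the real build_dynamic_recipes signature is not shown here.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any, Callable


def decompositions_fingerprint(decompositions: dict[str, Any]) -> str:
    """Stable digest of the decompositions, independent of key order."""
    payload = json.dumps(decompositions, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def refresh_recipes(
    state: dict[str, Any],
    build_recipes: Callable[[dict[str, Any]], dict[str, Any]],
) -> dict[str, Any]:
    """Return a partial state update; empty when the cached recipes are current."""
    decompositions = state.get("dynamic_decompositions", {})
    fingerprint = decompositions_fingerprint(decompositions)
    if "recipes" in state and state.get("recipes_fingerprint") == fingerprint:
        return {}
    return {
        "recipes": build_recipes(decompositions),
        "recipes_fingerprint": fingerprint,
    }
```

Any node that mutates dynamic_decompositions could call refresh_recipes before returning, which keeps the two fields in step without each node deciding on its own when to rebuild.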

3.2 Schema Storage

Currently plan_node composes a schema_dict from compose_ready_fields and keeps it as a local variable. With the decomposed graph, prepare writes the composed schema directly to state["schema"] (via its return dict), so when guard is re-entered on the next iteration, the composed schema is already in state.

This is safe because guard only reads schema to check core.region/core.currency values — it doesn't need the raw/uncomposed schema.

4. Migration Path

Phase 1: Extract helper node functions (no graph changes)

Extract guard_node() from plan_node lines 846-887
Extract prepare_node() from plan_node lines 889-1035
Extract plan_node() as lines 1037-1088 (pure LLM)
Extract adjust_node() from plan_node lines 1090-1192

Each extraction is a mechanical move with no behavioral changes, so the existing tests should pass after each step.
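During Phase 1 the graph still has a single plan node, so the extracted functions need a temporary driver inside it. The sketch below shows one way this might look: a small loop that runs each extracted step and uses the future routing functions to pick the next one, stopping as soon as a route names something outside the four steps. The function name plan_node_phase1 and the nodes/routes dictionaries are hypothetical.

```python
from __future__ import annotations

from typing import Any, Callable

Node = Callable[[dict[str, Any]], dict[str, Any]]
Route = Callable[[dict[str, Any]], str]


def plan_node_phase1(
    state: dict[str, Any],
    nodes: dict[str, Node],
    routes: dict[str, Route],
    start: str = "guard",
) -> dict[str, Any]:
    """Run guard -> prepare -> plan -> adjust in-process and return one merged update."""
    merged = dict(state)  # working copy; the caller's state is not mutated
    update: dict[str, Any] = {}
    current = start
    while current in nodes:
        partial = nodes[current](merged)
        update.update(partial)
        merged.update(partial)
        # A route such as "search" or "finish" is not an extracted step, so the
        # loop ends and the existing graph edges take over from there.
        current = routes[current](merged)
    return update
```

With this shape, Phase 2 becomes a matter of registering the same nodes and routes on the graph and deleting the driver.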

Phase 2: Update graph construction

Add new nodes to _build_state_graph()
Replace START → plan edge with START → guard
Replace plan conditional edges with the new routing chain
Repoint every edge that currently returns to plan (from observe, calculate, observe_user, reflect, and the search fallback) to guard
Add recipes to State

Phase 3: Remove old code

Remove the old route_after_plan body (its action dispatch moves to route_after_adjust; the name is reused for the plan → plan/adjust edge)
Clean up any dead code

Phase 4: Verify

Run tests/planner/test_planner_agent.py — node-level tests
Run tests/planner/test_planner_runner.py — integration tests
Run make test — full suite
Run linters and type checks: ruff check . && ruff format --check . && mypy src

5. Trade-offs

Advantages

Aspect	Before	After
Graph readability	`plan` is a black box; edges tell you nothing	Each node's role is visible at the graph level
Test granularity	Test `plan_node` as one 347-line unit	Test `guard`, `prepare`, `plan`, `adjust` independently
Debugging	Trace through 15 early-return paths in one function	Each node has 1-3 exit paths, easy to isolate
Extensibility	Adding a new gate requires inserting into `plan_node` body	Add a node + edge, no surgery on existing code
Routing transparency	`route_after_plan` is trivial because all logic is hidden	Each routing function has a clear, narrow contract

Disadvantages

Concern	Mitigation
More graph nodes (8 → 11)	LangGraph overhead is negligible; the graph is small
More state I/O (nodes pass more state)	LangGraph merges partial state dicts efficiently
More routing functions (2 → 5)	Each is 3-8 lines, total complexity is lower
`recipes` in state (new field)	Small dict, already computed in 2 places; caching avoids redundant work
`guard` re-entry from action nodes	Same pattern as current `plan` re-entry; just a different node name

Risk Assessment

Low risk: The decomposition is purely structural — the same helper functions (_needs_region_question, _adjust_calculation_decision, etc.) are called in the same order with the same inputs. No logic changes.
Medium risk: recipes in state must stay synchronized with dynamic_decompositions changes. Mitigated by having prepare and plan both rebuild recipes when decompositions change.
Testing risk: Tests that mock plan_node directly will need updating. Mitigated by Phase 1 extraction before Phase 2 graph changes.

6. Alternatives Considered

6.1 Deeper decomposition (16 nodes)

Splitting every check into its own node (region_node, currency_node, calc_cap_node, etc.) would create ~16 graph nodes with mostly single-condition exits. Rejected: Too chatty. The LangGraph overhead per node iteration adds up, and many checks share the same output destinations.

6.2 Keep plan_node but add conditional-edge overrides

Keep plan_node as-is but add post-plan conditional edges that further refine routing. Rejected: Doesn't solve the monolith problem — plan_node would still have 15 early-return paths. The graph would show routing that doesn't actually happen.

6.3 Move calculator logic into its own subgraph

Extract _adjust_calculation_decision and related checks into a calculator_gate subgraph. Rejected: Over-engineering. The calculator lifecycle is a single concern with 3 states (not-run, blocked, success) that maps cleanly into the prepare → adjust pipeline.

6.4 Extract only `guard` (minimal change)

Move only lines 846-887 into a separate guard node, keep the rest in plan_node. Rejected: This leaves the bulk of the problem untouched. plan_node would still be ~270 lines handling 10+ concerns.

7. Summary

The proposal replaces a single 347-line plan_node with a 4-node pipeline (guard → prepare → plan → adjust) connected by purpose-built conditional edges. The transformation is structural: the same functions are called in the same order. Total new code is ~80 lines of routing functions + ~60 lines of node boilerplate, offset by deleting ~100 lines of embedded early-return logic from the original plan_node. Net code change is approximately neutral, but the architecture becomes significantly more maintainable and testable.

Recommended next step: If this proposal is accepted, begin with Phase 1 (extraction without graph changes) to validate that the decomposition is correct before changing any edges.
