
Planner Graph 5.4 Proposal

Problem

plan_node() (src/venturescope/planner/agent.py:846) currently does too many jobs inside a single node:

  • iteration ticking and terminal checks
  • region/currency bootstrap gating
  • schema composition and dynamic decomposition prep
  • calculator-blocked acquisition recovery
  • deterministic completion checks
  • LLM planning
  • post-LLM policy rewrites and retry caps

That makes the graph in docs/planner-graph-ref/current-graph.md look simple while the real orchestration is hidden inside one node. As a result, checkpoints are harder to reason about, tests are harder to target, and changing any single policy carries more risk.

Proposal

Move orchestration policy to graph level, but do not explode every rule into its own node. The target shape should use a small decision pipeline:

  1. graph nodes for durable orchestration phases
  2. pure helpers for business rules and schema logic
  3. one slim LLM planning node that only proposes a PlannerDecision
  4. one policy node that normalizes that decision before routing

What should move out of plan_node()

1. Iteration and terminal control -> tick

Move these checks into a dedicated loop-entry node:

  • increment iterations
  • emit the planning-step event
  • short-circuit when planner is already aborted
  • stop on max_iters

All loopbacks should return to tick, not directly to the planner node. That preserves the current "one iteration per decision cycle" behavior used by attempt tracking in search_node(), observe_user_node(), and reflect_node().
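
To make the loop-entry contract concrete, here is a minimal sketch of what tick could look like as a plain function over a dict-shaped state. The state keys (iterations, max_iters, aborted, events, terminal), the event shape, and the "stop once the count exceeds max_iters" comparison are all assumptions for illustration; the real plan_node() may order or compare these differently.

```python
from typing import Any

State = dict[str, Any]


def tick_node(state: State) -> State:
    """Loop-entry node: the only place that changes `iterations`."""
    iterations = state.get("iterations", 0) + 1
    # Stand-in for the planning-step event; the real emitter is not shown here.
    event = {"type": "planning_step", "iteration": iterations}
    # Assumed terminal rule: already aborted, or the count exceeds max_iters.
    terminal = bool(state.get("aborted")) or iterations > state.get("max_iters", 10)
    return {
        "iterations": iterations,
        "events": [*state.get("events", []), event],
        "terminal": terminal,
    }


def route_after_tick(state: State) -> str:
    """Side-effect-free edge function: reads state, returns a route label."""
    return "terminal" if state.get("terminal") else "continue"
```

Because every loopback re-enters through tick_node, each decision cycle advances the counter exactly once, no matter which downstream node ran.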

2. Bootstrap questions -> bootstrap_gate

Move deterministic region/currency gating into a separate node:

  • _needs_region_question()
  • _needs_currency_question()
  • creation of deterministic PlannerDecision(action="ask_user", ...)

This keeps bootstrap policy visible in the graph instead of being buried as an early return.
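
A rough sketch of the gate, assuming a dict-shaped state. The PlannerDecision below is a stand-in with only the fields this example needs (question is an assumed field), and the two _needs_* helpers are placeholder implementations: the proposal names them but does not show their signatures or logic.

```python
from dataclasses import dataclass
from typing import Any, Optional

State = dict[str, Any]


@dataclass
class PlannerDecision:
    # Stand-in model: only `action` is named in the proposal.
    action: str
    question: Optional[str] = None


def _needs_region_question(state: State) -> bool:
    # Placeholder logic; the real helper's signature is an assumption.
    return not state.get("region")


def _needs_currency_question(state: State) -> bool:
    # Placeholder logic; the real helper's signature is an assumption.
    return not state.get("currency")


def bootstrap_gate_node(state: State) -> State:
    """Create a deterministic ask_user decision, or none if bootstrap is done."""
    if _needs_region_question(state):
        return {"decision": PlannerDecision("ask_user", "Which region applies?")}
    if _needs_currency_question(state):
        return {"decision": PlannerDecision("ask_user", "Which currency applies?")}
    return {"decision": None}


def route_after_bootstrap(state: State) -> str:
    """Pure router: the gate node decides, the edge only reads the result."""
    decision = state.get("decision")
    if decision is not None and decision.action == "ask_user":
        return "ask_user"
    return "ready"
```

The gate only writes a decision; pausing for the user still happens in ask_user, so the interrupt surface does not move.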

3. Schema/decomposition preparation -> prepare_context

Move context preparation before any planning decision:

  • _proactive_decompositions()
  • build_dynamic_recipes()
  • compose_ready_fields()
  • persistence of updated schema and dynamic_decompositions

The node should only orchestrate. Existing helpers should continue to own the domain logic.

4. Deterministic acquisition/calculation routing -> acquisition_gate

Move deterministic routing that does not require the planner LLM into a pre-LLM gate:

  • calculation-attempt cap
  • successful-calculation finish
  • blocked-calculation acquisition selection
  • blocked-field decomposition generation
  • open acquisition task selection
  • auto-finish checks when raw inputs are complete

This is the highest-value extraction because it removes calculator and acquisition policy from the LLM planning path.
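
One possible shape for the gate is an ordered list of small deterministic rules where the first rule that produces a decision wins and "no decision" means the LLM is needed. The sketch below covers only three of the bullets above; the state keys, rule names, default cap, and the outcome of each rule are illustrative assumptions, not the current behavior of plan_node().

```python
from typing import Any, Callable, Optional

State = dict[str, Any]
Decision = dict[str, Any]
Rule = Callable[[State], Optional[Decision]]


def calc_attempt_cap(state: State) -> Optional[Decision]:
    # Assumed outcome: finish once the calculation-attempt cap is reached.
    if state.get("calc_attempts", 0) >= state.get("max_calc_attempts", 3):
        return {"action": "finish", "reason": "calc_attempt_cap"}
    return None


def successful_calc_finish(state: State) -> Optional[Decision]:
    if state.get("calc_succeeded"):
        return {"action": "finish", "reason": "calculation_done"}
    return None


def open_acquisition_task(state: State) -> Optional[Decision]:
    # Assumed outcome: search for the first open acquisition task.
    tasks = state.get("open_tasks", [])
    if tasks:
        return {"action": "search", "field": tasks[0]}
    return None


# Order encodes priority; earlier rules win.
RULES: tuple[Rule, ...] = (calc_attempt_cap, successful_calc_finish, open_acquisition_task)


def acquisition_gate_node(state: State) -> State:
    """Return the first deterministic decision, or None to defer to the LLM."""
    for rule in RULES:
        decision = rule(state)
        if decision is not None:
            return {"decision": decision}
    return {"decision": None}


def route_after_acquisition(state: State) -> str:
    return "deterministic" if state.get("decision") else "needs_llm"
```

Keeping each rule a pure function means it can be unit-tested without building a full planner state, while the node itself stays a thin loop.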

5. Decision rewrites and retry policy -> decision_policy

Move post-LLM normalization into one deterministic node:

  • derived-field redirects
  • premature ask_user -> search redirect for web-preferred fields
  • duplicate/capped search fallback
  • ask-user retry cap
  • _adjust_calculation_decision()

Keep this as one policy node. Splitting every rewrite into separate nodes would add graph noise without improving clarity.
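
As an illustration of "one node, many rules", the rewrites can be pure functions chained inside a single node. This is a sketch under assumptions: the PlannerDecision stand-in, the state keys, the default retry cap, and the choice to fall back to finish when the ask-user cap is hit are all hypothetical, and only two of the rewrites listed above are shown.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Optional

State = dict[str, Any]


@dataclass(frozen=True)
class PlannerDecision:
    # Stand-in model; `field` is an assumed attribute.
    action: str
    field: Optional[str] = None


Rewrite = Callable[[PlannerDecision, State], PlannerDecision]


def redirect_premature_ask_user(decision: PlannerDecision, state: State) -> PlannerDecision:
    """Send ask_user to search when the field is web-preferred and not yet searched."""
    web_preferred = state.get("web_preferred_fields", set())
    searched = state.get("searched_fields", set())
    if (decision.action == "ask_user"
            and decision.field in web_preferred
            and decision.field not in searched):
        return replace(decision, action="search")
    return decision


def cap_ask_user_retries(decision: PlannerDecision, state: State) -> PlannerDecision:
    """Stop asking once the per-field cap is reached (fallback action is assumed)."""
    attempts = state.get("ask_user_attempts", {}).get(decision.field, 0)
    if decision.action == "ask_user" and attempts >= state.get("max_ask_user_attempts", 2):
        return replace(decision, action="finish")
    return decision


REWRITES: tuple[Rewrite, ...] = (redirect_premature_ask_user, cap_ask_user_retries)


def decision_policy_node(state: State) -> State:
    """Apply every rewrite in order and persist the normalized decision."""
    decision = state["decision"]
    for rewrite in REWRITES:
        decision = rewrite(decision, state)
    return {"decision": decision}
```

The graph sees one node and one set of outgoing edges, while each rewrite remains individually testable.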

What should stay in the planning node

Rename plan_node() to llm_plan_node() and keep only work that truly belongs to the planner model:

  • compute prompt inputs from current state
  • call planner_prompt(...)
  • call _llm().structured(..., schema=PlannerDecision)
  • fall back to finish when structured output is invalid

This node should propose a decision, not run the whole planner.
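
A minimal sketch of the slimmed node, with the model call injected as a callable so the example runs without an LLM. The propose parameter is a hypothetical stand-in for the planner_prompt(...) plus _llm().structured(...) pair; the PlannerDecision fields, the ValueError failure mode, and the allowed-action check are assumptions about what "invalid structured output" means.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]

# Actions taken from the target graph's decision_policy edges.
ALLOWED_ACTIONS = {"search", "ask_user", "reflect", "calculate", "finish"}


@dataclass
class PlannerDecision:
    # Stand-in model; `rationale` is an assumed field.
    action: str
    rationale: str = ""


def llm_plan_node(state: State, propose: Callable[[State], Any]) -> State:
    """Ask the model for a decision; fall back to finish if the output is unusable."""
    try:
        raw = propose(state)
    except ValueError:
        # Assumed failure mode for unparseable structured output.
        raw = None
    if not isinstance(raw, PlannerDecision) or raw.action not in ALLOWED_ACTIONS:
        return {"decision": PlannerDecision("finish", "invalid structured output")}
    return {"decision": raw}
```

Note what is absent: no iteration counting, no gating, no rewrites. Anything the node proposes still passes through decision_policy before routing.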

Target graph shape

flowchart TD
    START --> tick
    tick -->|continue| bootstrap_gate
    tick -->|terminal| finish

    bootstrap_gate -->|needs region/currency| ask_user
    bootstrap_gate -->|ready| prepare_context

    prepare_context --> acquisition_gate
    acquisition_gate -->|deterministic decision| decision_policy
    acquisition_gate -->|needs LLM| llm_plan

    llm_plan --> decision_policy

    decision_policy -->|search| search
    decision_policy -->|ask_user| ask_user
    decision_policy -->|reflect| reflect
    decision_policy -->|calculate| calculate
    decision_policy -->|finish| finish

    search -->|last_observation| observe
    search -->|no observation| tick
    observe --> tick
    calculate --> tick
    reflect --> tick
    ask_user --> observe_user
    observe_user --> tick
    finish --> END
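
The "all loopbacks return to tick" property of the flowchart above can be checked mechanically. The sketch below transcribes the edges into a plain dict (no graph library involved) and verifies that the graph has a cycle, but that no cycle survives once tick is removed; the EDGES table and has_cycle helper are illustrative and not part of the codebase.

```python
from typing import Optional

# Edge table transcribed from the target flowchart.
EDGES: dict[str, set[str]] = {
    "START": {"tick"},
    "tick": {"bootstrap_gate", "finish"},
    "bootstrap_gate": {"ask_user", "prepare_context"},
    "prepare_context": {"acquisition_gate"},
    "acquisition_gate": {"decision_policy", "llm_plan"},
    "llm_plan": {"decision_policy"},
    "decision_policy": {"search", "ask_user", "reflect", "calculate", "finish"},
    "search": {"observe", "tick"},
    "observe": {"tick"},
    "calculate": {"tick"},
    "reflect": {"tick"},
    "ask_user": {"observe_user"},
    "observe_user": {"tick"},
    "finish": {"END"},
    "END": set(),
}


def has_cycle(edges: dict[str, set[str]], skip: Optional[str] = None) -> bool:
    """Depth-first cycle detection that ignores the node named `skip`."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in edges}

    def visit(node: str) -> bool:
        color[node] = GREY
        for nxt in edges[node]:
            if nxt == skip:
                continue
            if color[nxt] == GREY:
                return True  # back edge: a cycle that avoids `skip`
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and n != skip and visit(n) for n in edges)


loops_exist = has_cycle(EDGES)                   # the planner loop itself
loops_bypass_tick = has_cycle(EDGES, skip="tick")  # any loop that avoids tick
```

A check like this could live in a unit test so that a later edge change which bypasses tick fails fast.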

Why this is a better fit

Graph-level visibility

Checkpointed phase boundaries become explicit:

  • loop entry
  • bootstrap interrupt gating
  • state preparation
  • deterministic acquisition/calc routing
  • LLM proposal
  • policy normalization

That matches the actual planner lifecycle better than the current single-node hub.

Better test seams

Current tests in tests/planner/test_planner_decisions.py heavily target plan_node(). After the split, tests can target:

  • tick for iteration/terminal semantics
  • bootstrap_gate for region/currency behavior
  • prepare_context for decomposition/schema composition
  • acquisition_gate for deterministic routing
  • decision_policy for redirect/cap behavior
  • llm_plan_node for prompt + structured-output handling

This isolates policy regressions instead of forcing broad plan_node() fixture coverage.

Better alignment with LangGraph

This follows the usual LangGraph split:

  • nodes do work and persist state updates
  • routing stays explicit in graph edges
  • interrupts remain at the surface node that actually pauses execution

Importantly, conditional edge functions should stay simple and side-effect-free. Mutating state in edge functions would make checkpoint behavior harder to reason about.

What should not change

  • ask_user remains the only interrupt() node
  • planner thread namespacing in planner_thread_id() stays unchanged
  • outer run_planner_step() bootstrap/resume contract stays unchanged
  • domain helpers keep owning decomposition, acquisition, validation, and schema merge logic
  • planner still emits exactly one actionable decision per iteration

This proposal is an internal graph refactor, not an outer orchestration redesign.

Migration order

Stage 1: extract tick

First, change START -> plan to START -> tick -> plan. Remove the iteration increment and terminal checks from plan_node(), but keep the rest intact.

This gives the cleanest first seam with minimal checkpoint churn.

Stage 2: extract bootstrap_gate

Move region/currency gating out next. This preserves current interrupt behavior because the graph still routes through ask_user -> observe_user.

Stage 3: extract prepare_context

Move proactive decomposition and schema composition out before any planning decision.

Stage 4: extract acquisition_gate

Move deterministic calculator and acquisition routing out of the LLM path.

Stage 5: slim to llm_plan_node()

Reduce the old planner node to prompt construction and structured decision generation, leaving the post-LLM rewrites in place until Stage 6.

Stage 6: extract decision_policy

Move all decision rewrites and retry caps into one post-planning policy node.

Only after behavior is stable should tests be renamed/reorganized.

Risks and anti-patterns to avoid

Do not move state mutation into edge functions

Conditional edges should inspect state and return route labels only. Nodes should own mutations so checkpoints capture phase changes explicitly.
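
For example, a router after the policy node might look like the sketch below: it reads the already-normalized decision and returns a label, nothing more. The dict-shaped state and decision, and the choice to route unknown actions to finish, are assumptions for illustration.

```python
from typing import Any

State = dict[str, Any]

# Route labels from the target graph's decision_policy edges.
ROUTES = ("search", "ask_user", "reflect", "calculate", "finish")


def route_after_policy(state: State) -> str:
    """Pure edge function: no writes to `state`, only a route label out."""
    decision = state.get("decision") or {}
    action = decision.get("action")
    # Assumed safety net: anything unrecognized ends the run.
    return action if action in ROUTES else "finish"
```

If a rule needs to change the decision, it belongs in the node that ran before this edge, where the change is captured in a checkpoint.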

Do not increment iterations in every gate

Only tick should change iterations. Otherwise attempt logs and turn-scoped search extraction in run_planner_step() will drift.

Do not turn every policy rule into a node

One decision_policy node is enough. A longer chain like derived_redirect -> source_policy -> retry_policy -> calculator_policy would make the graph harder to follow than it is today.

Do not move interrupts into policy gates

ask_user should remain the single interrupting node. Gate nodes should create decisions, not pause execution themselves.

Recommendation

Adopt the staged graph split above.

The key design rule is:

deterministic orchestration belongs in graph phases; business logic stays in helpers; LLM planning stays small.

That keeps the planner graph honest: the graph shows the lifecycle, while helpers keep the domain complexity out of routing glue.