Proposal 5.5: Move Planner Control Logic to Graph Level¶

Status¶

Proposed

Context¶

current-graph.md shows the planner graph as a small ReAct-style loop:

START -> plan
plan -> search | ask_user | reflect | calculate | finish
search -> observe | plan
observe -> plan
calculate -> plan
ask_user -> observe_user -> plan
reflect -> plan
finish -> END

The visible graph is simple, but most of the real orchestration is hidden inside plan_node() in src/venturescope/planner/agent.py. Today plan_node() is not only the LLM planning node. It also:

increments iteration count and handles iteration caps;
asks mandatory region and currency questions before normal planning;
generates and composes dynamic component decompositions;
applies calculator preconditions and calculator stop conditions;
selects acquisition tasks after blocked calculator runs;
detects completion and rewrites finish to calculate when needed;
calls the planner LLM;
rewrites unsafe LLM decisions, including aggregate-to-component redirects and web-preferred ask_user redirects;
enforces search and ask-user retry caps.

This makes the graph diagram less useful than it should be: the graph says "plan -> action", but the plan node contains several deterministic gates and fallback routes that are effectively graph transitions. The result is harder to test, harder to reason about, and easier to regress when adding new acquisition policy such as calculator-backed component loops or U+ref benchmark hops.

Decision¶

Move deterministic orchestration out of plan_node() and into explicit graph nodes and conditional edges. Keep plan_node() responsible for only one thing: building the planner prompt and returning the LLM's raw PlannerDecision when no deterministic graph-level gate has already selected the next action.

A. Target graph shape¶

Replace the current START -> plan -> action topology with a control pipeline:

flowchart TD
    planner_start([START]) --> enter_iteration[enter_iteration]

    enter_iteration -->|status aborted or max_iters exceeded| finish[finish]
    enter_iteration --> prepare_schema[prepare_schema]

    prepare_schema -->|region missing| require_region[require_region]
    prepare_schema -->|currency missing| require_currency[require_currency]
    prepare_schema -->|calculator cap hit| finish
    prepare_schema -->|calculator already current| finish
    prepare_schema --> acquisition_gate[acquisition_gate]

    require_region --> ask_user[ask_user / interrupt]
    require_currency --> ask_user

    acquisition_gate -->|component task selected| normalize_decision[normalize_decision]
    acquisition_gate -->|all raw inputs ready| maybe_calculate[maybe_calculate]
    acquisition_gate -->|LLM needed| plan[plan]

    plan --> normalize_decision
    normalize_decision --> retry_gate[retry_gate]
    retry_gate --> maybe_calculate

    maybe_calculate -->|calculator required| calculate[calculate]
    maybe_calculate -->|search| search[search]
    maybe_calculate -->|ask_user| ask_user
    maybe_calculate -->|reflect| reflect[reflect]
    maybe_calculate -->|finish| finish

    search -->|last_observation present| observe[observe]
    search -->|no hits or backend failure| enter_iteration
    observe --> enter_iteration
    calculate --> enter_iteration
    ask_user --> observe_user[observe_user]
    observe_user --> enter_iteration
    reflect --> enter_iteration
    finish --> planner_end([END])

The important change is not the exact node names. The important change is that the graph exposes the actual control stages that currently happen inside plan_node().

B. Node responsibility split¶

`enter_iteration`¶

Owns iteration lifecycle only.

increments iterations once per planner loop;
routes directly to finish when status == "aborted";
routes directly to finish when iterations > settings.max_iters;
emits the "planning step" progress event.

It should not inspect schema completeness, profile recipes, calculator status, or LLM output.

`prepare_schema`¶

Owns deterministic state preparation before any routing decision.

ensures mandatory dynamic decompositions exist for known requires-components paths;
builds dynamic acquisition recipes from dynamic_decompositions and static_field_acquisition;
composes ready component fields into aggregate parameters;
writes schema and dynamic_decompositions back when they changed.

This node turns the current inline dynamic_decomps, build_dynamic_recipes(), and compose_ready_fields() block into a reusable graph stage. Downstream nodes can then assume schema and recipes are current.

`require_region` and `require_currency`¶

Own mandatory bootstrap questions.

create the PlannerDecision(action="ask_user", target_field="core.region") when region is missing and retries remain;
create the PlannerDecision(action="ask_user", target_field="core.currency") when currency is missing and retries remain;
route to the existing ask_user node.

This removes the special first-question branches from plan_node() while keeping the existing observe_user_node() handling for region and currency answers.

`acquisition_gate`¶

Owns deterministic field/component task selection.

if the last calculator run was BLOCKED, call next_acquisition_task() with calculator errors prioritized;
if no actionable missing fields remain but open component tasks exist, select the next component task;
if all raw inputs are collected and no acquisition tasks remain, emit PlannerDecision(action="finish", reasoning="All raw inputs collected.");
otherwise route to plan.

This stage makes acquisition policy a graph-level gate rather than a hidden precondition inside the LLM planner.

`plan`¶

Owns only the LLM planning decision.

builds planner_prompt() from the current state;
calls _llm().structured(..., schema=PlannerDecision);
on structured-output failure, emits a finish decision and marks the failure so downstream calculator rewrite logic does not create an LLM failure loop.

It should not enforce calculator caps, search caps, ask-user caps, direct/derived redirects, or mandatory region/currency questions.

`normalize_decision`¶

Owns deterministic correction of a proposed decision.

rewrites direct aggregate decisions for derived fields into component tasks;
forces search when a web-preferred field was incorrectly targeted with ask_user before any search attempt;
generates a decomposition if the LLM targets a requires-components field that has not yet been decomposed;
adds blocked-calculator wording to user questions when needed.

This preserves current safety behavior while making it explicit that these are policy corrections applied after a decision is proposed, not part of planning.

`retry_gate`¶

Owns per-field retry and fallback policy.

if a search decision repeats an existing query or exceeds max_attempts_per_field, rewrite to the appropriate component ask_user, a direct ask_user, or reflect when no component task remains;
if an ask-user decision exceeds max_attempts_per_field, rewrite to finish and mark status="aborted";
otherwise pass the decision through unchanged.

This makes retry limits visible in the graph instead of burying them after LLM planning.

`maybe_calculate`¶

Owns calculator routing.

if the current decision is finish, the profile has a calculator, calculation has not succeeded, and the calculation cap is not reached, rewrite to calculate;
if the calculator is blocked and the decision is still finish, rewrite to reflect unless the LLM itself failed;
route the normalized decision to search, ask_user, reflect, calculate, or finish.

This replaces _adjust_calculation_decision() as hidden plan-node logic.

C. State changes¶

Keep state changes minimal.

Recommended additions to State / PlannerState:

llm_failed: bool or decision_origin: Literal["deterministic", "llm", "llm_error"] so maybe_calculate can preserve the current safeguard: a failed planner LLM should finish rather than redirect to reflect and fail again.

Optional additions if implementation readability needs them:

prepared_recipes: dict[str, FieldAcquisition] should not be added unless serialization of FieldAcquisition is proven safe and necessary. Prefer rebuilding recipes from dynamic_decompositions and static_field_acquisition in each node that needs them.
route_reason: str can help tests and event logs, but should not become part of user-visible state.

No SQL persistence changes are required. Planner state remains checkpoint-owned.

D. Keep these responsibilities out of the outer conversation graph¶

This proposal moves logic to the planner subgraph level, not to conversation/graph.py.

The outer graph should remain:

route_if_needed -> planner_step -> persist -> END

The planner still owns attempts, decisions, search results, component decompositions, calculator feedback, and user interrupts. Moving these concerns to the outer graph would revive the legacy question-loop ownership that the project explicitly avoids.

Migration plan¶

Step 1: Extract pure helpers without changing graph shape¶

Create small helper functions that mirror the future node boundaries:

_enter_iteration(state) -> dict[str, Any]
_prepare_schema(state) -> dict[str, Any]
_region_decision(state) -> PlannerDecision | None
_currency_decision(state) -> PlannerDecision | None
_acquisition_decision(state) -> PlannerDecision | None
_normalize_decision(state, decision) -> dict[str, Any]
_apply_retry_policy(state, decision) -> dict[str, Any]
_apply_calculator_policy(state, decision) -> PlannerDecision

Initially call them from plan_node() in the current order. This reduces risk and gives tests stable seams before changing graph topology.

Step 2: Add graph nodes one group at a time¶

Introduce nodes in this order:

enter_iteration
prepare_schema
require_region / require_currency
acquisition_gate
normalize_decision
retry_gate
maybe_calculate

After each group, preserve the existing public graph surface:

build_planner_graph(checkpointer) remains the builder entry point;
run_planner_step() still invokes bootstrap and resume in the same way;
interrupt payload shape from ask_user_node() is unchanged;
PlannerDecision remains the action carrier used by action nodes.

Step 3: Rename `plan_node()` only after behavior is stable¶

Once deterministic gates are graph nodes, plan_node() can stay named plan_node() but should be short and LLM-only. Avoid renaming during the same change unless tests and docs are updated together; existing tests import plan_node directly.

Step 4: Update reference docs¶

After implementation, replace current-graph.md with the new Mermaid graph and move this proposal into an accepted ADR or link it from the planner graph reference.

Tests to update or add¶

Existing tests in tests/planner/test_planner_decisions.py currently exercise many graph-level policies through plan_node(). Move those assertions toward the new policy nodes:

calculator-before-finish routes through maybe_calculate;
search cap fallback routes through retry_gate;
direct aggregate ask/search redirects route through normalize_decision;
blocked calculator component selection routes through acquisition_gate;
region and currency first-question behavior routes through require_region and require_currency;
plan_node() tests should focus on prompt construction, LLM structured output, and LLM failure fallback only.

Keep integration tests for run_planner_step() to verify checkpoint resume, interrupt behavior, and final status still work through the compiled graph.

Consequences¶

Positive¶

Graph matches behavior. The Mermaid graph becomes a useful map of real control flow instead of hiding most routing inside plan.
Smaller LLM node. plan becomes easier to reason about because it only asks the model for a decision when deterministic gates cannot decide.
Better test seams. Retry caps, calculator routing, acquisition policy, and bootstrap questions can be unit-tested without scripting planner LLM responses.
Safer future policy work. New graph-level policies such as U+ref benchmark hops can be added as nodes or gates without enlarging plan_node() again.
Clearer failure behavior. LLM-output failures, calculator-blocked loops, and max-iteration aborts become explicit transitions.

Negative¶

More nodes and edges. The graph becomes visually larger even though the actual behavior is not more complex.
More state handoff discipline. Each node must write only the fields it owns; accidental writes to decision, schema, or status will be easier to introduce if node contracts are not kept tight.
Migration risk. Existing tests import plan_node() and assume it performs all deterministic policies. The refactor should be staged to avoid a large behavior-changing rewrite.
Potential checkpoint compatibility concerns. Adding optional state fields is safe, but removing or changing existing fields in PlannerState would risk old planner checkpoints. This proposal avoids removals.

Non-goals¶

Do not move planner logic into the outer conversation graph.
Do not redesign PlannerDecision action names.
Do not change search backend behavior.
Do not change schema merge or calculator semantics.
Do not add SQL persistence for planner internals.

Source of truth while implementing¶

Current graph reference: docs/planner-graph-ref/current-graph.md
Current implementation: src/venturescope/planner/agent.py
Planner invocation wrapper: src/venturescope/planner/__init__.py
Outer graph integration boundary: src/venturescope/conversation/graph.py
Planner state contract: src/venturescope/planner/schema.py
Acquisition helpers: src/venturescope/planner/acquisition.py
Planner prompt: src/venturescope/planner/prompts.py
Planner decision tests: tests/planner/test_planner_decisions.py
Acquisition tests: tests/planner/test_acquisition.py