> ## Documentation Index
> Fetch the complete documentation index at: https://opensre.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Interactive shell action policy

# Interactive Shell Action Policy (ADR)

## Status

Superseded — Jun 18, 2026. The declarative-rule-pack deterministic mapper and
the regex-based planner postprocessing overrides described in the original
decision have been removed. See "Decision (current): LLM is the sole tool
selector" below. The original decision is retained for historical context.

## Context

The interactive-shell action policy had grown through layered heuristics in
single modules: a regex/keyword deterministic mapper inferred tools from
free-form text, and planner postprocessing rewrote the model's chosen actions
with more regex. These heuristics competed with the LLM and caused
misclassifications (e.g. "investigate a sample test alert?" being treated as an
informational question instead of running the sample alert), and they were a
recurring source of precedence drift.

## Decision (current): The shell action agent is the sole tool selector

1. There is no regex/keyword intent inference. Non-command turns are
   selected entirely by the shell action agent via native tool-calling.
2. Tool selection is driven by the action-agent system prompt
   (`.../orchestration/action_system_prompt.py`) and the per-tool descriptions
   in the tool catalog (`.../orchestration/tools/*`). Keep both precise — they
   are the only selection signal.
3. The action path does not post-hoc rewrite the model's tool calls. Tool calls
   execute as first-class `AgentTool`s through the shared `core.runtime`
   tool-calling loop; argument shape and availability are enforced by the
   AgentTool runtime contract and per-tool gates.
4. When the action-agent prompt overflows the context window, the turn falls
   through to a conversational reply rather than guessing an action. When the
   action-agent LLM itself is unavailable, the REPL renders and persists a
   failed assistant turn so `/resume` can show the outage.
5. `command_dispatch/detection.py` remains terminal-UI policy only: spinner
   suppression and exclusive-stdin gating for literal command text. It must
   never infer intent from natural language and must not become an action
   execution shortcut.

## What this means for changes

* To change how a phrasing maps to a tool, edit the action-agent system prompt and/or
  the relevant tool description — never add a regex.
* To add a new tool, add it to the tool catalog with a clear, self-describing
  `description` and `input_schema`; the action agent selects it from that text
  and receives it as an AgentTool.
* Live turn scenarios under
  `interactive_shell/harness/tests/scenarios/` are the regression
  surface for action-agent behavior. Deterministic scenarios (`intent_class:
  deterministic`) assert literal command dispatch only.

## Original decision (historical, superseded)

1. Deterministic mapping was split into declarative rule packs with one explicit precedence table.
2. Rule matching windows were named typed strategies instead of inline numeric slices.
3. Planner postprocessing ran as pure transforms over a typed `PlannerState`.
4. Fail-closed policy transforms and normalization transforms were registered separately and executed in one ordered list.
5. Legacy planner-result tuple compatibility was collapsed behind a single adapter.
6. Planner contracts included policy-trace artifacts to detect silent precedence drift.

## Integration awareness and LLM-driven read-only discovery

Addendum — Jun 18, 2026.

Factual questions about live state (for example "is sentry installed?") are
answered without adding keyword/regex rules. Two complementary mechanisms:

1. Context grounding (not action planning). At REPL boot, `repl_main`
   (`interactive_shell/entrypoint.py`) hydrates
   `session.configured_integrations` from the shared
   `configured_integration_services()` helper in `integrations/catalog.py`
   (the same source the welcome banner uses, so they never diverge). The chat
   assistant prompt (`_build_environment_block` in
   `interactive_shell/chat/cli_agent.py`) lists the configured set as
   facts, letting the model answer directly when state is already known.
2. LLM-driven discovery. The action-agent system prompt
   (`.../orchestration/action_system_prompt.py`) lets the model, at its own
   discretion, emit a read-only discovery action (for example
   `slash_invoke("/integrations", ["list"])` or `["verify"]`) to discover the
   answer instead of deflecting. There is no keyword mapping for this — the LLM
   decides. Safety is provided by the existing execution-tier policy in
   `execution_policy.py` (`resolve_slash_execution_tier`): `/integrations`
   (list/show) is `SAFE` and auto-runs, while `/integrations verify` is
   `ELEVATED` and prompts for confirmation. No fail-closed regex rule is
   involved; the action agent decides whether to emit a discovery action and the
   execution tier governs safety.

### Observe→answer summary loop

Addendum — Jun 18, 2026.

When the action agent runs a read-only discovery command to answer a question (e.g.
the user asks "is sentry installed?" and the model runs `/integrations`), the
raw command output (a verification table) is not a direct answer on its own.
The pipeline now follows up with a short assistant pass that summarizes that
output:

1. Read-only discovery slash commands stash a compact text view of what they
   found on `session.last_command_observation`
   (`_record_integrations_observation` in
   `interactive_shell/command_registry/integrations.py`).
2. `handle_message_with_agent` resets that field at the start of every action-agent
   turn and, when a discovery command produced an observation and succeeded,
   calls the conversational assistant with `tool_observation=...`
   (inside the handled-turn observation branch in `pipeline.py`). The assistant
   summarizes the output into a direct answer and is instructed not to emit
   further actions.

This only fires when the action-agent tool path executes a read-only discovery command
and records an observation. The pipeline no longer has a pre-agent deterministic
dispatch branch.

Discovery commands also no longer dump validator stack traces into the REPL: a
vendor/config failure during verification (for example a GitHub MCP `401`) is
logged as a one-line warning instead of a full traceback, because
`report_validation_failure` now defaults to `include_traceback=False` while still
capturing the exception to Sentry.

### Auto-launching interactive setup ("can you configure X?")

Addendum — Jun 18, 2026.

When the user asks to configure, connect, set up, or add an integration
("can you configure sentry?", "connect datadog"), the assistant does not just
print a command to copy — it launches the setup wizard for them. The
conversational assistant emits a `run_interactive` action
(`{"action":"run_interactive","command":"/integrations setup <service>"}`, only
`/integrations setup <service>` or `/mcp connect <server>` are allowed). The
model chooses the service; there is no per-vendor hardcoding.

The setup wizard is a child process that needs exclusive stdin, so it cannot run
inline mid-turn (the live prompt is competing for stdin). Instead the action
queues the command via `session.queue_auto_command(...)`, which prefills the next
prompt and marks it for auto-submit. The prompt refresh hook
(`wire_prompt_refresh` in `prompting/prompt_surface.py`) then submits it, so the
command flows through the normal exclusive-stdin turn path of the REPL
(`turn_needs_exclusive_stdin` recognizes `/integrations setup`) — the only
place an interactive child process gets clean stdin. In a non-TTY/scripted
context (no prompt to submit into) the action degrades to telling the user the
command to run.

### Removal of the planning-stage fail-closed safeguard (v0.1)

Addendum — Jun 18, 2026.

The action agent does not deny a turn. Previously, any clause
the old planner could not map to an executable tool — flagged via the `mark_unhandled`
tool, an `UNHANDLED:` text marker, or an unavailable tool call — collapsed the
whole turn into a hard denial that printed *"I couldn't safely decide actions for
that request."* In practice this fired on legitimate input (most often a
conversational question that embedded a quoted, list-style directive such as
*figure out why X is crashing by querying (a) sentry, (b) github, (c) posthog*),
producing a dead end with no safety benefit.

Every terminal action in v0.1 is **read-only**, so an unmatched, ambiguous, or
chatty clause is not a safety risk. The action agent now:

* runs every clause it *can* map to an executable action, and
* lets everything else fall through to the conversational assistant (or simply
  drops a chatty clause in a compound request).

Removed as part of this change: the `denied` field on `ActionPlanningDecision`,
`enforce_plan_fail_closed_policy`, `normalize_terminal_plan`, `render_plan_denied`,
the `mark_unhandled` planner tool, and the `UNHANDLED:` convention. The
`fail_closed`, `has_unhandled_clause`, and `turn.expected_signals` fields were
also removed from turn scenario fixtures, since the oracle never asserted on
them; the fixture `policy` block now carries a single `executes_terminal_action`
`boolean` (true only when a shell action AgentTool is expected to run).

If write/mutating actions are introduced later, gate them with the
execution-stage confirmation policy (`orchestration/execution_policy.py`), **not**
an action-selection denial.
