Action Safety & Reversibility
src/constants/prompts.ts:255
This section introduces a 'reversibility and blast radius' framework that gives the model a principled way to evaluate risk without exhaustively enumerating every dangerous action. The asymmetric cost framing ('cost of pausing is low, cost of unwanted action is high') creates a strong prior toward caution. The explicit note that 'approving once does NOT mean approving in all contexts' prevents the model from over-generalizing permissions, a subtle but critical safety boundary. The concrete examples serve as few-shot calibration for what 'risky' means in practice, and the closing 'measure twice, cut once' aphorism reinforces the spirit of the rules.
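One way to picture the framework is as a small decision rule plus an approval record that is scoped to a context. The sketch below is illustrative only: the type names, the `needsConfirmation` rule, and the `ApprovalLog` class are all hypothetical and are not taken from `src/constants/prompts.ts`, which conveys this guidance as prompt text rather than code.

```typescript
// Hypothetical types for illustration; none of these names come from Claude Code.
type Reversibility = "easy" | "hard" | "irreversible";
type BlastRadius = "local" | "shared" | "external";

interface ProposedAction {
  description: string;
  reversibility: Reversibility;
  blastRadius: BlastRadius;
}

// Asymmetric cost: pausing is cheap and an unwanted action is expensive,
// so only actions that are BOTH easy to undo AND purely local skip confirmation.
function needsConfirmation(action: ProposedAction): boolean {
  return !(action.reversibility === "easy" && action.blastRadius === "local");
}

// Approvals are recorded per (action kind, context) pair, so approving an
// action once does not approve the same kind of action in another context.
class ApprovalLog {
  private approved = new Set<string>();

  private key(actionKind: string, context: string): string {
    return JSON.stringify([actionKind, context]);
  }

  approve(actionKind: string, context: string): void {
    this.approved.add(this.key(actionKind, context));
  }

  isApproved(actionKind: string, context: string): boolean {
    return this.approved.has(this.key(actionKind, context));
  }
}
```

Under these assumptions, approving `"git push"` for a feature branch leaves `isApproved("git push", "main")` false, which mirrors the 'approving once does NOT mean approving in all contexts' rule.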
Techniques Used
- Destructive operations: deleting files/branches, dropping database tables, killing processes, rm -rf, overwriting uncommitted changes
- Hard-to-reverse operations: force-pushing (can also overwrite upstream), git reset --hard, amending published commits, removing or downgrading packages/dependencies, modifying CI/CD pipelines
- Actions visible to others or that affect shared state: pushing code, creating/closing/commenting on PRs or issues, sending messages (Slack, email, GitHub), posting to external services, modifying shared infrastructure or permissions
- Uploading content to third-party web tools (diagram renderers, pastebins, gists) publishes it - consider whether it could be sensitive before sending, since it may be cached or indexed even if later deleted.
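To show how the examples above group into categories, here is a minimal sketch that tags a shell command with the categories it appears to fall under. The `RiskCategory` names, the `RULES` table, and the `classify` function are assumptions made for illustration. The patterns cover only a handful of the listed examples, and simple pattern matching cannot capture context, which is why the prompt relies on the model's judgment instead of a fixed list.

```typescript
// Hypothetical category names mirroring the first three bullets above.
type RiskCategory = "destructive" | "hard-to-reverse" | "visible-to-others";

interface RiskRule {
  category: RiskCategory;
  pattern: RegExp;
}

// A deliberately small, incomplete rule table covering a few listed examples.
const RULES: RiskRule[] = [
  { category: "destructive", pattern: /\brm\s+-rf\b/ },
  { category: "destructive", pattern: /\bdrop\s+table\b/i },
  { category: "destructive", pattern: /\bgit\s+branch\s+-D\b/ },
  { category: "hard-to-reverse", pattern: /\bgit\s+push\b.*(--force\b|\s-f\b)/ },
  { category: "hard-to-reverse", pattern: /\bgit\s+reset\b.*--hard\b/ },
  { category: "visible-to-others", pattern: /\bgit\s+push\b/ },
];

// Returns every category whose pattern matches, without duplicates,
// in the order the categories first appear in RULES.
function classify(command: string): RiskCategory[] {
  const found: RiskCategory[] = [];
  for (const rule of RULES) {
    if (rule.pattern.test(command) && found.indexOf(rule.category) === -1) {
      found.push(rule.category);
    }
  }
  return found;
}
```

With this table, `classify("git push --force origin main")` yields both `"hard-to-reverse"` and `"visible-to-others"`, while `classify("ls -la")` yields an empty array. An empty result here means only that no rule matched, not that the command is safe.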