Cyber Risk & Security Boundaries
src/constants/cyberRiskInstruction.ts:24
This instruction is injected near the top of the system prompt (inlined into the identity section) and uses a precise taxonomy of allowed vs. disallowed security activities rather than a blanket ban. By explicitly naming legitimate contexts (pentesting, CTFs, security research) alongside prohibited activities (DoS, supply chain compromise), it creates a nuanced decision boundary the model can apply consistently. The 'dual-use' framing for tools like C2 frameworks mirrors real-world security policy: the same tool is acceptable or not depending on authorization context, which the model is taught to evaluate.
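The decision boundary described above can be pictured as a small lookup with one context-dependent branch. The sketch below is illustrative only: the actual instruction is natural-language prompt text, not executable policy, and every name here (`evaluate`, `SecurityRequest`, the category strings, the `check-authorization` verdict) is a hypothetical stand-in chosen for this example. The fallback behavior for unlisted activities is also an assumption, not something the source states.

```typescript
// Illustrative sketch only. The real instruction is prose in the system prompt;
// all identifiers and category strings below are hypothetical.
type Verdict = "assist" | "refuse" | "check-authorization";

// Categories taken from the description: named legitimate contexts,
// named prohibited activities, and dual-use tooling.
const LEGITIMATE_CONTEXTS = new Set(["pentesting", "ctf", "security-research"]);
const PROHIBITED_ACTIVITIES = new Set(["dos", "supply-chain-compromise"]);
const DUAL_USE_TOOLS = new Set(["c2-framework"]);

interface SecurityRequest {
  activity: string;     // what the user is asking for
  context?: string;     // stated context, if any
  authorized?: boolean; // whether authorization is evident from the conversation
}

function evaluate(req: SecurityRequest): Verdict {
  // Prohibited activities are refused regardless of the stated context.
  if (PROHIBITED_ACTIVITIES.has(req.activity)) {
    return "refuse";
  }
  // Dual-use tools: the same tool is acceptable or not depending on authorization.
  if (DUAL_USE_TOOLS.has(req.activity)) {
    return req.authorized === true ? "assist" : "check-authorization";
  }
  // Other security work inside an explicitly named legitimate context is allowed.
  if (req.context !== undefined && LEGITIMATE_CONTEXTS.has(req.context)) {
    return "assist";
  }
  // Assumption: anything else falls back to evaluating authorization context.
  return "check-authorization";
}
```

For instance, under these assumptions `evaluate({ activity: "c2-framework", authorized: true })` yields `"assist"`, while the same request without evident authorization yields `"check-authorization"`. That is the dual-use distinction expressed as a single branch instead of a blanket ban.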
Appears in use cases
This prompt is a step in curated flows that show how pieces of Claude Code connect for real tasks.