Day 12. Once an agent is in front of real users, "guardrails" stop being theory — every output has to be safe, on-policy, and free of PII leakage. Picking this up now so it's wired in before scale, not after.
Topics to cover:
- Bedrock Guardrails — content filters, denied topics, PII redaction, contextual grounding checks
- Prompt-level guardrails — system prompt patterns, refusal templates, jailbreak resistance
- Output filtering — post-generation safety checks, regex / model-as-judge, escalation paths
- PII handling — detection, masking, audit logs, GDPR / DPDP basics
- Eval for safety — building a red-team test set, LLM-as-judge for harmful outputs
Plan: Chandana on Bedrock Guardrails + PII / KMS integration, me on prompt-level + output-level guardrails + safety eval, shipping a guardrailed version of one of our existing agents.
Day 12. Once an agent is in front of real users, "guardrails" stop being theory — every output has to be safe, on-policy, and free of PII leakage. Picking this up now so it's wired in before scale, not after.
Topics to cover:
Plan: Chandana on Bedrock Guardrails + PII / KMS integration, me on prompt-level + output-level guardrails + safety eval, shipping a guardrailed version of one of our existing agents.