Skip to content

Day 12 — Guardrails & Responsible AI #15

@PunithVT

Description

@PunithVT

Day 12. Once an agent is in front of real users, "guardrails" stop being theory — every output has to be safe, on-policy, and free of PII leakage. Picking this up now so it's wired in before scale, not after.

Topics to cover:

  1. Bedrock Guardrails — content filters, denied topics, PII redaction, contextual grounding checks
  2. Prompt-level guardrails — system prompt patterns, refusal templates, jailbreak resistance
  3. Output filtering — post-generation safety checks, regex / model-as-judge, escalation paths
  4. PII handling — detection, masking, audit logs, GDPR / DPDP basics
  5. Eval for safety — building a red-team test set, LLM-as-judge for harmful outputs

Plan: Chandana on Bedrock Guardrails + PII / KMS integration, me on prompt-level + output-level guardrails + safety eval, shipping a guardrailed version of one of our existing agents.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions