06: Input and Output Guardrails

In this section, we will add input and output guardrails to the chat API. This is an important step: it helps prevent the model from generating inappropriate or harmful content, from leaking sensitive information, and from answering queries that are not relevant to the conversation.


Add Input Guardrail Patterns in the Chat API Route

app/api/chat/route.ts

const BLOCKED_PATTERNS = [
  // Sensitive financial / identity terms
  /bank/i,
  /account number/i,
  /ifsc/i,
  /password/i,
  /otp/i,
  /credit card/i,
  /debit card/i,
  /ssn/i,
  /aadhar/i,
  /pan card/i,
  // Personal-information probes and prompt-injection attempts
  /who is/i,
  /ignore previous/i,
  /system prompt/i,
  /you are chatgpt/i,
];

function violatesInputPolicy(query: string): boolean {
  return BLOCKED_PATTERNS.some((p) => p.test(query));
}

const SENSITIVE_CONTEXT_PATTERNS = [
  /\d{12,16}/g,         // card-like numbers
  /\d{9,12}/g,          // ids
  // "g" flag is required so replace() redacts every occurrence, not just the first
  /account number/gi,
  /ifsc/gi,
  /password/gi,
  /secret/gi,
  /token/gi,
];

function sanitizeContext(text: string): string {
  let sanitized = text;

  for (const pattern of SENSITIVE_CONTEXT_PATTERNS) {
    sanitized = sanitized.replace(pattern, "[REDACTED]");
  }

  return sanitized;
}

const OUTPUT_BLOCK_PATTERNS = [
  // Loosened to prevent false positives while keeping security intent
  /account number/i,
  /ifsc/i,
  /password/i,
  /credit card/i,
  // Refined regex for credit card numbers to avoid matching random long numbers
  /(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|6(?:011|5[0-9]{2})[0-9]{12}|(?:2131|1800|35\d{3})\d{11})/,
];

function violatesOutputPolicy(text: string): boolean {
  return OUTPUT_BLOCK_PATTERNS.some((p) => p.test(text));
}
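Before wiring these helpers into the route, it helps to sanity-check them in isolation. Below is a minimal standalone sketch; the patterns are a small subset duplicated from above (with the `\d` escapes and `g` flag in place) so the snippet runs on its own:

```typescript
// Standalone sanity check for the guardrail helpers defined above.
// A reduced pattern set is duplicated here so the snippet is self-contained.
const INPUT_BLOCKLIST = [/password/i, /otp/i, /ignore previous/i];

function violatesInputPolicy(query: string): boolean {
  return INPUT_BLOCKLIST.some((p) => p.test(query));
}

const CONTEXT_PATTERNS = [
  /\d{12,16}/g, // card-like digit runs
  /password/gi, // "g" flag so every occurrence is replaced
];

function sanitizeContext(text: string): string {
  let sanitized = text;
  for (const pattern of CONTEXT_PATTERNS) {
    sanitized = sanitized.replace(pattern, "[REDACTED]");
  }
  return sanitized;
}

console.log(violatesInputPolicy("what is my otp?"));    // true
console.log(violatesInputPolicy("summarize my notes")); // false
console.log(sanitizeContext("password is x, card 4111111111111111"));
// "[REDACTED] is x, card [REDACTED]"
```

Note that without the `g` flag, `String.prototype.replace` with a regex only replaces the first match, which would leave later occurrences of sensitive text in the context.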
  

Add the following check before saving the user message to the database, so that the input guardrail runs first and only the sanitized context is passed when the LLM call is made.

app/api/chat/route.ts
if (violatesInputPolicy(query)) {
  return new Response(
    JSON.stringify({
      error: "This question is not allowed in your Second Brain.",
    }),
    { status: 403 }
  );
}

const context = sanitizeContext(
  docs
    .map(
      (doc, i) =>
        `Source ${i + 1} (${citations[i].filePath}, chunk ${citations[i].chunkIndex}):\n${doc}`
    )
    .join("\n\n")
);
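The route above defines `violatesOutputPolicy` but where you call it depends on how the answer is returned. One plausible pattern, assuming you have the final (non-streamed) answer text in hand, is a small wrapper that swaps a violating answer for a safe fallback. `enforceOutputPolicy` and the fallback string are illustrative names, not part of the original route, and the pattern list here is a reduced subset for the demo:

```typescript
// Hypothetical wrapper around the output guardrail: if the generated
// answer trips a blocked pattern, return a safe fallback instead.
const OUTPUT_BLOCK_PATTERNS = [/password/i, /ifsc/i]; // subset, for the demo

function violatesOutputPolicy(text: string): boolean {
  return OUTPUT_BLOCK_PATTERNS.some((p) => p.test(text));
}

const SAFE_FALLBACK = "This request is not permitted.";

function enforceOutputPolicy(answer: string): string {
  return violatesOutputPolicy(answer) ? SAFE_FALLBACK : answer;
}

console.log(enforceOutputPolicy("Your password is hunter2")); // fallback
console.log(enforceOutputPolicy("Paris is in the Context.")); // unchanged
```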

Create System Prompt

This System Prompt will be given to the LLM by default on every query.

app/api/chat/route.ts
const systemPrompt = `
ROLE: Private Knowledge Assistant

RULES (NON-NEGOTIABLE):
1. Answer ONLY using the provided Context.
2. If the answer is not explicitly present, reply EXACTLY:
   "I don't have that in my Second Brain yet."
3. Do NOT infer, guess, or summarize from external knowledge.
4. Do NOT reveal personal, financial, or sensitive information.
5. If the question violates rules, respond:
   "This request is not permitted."
6. If the Context contains a YouTube URL, include it in the answer.
7. Do NOT invent video links.
8. Do NOT summarize video content unless explicitly written in Context.
9. If you find an image (using markdown image syntax) in the Context, include it in your answer.
10. Use the provided "**AI Description**" to answer questions about the image.
11. Do NOT describe the image unless asked, but ALWAYS show it if relevant.

FAILURE TO FOLLOW THESE RULES IS A SECURITY BREACH.`.trim();
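How the system prompt and sanitized context are combined depends on your LLM client, but a common shape is an OpenAI-style messages array where the context is appended to the system message. `buildMessages` below is an illustrative helper, not code from the route:

```typescript
// Illustrative helper: combine the system prompt, sanitized context, and
// the user's query into an OpenAI-style chat messages array.
type ChatMessage = { role: "system" | "user"; content: string };

function buildMessages(
  systemPrompt: string,
  context: string,
  query: string
): ChatMessage[] {
  return [
    // Context rides along with the system prompt so the rules and the
    // retrieved sources arrive together.
    { role: "system", content: `${systemPrompt}\n\nContext:\n${context}` },
    { role: "user", content: query },
  ];
}

const messages = buildMessages(
  "ROLE: Private Knowledge Assistant",
  "Source 1 (notes.md, chunk 0):\nRAG stands for retrieval-augmented generation.",
  "What did I save about RAG?"
);
console.log(messages.length); // 2
```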

Next Steps

In the next section, we’ll:

Parse Images

Learn how to parse images into our knowledge base and how to retrieve them as part of the LLM call.

If you want to know more about this, check out our video guide: