
API Creation - Part 2 & Frontend Setup

Now that we know how the contents of a video are stored in the Supabase database via the API, it's time to build the actual RAG pipeline, fetch responses from the database, and start creating the UI for it.

Setting up the query-document API route and service


The queryDocument function is the heart of the document retrieval and question-answering service in our application. It processes a user query within a conversation, retrieves relevant documents using embeddings and vector search, incorporates chat history for context, and generates an answer using a Large Language Model (LLM). The function supports streaming responses (for real-time updates) and regular responses, and stores both user queries and AI answers in a Supabase database.
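
Before wiring up the route, it helps to pin down the contract. A request to this endpoint carries a conversation id, the ids of the documents to search, and the user's question; the response is streamed back as Server-Sent Events. The field names below come from the service code later in this section; the values are placeholders:

    // Shape of the JSON body the query endpoint expects (values are placeholders)
    const requestBody = {
      conversationId: "<conversation uuid>", // which conversation this question belongs to
      documentIds: ["<document uuid>"],      // which stored documents to search
      query: "What is the video about?"      // the user's question
    };
    // The route replies with SSE frames, one per answer chunk:
    //   data: {"content":"partial answer text"}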

  1. Inside the routes folder, create a file queryDocumentRoutes.js where you will define the route.

    queryDocumentRoutes.js
    import express from "express";
    import { queryDocument } from "../services/queryDocumentService.js"; // The actual retrieval logic we will implement next

    const router = express.Router();

    // Handle the query-document route
    router.post('/', async (req, res) => {
      try {
        const result = await queryDocument(req);
        // The service returns a Readable stream, so reply with Server-Sent Events
        res.setHeader('Content-Type', 'text/event-stream');
        res.setHeader('Cache-Control', 'no-cache');
        res.setHeader('Connection', 'keep-alive');
        result.pipe(res);
      } catch (error) {
        console.error("Error in queryDocument: ", error);
        res.status(500).json({
          error: "An error occurred during the request."
        });
      }
    });

    export default router;
  2. Now import the route in your index.js entry point for the server.

    index.js
    import express from "express";
    import cors from "cors";
    import storeDocumentRoute from "./routes/storeDocumentRoutes.js";
    import queryDocumentRoute from './routes/queryDocumentRoutes.js'

    const app = express();

    // Middleware to parse JSON request bodies
    app.use(express.json())

    // Configure and use the CORS middleware
    const corsOptions = {
      origin: "http://localhost:5173",
      methods: ["GET", "POST", "PUT", "DELETE"],
      allowedHeaders: ["Content-Type", "Authorization"]
    }
    app.use(cors(corsOptions))

    app.use("/store-document", storeDocumentRoute)
    app.use('/query-document', queryDocumentRoute)

    app.listen(7004, () => {
      console.log('Server Running on PORT 7004');
    });

    export default app;

Now we will start with the logic behind the RAG pipeline; I will explain each part one by one.

  1. Imports:

    import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts'
    import { HumanMessage, AIMessage } from '@langchain/core/messages'
    import { createStuffDocumentsChain } from 'langchain/chains/combine_documents'
    import { createRetrievalChain } from 'langchain/chains/retrieval'
    import { createHistoryAwareRetriever } from 'langchain/chains/history_aware_retriever'
    import { Readable } from 'stream'
    • ChatPromptTemplate, MessagesPlaceholder: For dynamic prompt construction.
    • HumanMessage, AIMessage: For representing chat history.
    • createStuffDocumentsChain, createRetrievalChain: For building the retrieval and QA chains.
    • createHistoryAwareRetriever: For making retrieval history-aware.
    • Readable: For streaming responses.
  2. Input Extraction

    //Get the conversation id and document id from the frontend along with the input query.
    const { conversationId, documentIds, query } = req.body;
    const supabase = createSupabaseClient();

    This tells us what the user is asking and which documents to search.

  3. Store User Query

    await supabase.from('conversation_messages').insert({
    conversation_id: conversationId,
    role: 'user',
    content: query
    });

    This stores the user's question so it becomes part of the conversation history.

  4. Embeddings and LLM Initialization

    const embeddings = new GoogleGenerativeAIEmbeddings({...})
    const llm = new ChatGoogleGenerativeAI({...})

    This helps prepare models for understanding queries and generating answers. Embeddings will help to find relevant documents; LLM generates human-like responses.

  5. Vector Store Setup

    const vectorStoreConfig = { ... }
    if (documentIds && Array.isArray(documentIds) && documentIds.length > 0) {
    vectorStoreConfig.filter = { document_id: documentIds }
    }
    const vectorStore = new SupabaseVectorStore(embeddings, vectorStoreConfig)

    Here we set up vector search over the stored documents, optionally filtered to the given document IDs, so similarity search returns only the most relevant chunks.

  6. Prompt Construction

    const contextSystemPrompt = 'Given a chat history and latest user question...'
    const prompt = ChatPromptTemplate.fromMessages([
    ['system', contextSystemPrompt],
    new MessagesPlaceholder('chat_history'),
    ['human', '{input}']
    ])

    This prompt tells the model how to handle the question and its surrounding context; a well-crafted prompt keeps the answers accurate and context-aware.

  7. Creating a History-Aware Retriever

    const retriever = vectorStore.asRetriever()
    const historyAwareRetriever = await createHistoryAwareRetriever({
    llm,
    retriever,
    rephrasePrompt: prompt
    })

    Creating this history-aware retriever finds relevant documents while taking the chat history into account, which improves accuracy when users refer back to earlier points. For example, a follow-up like "What did he say about it?" needs the earlier messages to resolve what "it" refers to before searching.

  8. Answer Generation Chain

    const systemPrompt = 'You are an assistant for question answering tasks...'
    const qAChain = await createStuffDocumentsChain({...})
    const ragChain = await createRetrievalChain({
    retriever: historyAwareRetriever,
    combineDocsChain: qAChain
    })

    Now we set up a pipeline that: i. finds relevant documents, and ii. passes them to the LLM for answering. This combines searching and answering into one smooth process, which we call Retrieval-Augmented Generation (RAG); see the short invoke sketch after this list for the shape of what the chain returns.

  9. Storing AI Answer

    await supabase.from('conversation_messages').insert({
    conversation_id: conversationId,
    role: 'assistant',
    content: response.answer
    })

    We save the AI’s response to maintain a complete, searchable conversation history.
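
Before looking at the complete service, here is a minimal sketch of a single, non-streaming call through the finished chain. The invoke call and the input, chat_history, and answer fields follow LangChain's createRetrievalChain interface; the question text is only an illustration:

    // Hypothetical one-off (non-streaming) call, assuming ragChain was built as above
    const result = await ragChain.invoke({
      input: 'What is the main topic of the video?', // the user's question
      chat_history: []                               // prior HumanMessage/AIMessage objects; empty for a fresh chat
    });
    console.log(result.answer); // the generated answer, grounded in the retrieved chunks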


    queryDocumentService.js
    import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase'
    import { createSupabaseClient } from '../helpers/supabaseClient.js'
    import {
      ChatGoogleGenerativeAI,
      GoogleGenerativeAIEmbeddings
    } from '@langchain/google-genai'
    import {
      ChatPromptTemplate,
      MessagesPlaceholder
    } from '@langchain/core/prompts'
    import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";
    import { createStuffDocumentsChain } from 'langchain/chains/combine_documents';
    import { createRetrievalChain } from "langchain/chains/retrieval";
    import { HumanMessage, AIMessage } from '@langchain/core/messages'
    import { Readable } from 'stream'

    export async function queryDocument (req) {
      try {
        const { conversationId, query, documentIds } = req.body
        const supabase = createSupabaseClient()

        // Store the user query
        await supabase.from('conversation_messages').insert({
          conversation_id: conversationId,
          role: 'user',
          content: query
        });

        // Grab the conversation history (most recent 14 messages)
        const { data: previousMessages } = await supabase
          .from('conversation_messages')
          .select('*')
          .eq('conversation_id', conversationId)
          .order('created_at', { ascending: false })
          .limit(14)

        // Initialise the embedding model and the LLM
        const embeddings = new GoogleGenerativeAIEmbeddings({
          model: 'embedding-001', // Safe default embedding model
          apiKey: process.env.GEMINI_API_KEY
        });
        const llm = new ChatGoogleGenerativeAI({
          model: 'gemini-2.0-flash',
          apiKey: process.env.GEMINI_API_KEY,
          streamUsage: true
        });

        // Initialise the vector store, restricted to the requested documents
        const vectorStoreConfig = {
          client: supabase,
          tableName: 'embedded_documents',
          queryName: 'match_documents'
        }
        if (documentIds && Array.isArray(documentIds) && documentIds.length > 0) {
          vectorStoreConfig.filter = { document_id: documentIds }
        }
        const vectorStore = new SupabaseVectorStore(embeddings, vectorStoreConfig);

        // Prompt that rewrites the question based on the chat history
        const contextSystemPrompt =
          'Given a chat history and latest user question ' +
          'which might reference context in the chat history, ' +
          'formulate a standalone question which can be understood ' +
          'without the chat history. Do NOT answer the question, ' +
          'just reformulate it if needed and otherwise return it as is.'

        // A set of instructions on how to rewrite the question
        const prompt = ChatPromptTemplate.fromMessages([
          ['system', contextSystemPrompt],
          new MessagesPlaceholder('chat_history'),
          ['human', '{input}']
        ]);

        // Retrieve the documents with awareness of the chat history
        const retriever = vectorStore.asRetriever()
        const historyAwareRetriever = await createHistoryAwareRetriever({
          llm,
          retriever,
          rephrasePrompt: prompt
        });

        // Pass the relevant documents to the LLM
        const systemPrompt =
          'You are an assistant for question answering tasks. ' +
          'Use the following pieces of retrieved context to answer ' +
          'the question. ' +
          '\n\n' +
          '{context}'
        const qaPrompt = ChatPromptTemplate.fromMessages([
          ['system', systemPrompt],
          new MessagesPlaceholder('chat_history'),
          ['human', '{input}']
        ]);
        const qAChain = await createStuffDocumentsChain({
          llm,
          prompt: qaPrompt
        });
        const ragChain = await createRetrievalChain({
          retriever: historyAwareRetriever,
          combineDocsChain: qAChain
        });

        // Rebuild the chat history in chronological order for the prompt
        const history = (previousMessages || [])
          .reverse()
          .map(msg =>
            msg.role === 'user'
              ? new HumanMessage(msg.content)
              : new AIMessage(msg.content)
          );

        // Stream the answer chunks back as Server-Sent Events
        const response = await ragChain.stream({
          input: query,
          chat_history: history
        });
        let fullAnswer = ''
        const responseStream = new Readable({
          async read () {
            for await (const chunk of response) {
              if (chunk.answer) {
                fullAnswer += chunk.answer
                this.push(`data: ${JSON.stringify({ content: chunk.answer })}\n\n`)
              }
            }
            // Store the complete AI answer once streaming has finished
            await supabase.from('conversation_messages').insert({
              conversation_id: conversationId,
              role: 'assistant',
              content: fullAnswer
            });
            this.push(null)
          }
        });
        return responseStream;
      } catch (error) {
        console.error('❌ queryDocument Error:', error.message)
        throw error
      }
    }
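
With the service in place, you can already exercise the endpoint before any UI exists. The snippet below is a hypothetical test script (the IDs are placeholders you would copy from your own Supabase tables); it POSTs a question to the route we registered on port 7004 and prints the SSE frames as they arrive, using the fetch API available in Node 18+ or a browser console:

    // test-query.mjs — hypothetical snippet for manually testing /query-document
    const res = await fetch('http://localhost:7004/query-document', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        conversationId: '<conversation uuid>', // placeholder: an id from your conversations table
        documentIds: ['<document uuid>'],      // placeholder: ids from your documents table
        query: 'Summarise the video in two sentences.'
      })
    });

    // Read the Server-Sent Events stream chunk by chunk
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      console.log(decoder.decode(value)); // each frame looks like: data: {"content":"..."}
    }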

Now that we have all our APIs ready, we will start setting up the frontend. We will be using Vite for the frontend.

  1. Inside the root folder in your project, create a new vite project.

    Terminal window
    npm create vite@latest ./
  2. It will ask whether you want to remove the existing files or ignore them and continue. Choose Ignore files and continue, then use the following configuration:

    Terminal window
    Select a Framework: React
    Select a Variant: TypeScript

    and then it will create a project.

  3. Update your package.json file so that it contains all the dependencies we will be using:

    package.json
    {
      "name": "youtube-rag",
      "private": true,
      "version": "0.0.0",
      "type": "module",
      "scripts": {
        "dev": "vite",
        "build": "tsc -b && vite build",
        "lint": "eslint .",
        "preview": "vite preview"
      },
      "dependencies": {
        "@google/generative-ai": "^0.24.0",
        "@langchain/community": "^0.3.41",
        "@langchain/core": "^0.3.46",
        "@langchain/google-genai": "^0.2.4",
        "@langchain/textsplitters": "^0.1.0",
        "@supabase/supabase-js": "^2.49.4",
        "@tailwindcss/vite": "^4.1.4",
        "dotenv": "^16.5.0",
        "langchain": "^0.3.23",
        "nodemon": "^3.1.9",
        "react": "^19.0.0",
        "react-dom": "^19.0.0",
        "tailwindcss": "^4.1.4",
        "uuid": "^11.1.0"
      },
      "devDependencies": {
        "@eslint/js": "^9.22.0",
        "@types/react": "^19.0.10",
        "@types/react-dom": "^19.0.4",
        "@vitejs/plugin-react": "^4.3.4",
        "eslint": "^9.22.0",
        "eslint-plugin-react-hooks": "^5.2.0",
        "eslint-plugin-react-refresh": "^0.4.19",
        "globals": "^16.0.0",
        "typescript": "~5.7.2",
        "typescript-eslint": "^8.26.1",
        "vite": "^6.3.1",
        "vite-plugin-environment": "^1.1.3"
      }
    }

    After making the changes, run npm install --legacy-peer-deps and then npm run dev to start the frontend dev server.
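
    That is, from the project root:

    Terminal window
    npm install --legacy-peer-deps
    npm run dev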

  4. The project structure will look somewhat like this:

    • public/
      • logo.svg
    • node_modules/
      • module1
      • module2
    • server/
      • node_modules/
      • index.js
      • .env
      • package.json
      • package-lock.json
    • src/
      • api/
      • assets/
      • App.tsx
      • App.css
    • package.json
    • vite.config.js
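
    App.tsx below imports a createSupabaseClient helper from src/api/api. That file is not shown in this part, so here is a minimal sketch of what it could look like, assuming you expose your Supabase URL and anon key to Vite as VITE_SUPABASE_URL and VITE_SUPABASE_ANON_KEY in a .env file (both names are placeholders; use whatever your Part 1 setup defined):

    api.ts
    import { createClient } from "@supabase/supabase-js";

    // Create a Supabase client from the Vite environment variables
    export const createSupabaseClient = () =>
      createClient(
        import.meta.env.VITE_SUPABASE_URL,
        import.meta.env.VITE_SUPABASE_ANON_KEY
      );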


    Inside App.tsx, add this:

    App.tsx
    import { useState } from "react";
    import { v4 as uuidv4 } from "uuid";
    import { createSupabaseClient } from "./api/api";

    // Message shape we will use once the chat is wired up in the next section
    interface Message {
      role: "user" | "assistant";
      content: string;
    }

    const App = () => {
      const [url, setUrl] = useState("");
      const [loading, setLoading] = useState(false);

      const handleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        try {
          setLoading(true);
          // Generate ids for the conversation and the video document
          const convId = uuidv4();
          const docId = uuidv4();
          // Create the conversation
          const supabase = createSupabaseClient();
          await supabase.from("conversations").insert({
            id: convId,
          });
          // Create the document row
          await supabase.from("documents").insert({
            id: docId,
          });
          // Link conversation and document
          await supabase.from("conversation_documents").insert({
            conversation_id: convId,
            document_id: docId
          });
          // Store the document via the backend (the server listens on port 7004)
          await fetch("http://localhost:7004/store-document", {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
            },
            body: JSON.stringify({ url, documentId: docId })
          });
        } catch (error) {
          console.error(error);
        } finally {
          setLoading(false);
        }
      };

      return (
        <div className="min-h-screen bg-gradient-to-br from-gray-900 via-gray-800 to-gray-900 flex flex-col items-center justify-center text-white">
          <div className="bg-gray-800 shadow-lg rounded-lg p-8 w-full max-w-md">
            <h1 className="text-4xl font-extrabold text-center mb-6 text-indigo-400">
              AI Chat with YouTube
            </h1>
            <form onSubmit={handleSubmit} className="space-y-4">
              <input
                type="text"
                placeholder="Drop a YouTube URL here..."
                value={url}
                onChange={(e) => setUrl(e.target.value)}
                className="w-full px-4 py-2 rounded-lg bg-gray-700 text-white placeholder-gray-400 focus:outline-none focus:ring-2 focus:ring-indigo-500"
              />
              <button
                type="submit"
                disabled={loading}
                className={`w-full py-2 rounded-lg font-semibold text-white cursor-pointer ${
                  loading
                    ? "bg-indigo-300 cursor-not-allowed"
                    : "bg-indigo-500 hover:bg-indigo-600"
                }`}
              >
                {loading ? "Processing..." : "Submit"}
              </button>
            </form>
            {loading && (
              <div className="mt-4 flex justify-center">
                <div className="loading-spinner border-t-4 border-indigo-500 rounded-full w-8 h-8 animate-spin"></div>
              </div>
            )}
          </div>
        </div>
      );
    };

    export default App;

    This will be your form, which takes the user input and makes the backend API call. You can refer to the Tailwind CSS docs to initialise Tailwind CSS in your project; one possible Vite setup is sketched below.
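
    For reference, with the @tailwindcss/vite plugin already listed in the package.json above, a minimal sketch of the setup is to register the plugin in your Vite config and import Tailwind in your CSS entry file. Treat this as a sketch under those assumptions and follow the official docs if anything differs for your versions:

    vite.config.js
    import { defineConfig } from "vite";
    import react from "@vitejs/plugin-react";
    import tailwindcss from "@tailwindcss/vite";

    // Register the React and Tailwind CSS v4 plugins
    export default defineConfig({
      plugins: [react(), tailwindcss()],
    });

    Then add @import "tailwindcss"; at the top of your CSS entry file (for example App.css or index.css).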


In the next section, we’ll:

  • Integrate the frontend with the backend to get the RAG chat working.
  • Deploy the Application.

If you want to know more about this, do check out our video: