
API Creation - Part 2 & Frontend Setup

Now that we know how the contents of a video are stored in the Supabase database via the API, it's time to build the actual RAG pipeline, fetch responses from the database, and start creating the UI for it.

Setting up the query-document API route and service


The queryDocument function is the heart of the document retrieval and question-answering service in our application. It processes a user query within a conversation, retrieves relevant documents using embeddings and vector search, incorporates chat history for context, and generates an answer using a Large Language Model (LLM). The function supports streaming responses (for real-time updates) and regular responses, and stores both user queries and AI answers in a Supabase database.
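
Before wiring up the route, it helps to pin down the contract. A request to this endpoint carries a conversation id, the ids of the documents to search, and the user's question; the response is streamed back as Server-Sent Events. The field names below come from the service code later in this section; the values are placeholders:

    // Shape of the JSON body the query endpoint expects (values are placeholders)
    const requestBody = {
      conversationId: "<conversation uuid>", // which conversation this question belongs to
      documentIds: ["<document uuid>"],      // which stored documents to search
      query: "What is the video about?"      // the user's question
    };
    // The route replies with SSE frames, one per answer chunk:
    //   data: {"content":"partial answer text"}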

  1. Inside the routes folder, create a file queryDocumentRoutes.js where you will define the route.

    queryDocumentRoutes.js
    import express from "express";
    import { queryDocument } from "../services/queryDocumentService.js"; // The actual retrieval logic we will implement next

    const router = express.Router();

    // Handle the query-document route
    router.post('/', async (req, res) => {
      try {
        const result = await queryDocument(req);
        // The service returns a Readable stream, so reply with Server-Sent Events
        res.setHeader('Content-Type', 'text/event-stream');
        res.setHeader('Cache-Control', 'no-cache');
        res.setHeader('Connection', 'keep-alive');
        result.pipe(res);
      } catch (error) {
        console.error("Error in queryDocument: ", error);
        res.status(500).json({
          error: "An error occurred during the request."
        });
      }
    });

    export default router;
  2. Now import the route in your index.js entry point for the server.

    index.js
    import express from "express";
    import cors from "cors";
    import storeDocumentRoute from "./routes/storeDocumentRoutes.js";
    import queryDocumentRoute from './routes/queryDocumentRoutes.js'

    const app = express();

    // Middleware to parse JSON request bodies
    app.use(express.json())

    // Configure and use the CORS middleware
    const corsOptions = {
      origin: "http://localhost:5173",
      methods: ["GET", "POST", "PUT", "DELETE"],
      allowedHeaders: ["Content-Type", "Authorization"]
    }
    app.use(cors(corsOptions))

    app.use("/store-document", storeDocumentRoute)
    app.use('/query-document', queryDocumentRoute)

    app.listen(7004, () => {
      console.log('Server Running on PORT 7004');
    });

    export default app;

Now we will start with the logic behind the RAG pipeline; I will explain each part one by one.

  1. Imports:

    import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts'
    import { HumanMessage, AIMessage } from '@langchain/core/messages'
    import { createStuffDocumentsChain } from 'langchain/chains/combine_documents'
    import { createRetrievalChain } from 'langchain/chains/retrieval'
    import { createHistoryAwareRetriever } from 'langchain/chains/history_aware_retriever'
    import { Readable } from 'stream'
    • ChatPromptTemplate, MessagesPlaceholder: For dynamic prompt construction.
    • HumanMessage, AIMessage: For representing chat history.
    • createStuffDocumentsChain, createRetrievalChain: For building the retrieval and QA chains.
    • createHistoryAwareRetriever: For making retrieval history-aware.
    • Readable: For streaming responses.
  2. Input Extraction

    //Get the conversation id and document id from the frontend along with the input query.
    const { conversationId, documentIds, query } = req.body;
    const supabase = createSupabaseClient();

    This tells us what the user is asking and which documents to search.

  3. Store User Query

    await supabase.from('conversation_messages').insert({
    conversation_id: conversationId,
    role: 'user',
    content: query
    });

    This stores the user's question so it becomes part of the conversation history.

  4. Embeddings and LLM Initialization

    const embeddings = new GoogleGenerativeAIEmbeddings({...})
    const llm = new ChatGoogleGenerativeAI({...})

    This helps prepare models for understanding queries and generating answers. Embeddings will help to find relevant documents; LLM generates human-like responses.

  5. Vector Store Setup

    const vectorStoreConfig = { ... }
    if (documentIds && Array.isArray(documentIds) && documentIds.length > 0) {
    vectorStoreConfig.filter = { document_id: documentIds }
    }
    const vectorStore = new SupabaseVectorStore(embeddings, vectorStoreConfig)

    Here we set up vector search over the stored documents, optionally filtered to the given document IDs, so similarity search returns only the most relevant chunks.

  6. Prompt Construction

    const contextSystemPrompt = 'Given a chat history and latest user question...'
    const prompt = ChatPromptTemplate.fromMessages([
    ['system', contextSystemPrompt],
    new MessagesPlaceholder('chat_history'),
    ['human', '{input}']
    ])

    This prompt tells the model how to handle the question and its surrounding context; a well-crafted prompt keeps the answers accurate and context-aware.

  7. Creating a History-Aware Retriever

    const retriever = vectorStore.asRetriever()
    const historyAwareRetriever = await createHistoryAwareRetriever({
    llm,
    retriever,
    rephrasePrompt: prompt
    })

    Creating this history-aware retriever finds relevant documents while taking the chat history into account, which improves accuracy when users refer back to earlier points. For example, a follow-up like "What did he say about it?" needs the earlier messages to resolve what "it" refers to before searching.

  8. Answer Generation Chain

    const systemPrompt = 'You are an assistant for question answering tasks...'
    const qAChain = await createStuffDocumentsChain({...})
    const ragChain = await createRetrievalChain({
    retriever: historyAwareRetriever,
    combineDocsChain: qAChain
    })

    Now we set up a pipeline that: i. finds relevant documents, and ii. passes them to the LLM for answering. This combines searching and answering into one smooth process, which we call Retrieval-Augmented Generation (RAG); see the short invoke sketch after this list for the shape of what the chain returns.

  9. Storing AI Answer

    await supabase.from('conversation_messages').insert({
    conversation_id: conversationId,
    role: 'assistant',
    content: response.answer
    })

    We save the AI’s response to maintain a complete, searchable conversation history.
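
Before looking at the complete service, here is a minimal sketch of a single, non-streaming call through the finished chain. The invoke call and the input, chat_history, and answer fields follow LangChain's createRetrievalChain interface; the question text is only an illustration:

    // Hypothetical one-off (non-streaming) call, assuming ragChain was built as above
    const result = await ragChain.invoke({
      input: 'What is the main topic of the video?', // the user's question
      chat_history: []                               // prior HumanMessage/AIMessage objects; empty for a fresh chat
    });
    console.log(result.answer); // the generated answer, grounded in the retrieved chunks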


    queryDocumentService.js
    import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase'
    import { createSupabaseClient } from '../helpers/supabaseClient.js'
    import {
      ChatGoogleGenerativeAI,
      GoogleGenerativeAIEmbeddings
    } from '@langchain/google-genai'
    import {
      ChatPromptTemplate,
      MessagesPlaceholder
    } from '@langchain/core/prompts'
    import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";
    import { createStuffDocumentsChain } from 'langchain/chains/combine_documents';
    import { createRetrievalChain } from "langchain/chains/retrieval";
    import { HumanMessage, AIMessage } from '@langchain/core/messages'
    import { Readable } from 'stream'

    export async function queryDocument (req) {
      try {
        const { conversationId, query, documentIds } = req.body
        const supabase = createSupabaseClient()

        // Store the user query
        await supabase.from('conversation_messages').insert({
          conversation_id: conversationId,
          role: 'user',
          content: query
        });

        // Grab the conversation history (most recent 14 messages)
        const { data: previousMessages } = await supabase
          .from('conversation_messages')
          .select('*')
          .eq('conversation_id', conversationId)
          .order('created_at', { ascending: false })
          .limit(14)

        // Initialise the embedding model and the LLM
        const embeddings = new GoogleGenerativeAIEmbeddings({
          model: 'embedding-001', // Safe default embedding model
          apiKey: process.env.GEMINI_API_KEY
        });
        const llm = new ChatGoogleGenerativeAI({
          model: 'gemini-2.0-flash',
          apiKey: process.env.GEMINI_API_KEY,
          streamUsage: true
        });

        // Initialise the vector store, restricted to the requested documents
        const vectorStoreConfig = {
          client: supabase,
          tableName: 'embedded_documents',
          queryName: 'match_documents'
        }
        if (documentIds && Array.isArray(documentIds) && documentIds.length > 0) {
          vectorStoreConfig.filter = { document_id: documentIds }
        }
        const vectorStore = new SupabaseVectorStore(embeddings, vectorStoreConfig);

        // Prompt that rewrites the question based on the chat history
        const contextSystemPrompt =
          'Given a chat history and latest user question ' +
          'which might reference context in the chat history, ' +
          'formulate a standalone question which can be understood ' +
          'without the chat history. Do NOT answer the question, ' +
          'just reformulate it if needed and otherwise return it as is.'

        // A set of instructions on how to rewrite the question
        const prompt = ChatPromptTemplate.fromMessages([
          ['system', contextSystemPrompt],
          new MessagesPlaceholder('chat_history'),
          ['human', '{input}']
        ]);

        // Retrieve the documents with awareness of the chat history
        const retriever = vectorStore.asRetriever()
        const historyAwareRetriever = await createHistoryAwareRetriever({
          llm,
          retriever,
          rephrasePrompt: prompt
        });

        // Pass the relevant documents to the LLM
        const systemPrompt =
          'You are an assistant for question answering tasks. ' +
          'Use the following pieces of retrieved context to answer ' +
          'the question. ' +
          '\n\n' +
          '{context}'
        const qaPrompt = ChatPromptTemplate.fromMessages([
          ['system', systemPrompt],
          new MessagesPlaceholder('chat_history'),
          ['human', '{input}']
        ]);
        const qAChain = await createStuffDocumentsChain({
          llm,
          prompt: qaPrompt
        });
        const ragChain = await createRetrievalChain({
          retriever: historyAwareRetriever,
          combineDocsChain: qAChain
        });

        // Rebuild the chat history in chronological order for the prompt
        const history = (previousMessages || [])
          .reverse()
          .map(msg =>
            msg.role === 'user'
              ? new HumanMessage(msg.content)
              : new AIMessage(msg.content)
          );

        // Stream the answer chunks back as Server-Sent Events
        const response = await ragChain.stream({
          input: query,
          chat_history: history
        });
        let fullAnswer = ''
        const responseStream = new Readable({
          async read () {
            for await (const chunk of response) {
              if (chunk.answer) {
                fullAnswer += chunk.answer
                this.push(`data: ${JSON.stringify({ content: chunk.answer })}\n\n`)
              }
            }
            // Store the complete AI answer once streaming has finished
            await supabase.from('conversation_messages').insert({
              conversation_id: conversationId,
              role: 'assistant',
              content: fullAnswer
            });
            this.push(null)
          }
        });
        return responseStream;
      } catch (error) {
        console.error('❌ queryDocument Error:', error.message)
        throw error
      }
    }
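
With the service in place, you can already exercise the endpoint before any UI exists. The snippet below is a hypothetical test script (the IDs are placeholders you would copy from your own Supabase tables); it POSTs a question to the route we registered on port 7004 and prints the SSE frames as they arrive, using the fetch API available in Node 18+ or a browser console:

    // test-query.mjs — hypothetical snippet for manually testing /query-document
    const res = await fetch('http://localhost:7004/query-document', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        conversationId: '<conversation uuid>', // placeholder: an id from your conversations table
        documentIds: ['<document uuid>'],      // placeholder: ids from your documents table
        query: 'Summarise the video in two sentences.'
      })
    });

    // Read the Server-Sent Events stream chunk by chunk
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      console.log(decoder.decode(value)); // each frame looks like: data: {"content":"..."}
    }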

Now that we have all our APIs ready, we will start setting up the frontend. We will be using Vite for the frontend.

  1. Inside the root folder in your project, create a new vite project.

    Terminal window
    npm create vite@latest ./
  2. It will ask whether you want to remove the existing files or ignore them and continue. Choose Ignore files and continue, then use the following configuration:

    Terminal window
    Select a Framework: React
    Select a Variant: TypeScript

    and then it will create a project.

  3. Update your package.json file so that it contains all the dependencies we will be using:

    package.json
    {
      "name": "youtube-rag",
      "private": true,
      "version": "0.0.0",
      "type": "module",
      "scripts": {
        "dev": "vite",
        "build": "tsc -b && vite build",
        "lint": "eslint .",
        "preview": "vite preview"
      },
      "dependencies": {
        "@google/generative-ai": "^0.24.0",
        "@langchain/community": "^0.3.41",
        "@langchain/core": "^0.3.46",
        "@langchain/google-genai": "^0.2.4",
        "@langchain/textsplitters": "^0.1.0",
        "@supabase/supabase-js": "^2.49.4",
        "@tailwindcss/vite": "^4.1.4",
        "dotenv": "^16.5.0",
        "langchain": "^0.3.23",
        "nodemon": "^3.1.9",
        "react": "^19.0.0",
        "react-dom": "^19.0.0",
        "tailwindcss": "^4.1.4",
        "uuid": "^11.1.0"
      },
      "devDependencies": {
        "@eslint/js": "^9.22.0",
        "@types/react": "^19.0.10",
        "@types/react-dom": "^19.0.4",
        "@vitejs/plugin-react": "^4.3.4",
        "eslint": "^9.22.0",
        "eslint-plugin-react-hooks": "^5.2.0",
        "eslint-plugin-react-refresh": "^0.4.19",
        "globals": "^16.0.0",
        "typescript": "~5.7.2",
        "typescript-eslint": "^8.26.1",
        "vite": "^6.3.1",
        "vite-plugin-environment": "^1.1.3"
      }
    }

    After making the changes, run npm install --legacy-peer-deps and then npm run dev to start the frontend dev server.
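
    That is, from the project root:

    Terminal window
    npm install --legacy-peer-deps
    npm run dev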

  4. The project structure will look somewhat like this:

    • public/
      • logo.svg
    • node_modules/
      • module1
      • module2
    • server/
      • node_modules/
      • index.js
      • .env
      • package.json
      • package-lock.json
    • src/
      • api/
      • assets/
      • App.tsx
      • App.css
    • package.json
    • vite.config.js
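
    App.tsx below imports a createSupabaseClient helper from src/api/api. That file is not shown in this part, so here is a minimal sketch of what it could look like, assuming you expose your Supabase URL and anon key to Vite as VITE_SUPABASE_URL and VITE_SUPABASE_ANON_KEY in a .env file (both names are placeholders; use whatever your Part 1 setup defined):

    api.ts
    import { createClient } from "@supabase/supabase-js";

    // Create a Supabase client from the Vite environment variables
    export const createSupabaseClient = () =>
      createClient(
        import.meta.env.VITE_SUPABASE_URL,
        import.meta.env.VITE_SUPABASE_ANON_KEY
      );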


    Inside App.tsx, add this:

    App.tsx
    import { useState } from "react";
    import { v4 as uuidv4 } from "uuid";
    import { createSupabaseClient } from "./api/api";

    // Message shape we will use once the chat is wired up in the next section
    interface Message {
      role: "user" | "assistant";
      content: string;
    }

    const App = () => {
      const [url, setUrl] = useState("");
      const [loading, setLoading] = useState(false);

      const handleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        try {
          setLoading(true);
          // Generate ids for the conversation and the video document
          const convId = uuidv4();
          const docId = uuidv4();
          // Create the conversation
          const supabase = createSupabaseClient();
          await supabase.from("conversations").insert({
            id: convId,
          });
          // Create the document row
          await supabase.from("documents").insert({
            id: docId,
          });
          // Link conversation and document
          await supabase.from("conversation_documents").insert({
            conversation_id: convId,
            document_id: docId
          });
          // Store the document via the backend (the server listens on port 7004)
          await fetch("http://localhost:7004/store-document", {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
            },
            body: JSON.stringify({ url, documentId: docId })
          });
        } catch (error) {
          console.error(error);
        } finally {
          setLoading(false);
        }
      };

      return (
        <div className="min-h-screen bg-gradient-to-br from-gray-900 via-gray-800 to-gray-900 flex flex-col items-center justify-center text-white">
          <div className="bg-gray-800 shadow-lg rounded-lg p-8 w-full max-w-md">
            <h1 className="text-4xl font-extrabold text-center mb-6 text-indigo-400">
              AI Chat with YouTube
            </h1>
            <form onSubmit={handleSubmit} className="space-y-4">
              <input
                type="text"
                placeholder="Drop a YouTube URL here..."
                value={url}
                onChange={(e) => setUrl(e.target.value)}
                className="w-full px-4 py-2 rounded-lg bg-gray-700 text-white placeholder-gray-400 focus:outline-none focus:ring-2 focus:ring-indigo-500"
              />
              <button
                type="submit"
                disabled={loading}
                className={`w-full py-2 rounded-lg font-semibold text-white cursor-pointer ${
                  loading
                    ? "bg-indigo-300 cursor-not-allowed"
                    : "bg-indigo-500 hover:bg-indigo-600"
                }`}
              >
                {loading ? "Processing..." : "Submit"}
              </button>
            </form>
            {loading && (
              <div className="mt-4 flex justify-center">
                <div className="loading-spinner border-t-4 border-indigo-500 rounded-full w-8 h-8 animate-spin"></div>
              </div>
            )}
          </div>
        </div>
      );
    };

    export default App;

    This will be your form, which takes the user input and makes the backend API call. You can refer to the Tailwind CSS docs to initialise Tailwind CSS in your project; one possible Vite setup is sketched below.
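
    For reference, with the @tailwindcss/vite plugin already listed in the package.json above, a minimal sketch of the setup is to register the plugin in your Vite config and import Tailwind in your CSS entry file. Treat this as a sketch under those assumptions and follow the official docs if anything differs for your versions:

    vite.config.js
    import { defineConfig } from "vite";
    import react from "@vitejs/plugin-react";
    import tailwindcss from "@tailwindcss/vite";

    // Register the React and Tailwind CSS v4 plugins
    export default defineConfig({
      plugins: [react(), tailwindcss()],
    });

    Then add @import "tailwindcss"; at the top of your CSS entry file (for example App.css or index.css).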


In the next section, we’ll:

  • Integrate the frontend with the backend to get the RAG chat working.
  • Deploy the Application.

If you want to know more about this, do check out our video: