04 API LAYER & FRONTEND

Query API & RAG Pipeline

Now that we know how the contents of the video are stored in the Supabase database, it's time to build the actual RAG pipeline.


Setting up the query-document API route and service

The queryDocument function is the heart of the document retrieval and question-answering service in our application. It processes a user query within a conversation, retrieves relevant documents using embeddings and vector search, incorporates chat history for context, and generates an answer using a Large Language Model (LLM).
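
The route we're about to build expects a JSON body carrying exactly those three pieces of information; the values below are illustrative placeholders.

Example request body
{
    "conversationId": "<conversation uuid>",
    "documentIds": ["<document uuid>"],
    "query": "What is this video about?"
}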

  1. Create the Query Route

    Inside routes, create queryDocumentRoutes.js.

    routes/queryDocumentRoutes.js
    import express from "express";
    import { queryDocument } from "../services/queryDocumentService.js";
    
    const router = express.Router();
    
    router.post('/', async (req, res) => {
        try {
            const result = await queryDocument(req);
            res.setHeader('Content-Type', 'text/event-stream');
            res.setHeader('Cache-Control', 'no-cache');
            res.setHeader('Connection', 'keep-alive');
            result.pipe(res);
        } catch (error) {
            console.error("Error in queryDocument: ", error);
            res.status(500).json({
                error: "An error occurred during the request."
            });
        }
    });
    
    export default router;
  2. Register the Route in index.js
    index.js
    import express from "express";
    import cors from "cors";
    import storeDocumentRoute from "./routes/storeDocumentRoutes.js";
    import queryDocumentRoute from './routes/queryDocumentRoutes.js'
    
    const app = express();
    
    app.use(express.json());
    
    const corsOptions = {
        origin: "http://localhost:5173",
        methods: ["GET", "POST", "PUT", "DELETE"],
        allowedHeaders: ["Content-Type", "Authorization"]
    }
    
    app.use(cors(corsOptions));
    
    app.use("/store-document", storeDocumentRoute);
    app.use('/query-document', queryDocumentRoute);
    
    app.listen(7004, () => {
        console.log('Server Running on PORT 7004');
    });
    
    export default app;
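
Because the route replies with text/event-stream rather than a single JSON payload, the client has to read the body incrementally. Here is a minimal Node 18+ sketch of consuming the endpoint (the IDs are illustrative placeholders, and it assumes each data: event arrives in a single read, which is fine for a sketch but not guaranteed in general):

client-sketch.mjs
// Consume the /query-document SSE-style stream (Node 18+)
const res = await fetch("http://localhost:7004/query-document", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
        conversationId: "<conversation uuid>", // illustrative placeholder IDs
        documentIds: ["<document uuid>"],
        query: "What is this video about?"
    })
});

const reader = res.body.getReader();
const decoder = new TextDecoder();

while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each event looks like: data: {"content":"..."}\n\n
    for (const line of decoder.decode(value, { stream: true }).split("\n")) {
        if (line.startsWith("data: ")) {
            process.stdout.write(JSON.parse(line.slice(6)).content);
        }
    }
}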

Actual Logic Behind the API (RAG)

Let's break down the logic behind the RAG model step by step.

Input Extraction

Capture the user query, conversation ID, and relevant document IDs from the request body.

Vector Search

Search the embedded documents by vector similarity, filtered to the documents linked to this conversation.

History Awareness

Use previous messages to rewrite follow-up questions as standalone questions (e.g. "what about the second step?" only makes sense alongside the earlier turns).

RAG Chain

Combine retrieval and generation into a single streaming answer pipeline.

  1. Imports
    Imports
    import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts'
    import { HumanMessage, AIMessage } from '@langchain/core/messages'
    import { createStuffDocumentsChain } from 'langchain/chains/combine_documents'
    import { createRetrievalChain } from 'langchain/chains/retrieval'
    import { createHistoryAwareRetriever } from 'langchain/chains/history_aware_retriever'
    import { Readable } from 'stream'
  2. Input & DB Interaction
    Extraction
    const { conversationId, documentIds, query } = req.body;
    const supabase = createSupabaseClient();
    
    await supabase.from('conversation_messages').insert({
      conversation_id: conversationId,
      role: 'user',
      content: query
    });
  3. RAG Pipeline construction
    The Pipeline
    // 1. History Aware Retriever
    const historyAwareRetriever = await createHistoryAwareRetriever({
      llm,
      retriever: vectorStore.asRetriever(),
      rephrasePrompt: prompt
    });
    
    // 2. QA Chain
    const qaChain = await createStuffDocumentsChain({ llm, prompt: qaPrompt });
    
    // 3. Retrieval Chain
    const ragChain = await createRetrievalChain({
      retriever: historyAwareRetriever,
      combineDocsChain: qaChain
    });
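
Once assembled, the chain is consumed as a stream: we pass the raw user input plus the mapped chat history, and each emitted chunk may carry a partial answer. This mirrors the invocation in the complete source below.

Streaming the Chain
const response = await ragChain.stream({
    input: query,           // the raw user question
    chat_history: history   // HumanMessage / AIMessage array, oldest first
});

for await (const chunk of response) {
    if (chunk.answer) {
        // chunk.answer is a partial answer fragment; forward it to the client
    }
}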

Complete Source Code

services/queryDocumentService.js
import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase'
import { createSupabaseClient } from '../helpers/supabaseClient.js'
import { ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings } from '@langchain/google-genai'
import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts'
import { createHistoryAwareRetriever } from "langchain/chains/history_aware_retriever";
import { createStuffDocumentsChain } from 'langchain/chains/combine_documents';
import { createRetrievalChain } from "langchain/chains/retrieval";
import { HumanMessage, AIMessage } from '@langchain/core/messages'
import { Readable } from 'stream'

export async function queryDocument (req) {
    try {
        const { conversationId, query, documentIds } = req.body
        const supabase = createSupabaseClient()

        // Store user query
        await supabase.from('conversation_messages').insert({
            conversation_id: conversationId,
            role: 'user',
            content: query
        });

        // Grab conversation history
        const { data: previousMessages } = await supabase
        .from('conversation_messages')
        .select('*')
        .eq('conversation_id', conversationId)
        .order('created_at', { ascending: false })
        .limit(14)

        // Initialise embedding models and LLM
        const embeddings = new GoogleGenerativeAIEmbeddings({
            model: 'embedding-001',
            apiKey: process.env.GEMINI_API_KEY // same key the LLM below uses
        });

        const llm = new ChatGoogleGenerativeAI({
            model: 'gemini-2.0-flash',
            apiKey: process.env.GEMINI_API_KEY,
            streamUsage: true
        });

        const vectorStore = new SupabaseVectorStore(embeddings, {
            client: supabase,
            tableName: 'embedded_documents',
            queryName: 'match_documents',
            filter: { document_id: documentIds }
        });

        const contextSystemPrompt = "Given a chat history and latest user question... standalone question."
        const prompt = ChatPromptTemplate.fromMessages([
            ['system', contextSystemPrompt],
            new MessagesPlaceholder('chat_history'),
            ['human', '{input}']
        ]);

        const historyAwareRetriever = await createHistoryAwareRetriever({
            llm,
            retriever: vectorStore.asRetriever(),
            rephrasePrompt: prompt
        });

        const qaPrompt = ChatPromptTemplate.fromMessages([
            ['system', "You are an AI assistant using context: {context}"],
            new MessagesPlaceholder('chat_history'),
            ['human', '{input}']
        ]);

        const qaChain = await createStuffDocumentsChain({ llm, prompt: qaPrompt });
        const ragChain = await createRetrievalChain({
            retriever: historyAwareRetriever,
            combineDocsChain: qaChain
        });

        // previousMessages is newest-first, so reverse it to restore chronological order
        const history = (previousMessages || []).reverse().map(msg =>
            msg.role === 'user' ? new HumanMessage(msg.content) : new AIMessage(msg.content)
        );

        const response = await ragChain.stream({
            input: query,
            chat_history: history
        });

        // Wrap the chain's async iterator in a Node stream of SSE-formatted events.
        // Readable.from is safer than a hand-rolled read(): it cannot start multiple
        // concurrent loops over the same iterator or push after end-of-stream.
        async function* toSSE () {
            for await (const chunk of response) {
                if (chunk.answer) {
                    yield `data: ${JSON.stringify({ content: chunk.answer })}\n\n`;
                }
            }
        }
        return Readable.from(toSSE());
    } catch (error) {
        console.error('❌ queryDocument Error:', error.message);
        throw error;
    }
}
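
The service imports createSupabaseClient from helpers/supabaseClient.js. For reference, a minimal sketch of such a helper (the SUPABASE_URL and SUPABASE_SERVICE_KEY variable names are assumptions; match whatever your .env actually uses):

helpers/supabaseClient.js
// Minimal sketch; the env var names here are assumptions
import { createClient } from '@supabase/supabase-js'

export function createSupabaseClient () {
    return createClient(process.env.SUPABASE_URL, process.env.SUPABASE_SERVICE_KEY)
}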

Setting Up the Vite Frontend

  1. Initialise Project
    Terminal
    npm create vite@latest ./
  2. Project Structure
    server/
      index.js
      services/
    src/
      api/
      App.tsx
    package.json
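
The form in the next step relies on uuid and the Supabase browser client; if they aren't already in the frontend's package.json, install them:

Terminal
npm install uuid @supabase/supabase-js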

Create the Form to Take User Input

App.tsx
import { useState } from "react";
import { v4 as uuidv4 } from "uuid";
import { createSupabaseClient } from "./api/api";

const App = () => {
    const [url, setUrl] = useState("");
    const [loading, setLoading] = useState(false);

    const handleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        try {
            setLoading(true);
            const convId = uuidv4();
            const docId = uuidv4();
            const supabase = createSupabaseClient();
            
            await supabase.from("conversations").insert({ id: convId });
            await supabase.from("documents").insert({ id: docId });
            await supabase.from("conversation_documents").insert({
                conversation_id: convId,
                document_id: docId
            });

            await fetch("http://localhost:7004/store-document", {
                method: "POST",
                headers: { "Content-Type": "application/json" },
                body: JSON.stringify({ url, documentId: docId })
            });

        } catch (error) {
            console.error(error);
        } finally {
            setLoading(false);
        }
    };

    return (
        <div className="min-h-screen bg-gray-950 flex items-center justify-center text-white">
            <div className="bg-white/5 p-8 rounded-3xl border border-white/10 w-full max-w-md">
                <h1 className="text-3xl font-bold mb-6 text-brand-orange">AI YouTube Chat</h1>
                <form onSubmit={handleSubmit} className="space-y-4">
                    <input
                        type="text"
                        placeholder="YouTube URL..."
                        value={url}
                        onChange={(e) => setUrl(e.target.value)}
                        className="w-full px-4 py-3 rounded-xl bg-black/40 border border-white/10"
                    />
                    <button
                        type="submit"
                        disabled={loading}
                        className={`w-full py-3 rounded-xl font-bold ${loading ? 'bg-gray-600' : 'bg-brand-orange text-black font-black'}`}
                    >
                        {loading ? "Processing..." : "Get Started"}
                    </button>
                </form>
            </div>
        </div>
    );
};

export default App;
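
App.tsx pulls createSupabaseClient from src/api/api.ts. A minimal sketch of that module (the VITE_-prefixed variable names are assumptions; Vite only exposes env vars that start with VITE_):

src/api/api.ts
// Minimal sketch; the env var names here are assumptions
import { createClient } from "@supabase/supabase-js";

export function createSupabaseClient() {
    return createClient(
        import.meta.env.VITE_SUPABASE_URL,
        import.meta.env.VITE_SUPABASE_ANON_KEY
    );
}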

⚙️ Next Steps

In the final section, we’ll:

  • Implement the working RAG Chat UI.
  • Deploy the complete application to production.