04 API LAYER

Store Document API

If you have come to this page, congratulations on successfully setting up the environment and Supabase. Let's now start the actual work 👨🏻‍💻.


Supabase Connection Setup

Now that the basic backend server setup is already done here, we will set up the Supabase database connection for our server.

  1. Install the @supabase/supabase-js package.
    Terminal
    npm install @supabase/supabase-js
  2. Create a supabase client helper

    Once installed, go to your project root folder and create a file at helpers/supabaseClient.js.

    helpers/supabaseClient.js
    import { createClient } from '@supabase/supabase-js'
    import dotenv from "dotenv";
    
    // To access the api keys in .env
    dotenv.config()
    
    export const createSupabaseClient = () => {
        const supabaseUrl = process.env.SUPABASE_URL
        const supabaseAnonKey = process.env.SUPABASE_ANON_KEY
    
        return createClient(supabaseUrl, supabaseAnonKey)
    }

    Here we create a function that connects our Node server to Supabase using the createClient method.
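
A missing or misspelled variable in .env is easier to diagnose when it is reported by name. Below is a minimal sketch of a guard that could run before createClient is called. It assumes the same SUPABASE_URL and SUPABASE_ANON_KEY variable names used above; the helper name readSupabaseEnv is hypothetical and is not part of @supabase/supabase-js.

```javascript
// Hypothetical helper (not part of @supabase/supabase-js): reads the two
// variables used in helpers/supabaseClient.js and fails fast if one is missing.
function readSupabaseEnv(env = process.env) {
    const required = ['SUPABASE_URL', 'SUPABASE_ANON_KEY'];
    const missing = required.filter((name) => !env[name]);

    if (missing.length > 0) {
        throw new Error(`Missing environment variables: ${missing.join(', ')}`);
    }

    return {
        supabaseUrl: env.SUPABASE_URL,
        supabaseAnonKey: env.SUPABASE_ANON_KEY
    };
}
```

Inside createSupabaseClient, the two returned values could then be passed straight to createClient(supabaseUrl, supabaseAnonKey).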


Setting up the store-document api route and service

Create two folders, routes and services, inside your server folder. routes will hold all our API routes, and services will hold all the logic.

  1. Create the Route Definition

    Inside routes, create a file storeDocumentRoutes.js.

    routes/storeDocumentRoutes.js
    import express from "express";
    import { storeDocument } from "../services/storeDocumentService.js";
    
    const router = express.Router();
    
    // Handle store document route
    router.post('/', async (req, res) => {
        try {
            const result = await storeDocument(req);
            res.status(200).json(result);
        } catch (error) {
            console.error("Error in storeDocument: ", error);
            res.status(500).json({
                error: "An error occurred during the request."
            })
        }
    });
    
    export default router;
  2. Import the route in your index.js
    index.js
    import express from "express";
    import cors from "cors";
    import storeDocumentRoute from "./routes/storeDocumentRoutes.js";
    
    const app = express();
    
    app.use(express.json())
    
    const corsOptions = {
        origin: "http://localhost:5173",
        methods: ["GET", "POST", "PUT", "DELETE"],
        allowedHeaders: ["Content-Type", "Authorization"]
    }
    
    app.use(cors(corsOptions))
    
    app.use("/store-document", storeDocumentRoute)
    
    app.listen(7004, () => {
        console.log('Server Running on PORT 7004');
    });
    
    export default app;
  3. Create the Service Placeholder

    Inside services, create storeDocumentService.js.

    services/storeDocumentService.js
    export async function storeDocument(req){
        return {
            ok: true
        }
    }
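
The route above answers with status 200 whenever storeDocument resolves. Once the service starts returning { ok: false } on failure (as it does in the later steps), you may want the HTTP status to reflect that flag. One possible way is sketched below; the helper name statusForResult and the choice of 500 for failures are assumptions, not part of this tutorial's code.

```javascript
// Hypothetical helper: picks an HTTP status from the service result.
// Assumes the service returns an object with a boolean `ok` field.
function statusForResult(result) {
    return result && result.ok === true ? 200 : 500;
}

// Inside the route handler it could be used like this:
//   const result = await storeDocument(req);
//   res.status(statusForResult(result)).json(result);
```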

Initialising Embeddings and Vector Store

  1. Install LangChain Dependencies
    Terminal
    npm install @langchain/google-genai @langchain/community @langchain/textsplitters uuid
  2. Configure Embeddings
    services/storeDocumentService.js
    import { createSupabaseClient } from '../helpers/supabaseClient.js';
    import { GoogleGenerativeAIEmbeddings } from '@langchain/google-genai'
    
    export async function storeDocument(req){
        try {
            //Initialising the Supabase Client
            const supabase = createSupabaseClient();
    
            //Generating Embeddings using the @langchain/google-genai package
            const embeddings = new GoogleGenerativeAIEmbeddings({
                model: "gemini-embedding-001", //The model you want to use to generate embedding
                taskType: "RETRIEVAL_DOCUMENT",
                title: "Youtube Rag"
            });
        } catch (error) {
            console.error(error);
    
            //Return false if there is any error
            return { 
                ok: false,
            }
        }
        return { 
                ok: true
        }
    }
  3. Initialise Vector Store
    services/storeDocumentService.js
    import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase';
    
    const vectorStore = new SupabaseVectorStore(embeddings, {
        client: supabase,
        tableName: "embedded_documents",
        queryName: "match_documents"
    });

Access YouTube Video

We will be using the YoutubeLoader from LangChain's document loaders.

services/storeDocumentService.js
import { YoutubeLoader } from '@langchain/community/document_loaders/web/youtube'

    //Get the youtube video url, from the user
    const { url } = req.body;

    //Get the video data from url using YoutubeLoader
    const loader = await YoutubeLoader.createFromUrl(url, {
        addVideoInfo: true
    });

    //Load the data
    const docs = await loader.load()
    //You can also print this to get a view of how the data is returned in response.
    console.log('Video Data: ', docs);
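
Since the URL comes straight from req.body, it can help to reject obviously wrong input before handing it to the loader. The following is a minimal sketch of such a pre-check using only the built-in URL class. The function name isYoutubeUrl and the list of accepted hosts are assumptions made for illustration; the loader performs its own parsing and may accept or reject other forms.

```javascript
// Hypothetical pre-check: accepts only http(s) URLs on common YouTube hosts.
// The host list is an assumption; extend it if you need other domains.
const YOUTUBE_HOSTS = new Set(['www.youtube.com', 'youtube.com', 'm.youtube.com', 'youtu.be']);

function isYoutubeUrl(value) {
    if (typeof value !== 'string') return false;

    let parsed;
    try {
        parsed = new URL(value);
    } catch {
        return false; // not a parseable URL at all
    }

    const isHttp = parsed.protocol === 'https:' || parsed.protocol === 'http:';
    return isHttp && YOUTUBE_HOSTS.has(parsed.hostname);
}
```

In the service, a failed check could throw an error before YoutubeLoader.createFromUrl is called, so the caller gets a clear message instead of a loader failure.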

Split the Document Into Chunks

  1. We will be using the RecursiveCharacterTextSplitter from @langchain/textsplitters.

    Terminal
    npm install @langchain/textsplitters
    services/storeDocumentService.js
    import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters'
    
    const textSplitter = new RecursiveCharacterTextSplitter({
        chunkSize: 1000,
        chunkOverlap: 200,
    });
    
    const texts = await textSplitter.splitDocuments(docs);
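
To get a feel for what chunkSize and chunkOverlap mean, here is a deliberately simplified sketch that cuts a string into fixed windows which share some characters with their neighbours. It is not the RecursiveCharacterTextSplitter algorithm (which prefers to break on separators such as paragraphs and sentences); the function name slidingWindowChunks and the sample string are made up for illustration.

```javascript
// Simplified fixed-window chunker, for illustration only. It is NOT the
// RecursiveCharacterTextSplitter algorithm; it only shows how a chunk size
// and an overlap interact.
function slidingWindowChunks(text, chunkSize, chunkOverlap) {
    if (chunkOverlap >= chunkSize) {
        throw new Error('chunkOverlap must be smaller than chunkSize');
    }

    const step = chunkSize - chunkOverlap; // how far each window advances
    const chunks = [];

    for (let start = 0; start < text.length; start += step) {
        chunks.push(text.slice(start, start + chunkSize));
        if (start + chunkSize >= text.length) break; // this window reached the end
    }

    return chunks;
}

const sample = 'abcdefghijklmnopqrstuvwxyz';
console.log(slidingWindowChunks(sample, 10, 4));
// Chunks: 'abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz'
```

Each chunk repeats the last four characters of the previous one. In this simplified model, a chunkSize of 1000 with a chunkOverlap of 200 would advance 800 characters per chunk, so context that straddles a boundary still appears whole in at least one chunk.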

Generating Document ID

  1. To generate a unique id we install a package:

    Terminal
    npm install uuid
  2. Attach the ID to Each Chunk and Store the Documents

    services/storeDocumentService.js
    import { v4 as uuidv4 } from "uuid";
    
    const documentId = uuidv4();
    //Check if it is getting created
    console.log('Generated ID: ', documentId);
    
    const docsWithMetaData = texts.map((text) => ({
        ...text,
        metadata: {
            ...(text.metadata || {}),
            documentId
        }
    }))
    
    await vectorStore.addDocuments(docsWithMetaData)

    With this, we generate one unique ID per video, attach it to the metadata of every chunk, and store the transcript chunks along with their metadata and vector embeddings in the Supabase database.
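
The metadata merge can be tried in isolation, without the vector store. The sketch below uses stand-in chunk objects shaped like LangChain documents ({ pageContent, metadata }); the function name tagChunks, the sample content, and the fixed ID 'doc-1' are made up for illustration. In the service, the ID comes from uuidv4().

```javascript
// Hypothetical helper mirroring the map() call above: returns new chunk
// objects whose metadata also carries the shared documentId.
function tagChunks(chunks, documentId) {
    return chunks.map((chunk) => ({
        ...chunk,
        metadata: { ...(chunk.metadata || {}), documentId }
    }));
}

// Stand-in chunks; real ones come from textSplitter.splitDocuments(docs).
const chunks = [
    { pageContent: 'first part of the transcript', metadata: { source: 'abc123' } },
    { pageContent: 'second part of the transcript' } // no metadata yet
];

const tagged = tagChunks(chunks, 'doc-1');
console.log(tagged.map((chunk) => chunk.metadata));
// Both metadata objects now include documentId: 'doc-1';
// the first one also keeps its original `source` field.
```

Because every chunk of one video shares the same documentId, the chunks can later be selected together by filtering on that metadata field. The input array is left unchanged, since map and the spread syntax build new objects.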

Full Implementation

This is how your final storeDocumentService.js will look after integrating all the steps:

services/storeDocumentService.js
import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase'
import { createSupabaseClient } from '../helpers/supabaseClient.js'
import { GoogleGenerativeAIEmbeddings } from '@langchain/google-genai'
import { YoutubeLoader } from '@langchain/community/document_loaders/web/youtube'
import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters'
import { v4 as uuidv4 } from 'uuid'

export async function storeDocument(req) {
  try {
    if (!req?.body?.url) {
      throw new Error('URL is required in the request body')
    }

    const { url } = req.body
    const supabase = createSupabaseClient()

    const embeddings = new GoogleGenerativeAIEmbeddings({
      model: 'embedding-001' // ✅ Safe default
    })

    const vectorStore = new SupabaseVectorStore(embeddings, {
      client: supabase,
      tableName: 'embedded_documents',
      queryName: 'match_documents'
    })

    // ✅ Await loader creation
    const loader = await YoutubeLoader.createFromUrl(url, {
      addVideoInfo: true
    })

    const docs = await loader.load()

    if (docs[0]) {
      docs[0].pageContent = `Video title: ${docs[0].metadata.title} | Video context: ${docs[0].pageContent}`
    }

    const textSplitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
      chunkOverlap: 200
    })

    const texts = await textSplitter.splitDocuments(docs)

    if (!texts.length || !texts[0].pageContent) {
      throw new Error('Document has no content to embed.')
    }

    const documentId = uuidv4()
    console.log('Generated DocumentID:', documentId)
    console.log('First chunk preview:', texts[0].pageContent.slice(0, 100))
    
    const docsWithMetaData = texts.map((text) => ({
      ...text,
      metadata: {
        ...(text.metadata || {}),
        documentId
      }
    }))

    await vectorStore.addDocuments(docsWithMetaData)
    
    return { ok: true, documentId }
  } catch (error) {
    console.error('❌ storeDocument Error:', error.message)

    // Report the failure to the caller instead of falling through to a success response
    return { ok: false }
  }
}
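
To try the endpoint once the server is running, a small client call is enough. The sketch below assumes the server from index.js is listening on http://localhost:7004 and that the route is mounted at /store-document, as shown earlier. The helper name requestStoreDocument and the injectable fetchImpl parameter are hypothetical; the global fetch it defaults to is available in Node 18 and later.

```javascript
// Hypothetical client helper: POSTs a video URL to the /store-document route.
// `fetchImpl` defaults to the global fetch; it is injectable so the helper
// can be exercised without a running server.
async function requestStoreDocument(videoUrl, baseUrl = 'http://localhost:7004', fetchImpl = fetch) {
    const response = await fetchImpl(`${baseUrl}/store-document`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ url: videoUrl }) // the service reads req.body.url
    });

    if (!response.ok) {
        throw new Error(`store-document failed with status ${response.status}`);
    }

    return response.json(); // the JSON body returned by the route
}

// Example call (requires the server to be running):
//   requestStoreDocument('<your YouTube video URL>').then(console.log);
```

The resolved value is whatever the service returned, which on success includes the ok flag and the generated documentId.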

⚙️ Next Steps

In the next section, we’ll:

  • Create the conversation id and link the conversation documents in the database
  • Start with fetch-document api to fetch the data based on queries.
  • Create a complete LLM RAG Pipeline.

If you want to know more about this, do check out our video guide: