Let's Get Started
Hello Guys! 👋
Excited for the first AI RAG project?
Today, we’re kicking it off with our Chat With YouTube project.
🎯 What are we doing here?
We are building a tool that lets you chat with any YouTube video. We’ll work through the project step by step and cover:
- How to fetch the transcript of a YouTube video from its URL
- Understanding LangChain and document loaders
- What are embeddings?
- What do we mean by a vector database?
- What is a similarity search?
- How do we integrate all of it?
🛠️ The Approach?
We use the following approach for the project.
🧩 Let’s Understand the approach in detail
1. Get the YouTube Video URL
   We’ll take the user’s input: a YouTube video link.
2. Fetch Transcript from the URL
   Use a YouTube transcript API or a scraping tool to extract the full transcript.
3. Chunk the Transcript
   Split the transcript into manageable parts for efficient storage and retrieval.
4. Convert Chunks into Embeddings
   Use an embedding model to convert text chunks into embeddings (vectors).
5. Store in a Vector Database (Supabase)
   Store the embeddings in Supabase with a document ID.
6. Handle User Queries
   Convert each user query into an embedding, run a similarity search, and retrieve the most relevant chunks to build the answer.
7. Display Response
   Show the final output on the frontend using streaming.
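To make the first few steps concrete, here is a minimal sketch of pulling the video ID out of a URL and splitting a transcript into chunks. The helper names `extractVideoId` and `chunkText`, the chunk size, and the overlap between chunks are all assumptions for illustration; the real project may well use LangChain's loaders and splitters instead of hand-rolled code like this.

```typescript
// Hypothetical helpers for illustration only; not the project's actual code.

/** Return the video ID from common YouTube URL shapes, or null if none is found. */
function extractVideoId(url: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return null; // not a valid URL at all
  }
  const host = parsed.hostname;
  // Short links: https://youtu.be/<id>
  if (host === "youtu.be") {
    return parsed.pathname.slice(1) || null;
  }
  // Watch links: https://www.youtube.com/watch?v=<id>
  if (host === "youtube.com" || host.endsWith(".youtube.com")) {
    return parsed.searchParams.get("v");
  }
  return null;
}

/**
 * Split text into fixed-size character chunks.
 * Consecutive chunks share `overlap` characters so that a sentence cut at a
 * boundary still appears whole in at least one chunk (an assumed design choice).
 */
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("need chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

console.log(extractVideoId("https://youtu.be/dQw4w9WgXcQ")); // dQw4w9WgXcQ
console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

In practice the chunk size would be hundreds of characters rather than four; the tiny numbers here just make the overlap easy to see.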
Before moving forward, let us understand the term vector database.
🧠 What is a Vector Database?
Imagine your brain is trying to remember which movie your friend described:
“It’s a sci-fi movie, has robots, and was super emotional.”
Instead of just matching words like “robot” or “sci-fi”, your brain tries to understand the meaning and find the closest match. Maybe it thinks: “Sounds like Interstellar or Wall-E!”
That’s what a vector database does — it finds similar things by understanding meaning, not just exact words.
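Here is a rough sketch of that "closest match" idea, assuming each movie is stored as a short list of numbers (a vector, explained in the next section). The three-number vectors below are made up by hand to mean roughly [sci-fi, robots, emotional]; a real vector database would store model-generated embeddings and use an index rather than sorting everything. The names `StoredVector`, `cosineSimilarity`, and `topK` are hypothetical.

```typescript
// Toy in-memory "vector store"; all names and numbers are illustrative assumptions.
type StoredVector = { id: string; vector: number[] };

/** Cosine similarity: 1 means same direction, 0 means unrelated. */
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

/** Return the k stored items most similar to the query, best match first. */
function topK(query: number[], store: StoredVector[], k: number): StoredVector[] {
  return [...store]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}

// Hand-made vectors: [sci-fi, robots, emotional]
const movies: StoredVector[] = [
  { id: "Notting Hill", vector: [0.0, 0.0, 0.7] },
  { id: "Interstellar", vector: [0.9, 0.4, 0.9] },
  { id: "Wall-E", vector: [0.9, 0.9, 0.8] },
];

// "It's a sci-fi movie, has robots, and was super emotional."
const friendDescription = [1.0, 1.0, 1.0];

console.log(topK(friendDescription, movies, 2).map((m) => m.id)); // [ 'Wall-E', 'Interstellar' ]
```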
🧮 What’s a “Vector” Anyway?
A vector is just a list of numbers that represents something, like:
- a sentence 📝
- an image 🖼️
- a sound 🎵
- or even a product 🎁

For example, the sentence “I love ice cream.” 🍦 might be converted into something like:
[0.23, -0.88, 1.2, 0.05, 0.77]
This is called an embedding: a way of turning things into numbers that machines can understand.
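To get a feel for "text in, list of numbers out", here is a deliberately simple stand-in: it counts how often each word from a tiny fixed vocabulary appears in the text. This is a word-count vector, not a real embedding. Real embeddings come from a trained model, have hundreds or thousands of dimensions, and capture meaning rather than exact words. The vocabulary and the `toyEmbed` name are assumptions for illustration.

```typescript
// Hypothetical five-word vocabulary; each position in the output vector
// corresponds to one of these words.
const vocabulary = ["love", "ice", "cream", "robots", "space"];

/** Map text to a fixed-length vector of word counts over the vocabulary. */
function toyEmbed(text: string): number[] {
  const words = text.toLowerCase().match(/[a-z]+/g) ?? [];
  return vocabulary.map((term) => words.filter((w) => w === term).length);
}

console.log(toyEmbed("I love ice cream.")); // [ 1, 1, 1, 0, 0 ]
console.log(toyEmbed("Robots in space"));   // [ 0, 0, 0, 1, 1 ]
```

Every sentence becomes a vector of the same length, which is what lets us compare any two of them numerically.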
📦 Tech Stack
| Layer | Tech |
|---|---|
| Frontend | Vite + TailwindCSS |
| Backend | Node.js |
| Database | Supabase |
| LLM framework | LangChain |
⚙️ Next Steps
In the next section, we’ll:
- Set up the environment
- Initialise the backend
- Get the Supabase credentials
If you want to know more about this, do check out our video: