01 INTRODUCTION

Let's Get Started

A complete walkthrough of building a RAG (retrieval-augmented generation) application that chats with YouTube videos. Learn how to extract transcripts, create embeddings, and query them with AI.

🎯 What are we doing here?

We are building a tool that lets you chat with any YouTube video. Along the way, you will learn how to approach a project like this step by step.

Fetch Transcript

Extract text from YouTube videos using APIs or scrapers.

Embeddings

Convert text chunks into vector representations using an embedding model.

Vector DB

Store and retrieve high-dimensional vectors efficiently.

Similarity Search

Find the most relevant context for user queries.
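
Between fetching the transcript and creating embeddings, the text has to be split into chunks. Here is a minimal sketch of one common way to do that: fixed-size word chunks with a small overlap, so a sentence cut at a boundary still appears in the next chunk. The function name `chunkTranscript`, the sample transcript, and the `chunkSize` and `overlap` values are all assumptions made for illustration, not part of this project's code.

```typescript
// Split a transcript into overlapping word-based chunks.
// chunkSize and overlap are illustrative defaults, not values from the tutorial.
function chunkTranscript(text: string, chunkSize = 8, overlap = 2): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window moves each time

  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once this chunk has reached the end of the transcript.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}

// Hypothetical text standing in for a fetched YouTube transcript.
const transcript =
  "welcome to the channel today we build a small app that answers questions about a video";

const chunks = chunkTranscript(transcript, 8, 2);
console.log(chunks);
```

Each chunk would then be passed to the embedding step. In a real project the chunk size is usually much larger (hundreds of words or characters), and a library text splitter may replace this hand-written function.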

🛠️ The Approach

Important Note

Before starting any project, plan your approach first and sketch out the complete workflow.

Workflow Diagram

🧠 What is a Vector Database?

Imagine your brain is trying to remember which movie your friend described:
"It's a sci-fi movie, has robots, and was super emotional."

Instead of just matching words like "robot", your brain tries to understand the meaning. That’s what a vector database does — it finds similar things by understanding meaning, not just exact words.

Example: Embeddings

The sentence "I love ice cream" might be represented as a vector like:

[0.23, -0.88, 1.2, 0.05, 0.77, ...]

Important Note

A vector database stores high-dimensional vectors and retrieves them efficiently by similarity.
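
To make the idea concrete, here is a small sketch of what "finding similar things by meaning" can look like in code: an in-memory list of vectors ranked by cosine similarity against a query vector. Everything here is an assumption for illustration. The three-number vectors are made up by hand (think of the dimensions as "sci-fi", "robots", "emotional"), while real embeddings come from a model and have hundreds or thousands of dimensions. A real vector database also uses indexes rather than comparing against every row.

```typescript
type StoredVector = { label: string; vector: number[] };

// Cosine similarity: close to 1 means the vectors point the same way, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hand-made vectors with dimensions [sci-fi, robots, emotional].
const store: StoredVector[] = [
  { label: "emotional sci-fi film about a robot", vector: [0.9, 0.8, 0.9] },
  { label: "romantic comedy", vector: [0.0, 0.0, 0.7] },
  { label: "action movie with robots", vector: [0.6, 0.9, 0.1] },
];

// Score every stored vector against the query and keep the best topK.
function search(query: number[], items: StoredVector[], topK = 2) {
  return items
    .map((item) => ({ label: item.label, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// "It's a sci-fi movie, has robots, and was super emotional."
const query = [0.8, 0.7, 0.9];
const results = search(query, store);
console.log(results);
```

The first result is the emotional sci-fi robot film, even though no words were compared, only the positions of the vectors.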

📦 Tech Stack

Vite + Tailwind

Node.js

Supabase

LangChain

Next Steps

In the next section, we’ll:

Set up the environment

Install necessary tools and libraries.

Initialise the Backend

Create the project structure and initial scripts.

Get Supabase Credentials

Connect your project to Supabase.
