OFFICIAL AI GUIDE

Zero to AI Developer in 2026

A structured, step by step roadmap from absolute beginner to hireable

Special Thanks to Rishabh Tripathi for contributing to this guide.

0. How to use this guide

This guide is written like an action plan, not a textbook.

  • It is divided into phases (0 to 5).
  • Each phase has:
    • Outcome: What you can do at the end.
    • Timeline: Rough duration if you are consistent.
    • What to learn: Topics in plain language.
    • Practice tasks: Concrete things to do.
    • Milestone checklist: To verify that you are ready to move on.

You do not need to complete everything perfectly before moving ahead.
Treat it as a loop:

Learn just enough → Build something small → Break it → Fix it → Move to next layer.


1. Big picture: What is an “AI developer” in 2026

Before jumping in, be clear about the target.

An AI developer in 2026 is usually one (or a mix) of these:

  1. ML Engineer

    • Works with datasets, training models, evaluating them, and deploying them.
  2. LLM / Generative AI App Developer

    • Builds apps using large language models (LLMs) like GPT style models.
    • Works with RAG (retrieval augmented generation), tools/agents, APIs, vector databases.
  3. MLOps / AI Platform Engineer

    • Focuses on infrastructure, deployment, monitoring, and reliability of AI systems.
  4. AI Product Engineer

    • Strong software engineer who integrates AI features into products end to end.

This guide aims to give you enough skills to move into any of these directions, with a bias toward:

  • Strong foundations in ML and deep learning.
  • Ability to build and ship real LLM based applications.

2. Roadmap overview

Here is the complete path at a glance.

Phase | Focus                     | You Become Able To                     | Rough Duration (part time)
0     | Setup + mindset           | Learn effectively, without burning out | 1 week
1     | Programming + math basics | Write Python + basic data code         | 2 to 4 weeks
2     | Core ML                   | Train and evaluate classical ML models | 4 to 6 weeks
3     | Deep learning             | Build and use neural networks          | 4 to 6 weeks
4     | LLMs & Generative AI      | Build real AI powered apps             | 4 to 8 weeks
5     | MLOps + Portfolio         | Deploy, monitor, and showcase work     | Ongoing

You can compress or stretch these depending on your background and time.


3. Phase 0: Setup and mindset

3.1. Outcome

By the end of this phase you:

  • Have a working development environment on your laptop.
  • Know how you will study.
  • Are mentally prepared for the journey.

3.2. Timeline

  • 3 to 7 days.

3.3. What to set up

Tools on your MacBook:

  • Python 3.11 or newer.
  • Package manager: uv or pipx or conda (any one).
  • VS Code with:
    • Python extension.
    • Jupyter extension.
  • Git + GitHub account.
  • One AI coding assistant (e.g., GitHub Copilot, Codeium).

3.4. Mindset rules

  1. Small loops, not big goals

    • Focus on: “Today I will implement linear regression from a tutorial”
      instead of “I will master ML this month”.
  2. Hands on > passive

    • If you watch a 30 minute video, spend at least 30 minutes coding something related.
  3. Embrace feeling dumb

    • If you never feel confused, you are not learning anything new.

3.5. Phase 0 checklist

You are ready for Phase 1 if:

  • You can open VS Code and run a simple Python file.
  • You can create and activate a virtual environment.
  • You can create a GitHub repo and push a file.
  • You have blocked at least 1 to 2 hours daily or 8 to 12 hours weekly for learning.

4. Phase 1: Programming & math basics (for ML, not for exams)

4.1. Outcome

You can write basic Python scripts, work with data in memory, and understand the math words used in beginner ML content.

4.2. Timeline

  • 2 to 4 weeks (if you already know Python decently, 1 week of refresh).

4.3. Programming topics

Focus on:

  • Python basics:
    • Variables, data types, conditionals, loops.
    • Functions and parameters.
    • Lists, dictionaries, sets, tuples.
    • Reading/writing files.
  • Essential libraries:
    • NumPy: arrays, indexing, broadcasting, basic operations.
    • Pandas: DataFrame, reading CSV, filtering, groupby, describe.
    • Matplotlib / Seaborn: Line plots, histograms, scatter plots.

You do not need advanced OOP or metaclasses at this stage.
Just be comfortable manipulating data.
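The library operations above can be sketched with a tiny in-memory table. In real practice the DataFrame would come from pd.read_csv; the toy data here is only for illustration:

```python
import numpy as np
import pandas as pd

# Toy data standing in for a CSV (in practice: df = pd.read_csv("data.csv"))
df = pd.DataFrame({
    "city": ["A", "A", "B", "B", "B"],
    "price": [100, 120, 80, 90, 85],
})

# Pandas: filtering and groupby, as listed above
cheap = df[df["price"] < 100]               # rows with price below 100
avg_by_city = df.groupby("city")["price"].mean()

# NumPy: broadcasting a scalar across a whole array
prices = df["price"].to_numpy()
discounted = prices * 0.9                   # element wise multiply

print(avg_by_city)
print(discounted)
```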

4.4. Math topics

You only need applied intuition, not proofs.

  • Linear algebra (intuitive)

    • Vectors and matrices (think of them as lists and tables).
    • Matrix multiplication (what it means, not full proofs).
    • Dot product and what “projection” roughly means.
  • Probability and statistics

    • Mean, median, variance, standard deviation.
    • What is a distribution (normal distribution as a common example).
    • Concept of randomness and sampling.
  • Calculus (high level)

    • What is a derivative: “rate of change”.
    • Why gradient descent is “moving in the direction that reduces loss”.
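That last idea can be shown numerically. A minimal sketch, minimizing f(x) = (x - 3)^2, whose derivative is 2(x - 3):

```python
# Minimize f(x) = (x - 3)**2 by gradient descent.
# The derivative f'(x) = 2 * (x - 3) tells us which way is "downhill".

def gradient_descent(start=0.0, lr=0.1, steps=50):
    x = start
    for _ in range(steps):
        grad = 2 * (x - 3)   # rate of change of the loss at x
        x = x - lr * grad    # step against the gradient
    return x

print(gradient_descent())    # converges toward 3, the minimum
```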

4.5. Practice tasks

Try these small tasks:

  1. CSV Explorer

    • Load a CSV using Pandas (any dataset from Kaggle).
    • Print:
      • First 10 rows.
      • Basic stats (mean, min, max).
    • Plot:
      • Histogram of one numeric column.
      • Bar chart of counts of one categorical column.
  2. Array playground

    • Create a few NumPy arrays.
    • Practice:
      • Slicing (subarrays).
      • Element wise operations.
      • Matrix multiplication.
  3. Mini calculator

    • Write a Python function that:
      • Takes a list of numbers.
      • Returns mean, variance, and standard deviation without using built in functions (manually implement them).
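One possible solution sketch for the mini calculator, written with explicit loops. Note it uses the population form of variance (divide by n); sample variance would divide by n - 1:

```python
def basic_stats(numbers):
    # Mean: add everything up by hand, divide by the count.
    total = 0.0
    count = 0
    for x in numbers:
        total += x
        count += 1
    mean = total / count

    # Variance: average squared distance from the mean (population form).
    squared_diff = 0.0
    for x in numbers:
        squared_diff += (x - mean) ** 2
    variance = squared_diff / count

    std = variance ** 0.5  # standard deviation is the square root
    return mean, variance, std

print(basic_stats([1, 2, 3, 4]))  # (2.5, 1.25, ~1.118)
```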

4.6. Phase 1 checklist

You are ready for Phase 2 if you can:

  • Load a CSV into Pandas and do basic cleaning (drop nulls, filter rows).
  • Plot simple graphs using Matplotlib or Seaborn.
  • Explain, roughly, what a vector, matrix, mean, variance, and derivative are.
  • Write Python functions without copy pasting from the internet for trivial things.

5. Phase 2: Core machine learning

5.1. Outcome

You can:

  • Take a dataset.
  • Train classical ML models.
  • Evaluate them.
  • Explain your choices.

5.2. Timeline

  • 4 to 6 weeks.

5.3. Core ML concepts (in plain words)

You should understand:

  • Supervised learning

    • Input → Output with labels.
    • Two main tasks:
      • Regression: predict numbers (price, temperature).
      • Classification: predict categories (spam or not spam).
  • Unsupervised learning

    • No labels.
    • Models try to find structure: groups, patterns.
  • Overfitting vs underfitting

    • Overfitting: Model learns noise and does great on training data but fails on new data.
    • Underfitting: Model is too simple and cannot capture real patterns.
  • Train / validation / test split

    • Train: Learn patterns.
    • Validation: Tune hyperparameters.
    • Test: Final check.
  • Evaluation metrics

    • For regression: MAE, MSE, RMSE.
    • For classification: accuracy, precision, recall, F1, confusion matrix.
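A quick way to build intuition for the classification metrics is to compute them on a toy prediction (assuming scikit-learn is installed):

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0, 1]   # actual labels
y_pred = [1, 0, 0, 1, 0, 1]   # predictions (one positive missed)

print(accuracy_score(y_true, y_pred))    # 5 of 6 correct
print(precision_score(y_true, y_pred))   # of predicted positives, how many were real
print(recall_score(y_true, y_pred))      # of real positives, how many were found
print(f1_score(y_true, y_pred))          # balance of precision and recall
print(confusion_matrix(y_true, y_pred))  # rows: true class, cols: predicted class
```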

5.4. Tools to use

Use scikit-learn as your main toolkit:

  • Models:
    • LinearRegression
    • LogisticRegression
    • DecisionTreeClassifier
    • RandomForestClassifier
    • RandomForestRegressor
  • Utilities:
    • train_test_split
    • StandardScaler
    • Pipeline

Later, also try:

  • XGBoost or LightGBM (boosted trees) for better performance on tabular data.

5.5. Suggested learning order

  1. Load a simple dataset from scikit-learn (for example, Iris).
  2. Train a logistic regression classifier.
  3. Evaluate accuracy and print the confusion matrix.
  4. Try a decision tree on the same data.
  5. Compare their performance and think about why one does better.
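The five steps above can be sketched as follows (assuming scikit-learn is installed):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Load a simple built in dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# 2-3. Logistic regression: train, evaluate, confusion matrix.
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)
print("logreg accuracy:", accuracy_score(y_test, log_reg.predict(X_test)))
print(confusion_matrix(y_test, log_reg.predict(X_test)))

# 4-5. Decision tree on the same split, for comparison.
tree = DecisionTreeClassifier(random_state=42)
tree.fit(X_train, y_train)
print("tree accuracy:", accuracy_score(y_test, tree.predict(X_test)))
```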

5.6. Mini projects (very important)

Build at least 3 small but complete notebooks or scripts:

Project A: House price prediction

  • Task: Predict house prices from features like size, rooms, location.
  • Steps:
    • Load a public housing dataset.
    • Explore distributions and clean data.
    • Train:
      • Linear Regression.
      • Random Forest Regressor.
    • Evaluate using RMSE and MAE.
    • Try different features and see how performance changes.

Project B: Customer churn classification

  • Task: Predict if a customer will leave a service.
  • Steps:
    • Use a telecom churn dataset (plenty on Kaggle).
    • Preprocess categorical variables (one hot encoding).
    • Train:
      • Logistic Regression.
      • Random Forest Classifier.
      • XGBoost (if comfortable).
    • Evaluate using:
      • Accuracy.
      • Precision, recall, F1.
    • Plot confusion matrix.

Project C: Customer segmentation with clustering

  • Task: Group customers into segments.
  • Steps:
    • Use KMeans from scikit-learn.
    • Choose a small set of numeric features.
    • Decide a value of k (for example, 3 or 4).
    • Visualize clusters using 2D plots (dimensionality reduction or pick 2 features).
    • Give each cluster a simple business friendly description.
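A minimal sketch of the clustering steps, on synthetic "customer" data generated so that three groups actually exist:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy customers: two numeric features (say, monthly spend and visits),
# drawn around three centers so there are real segments to find.
data = np.vstack([
    rng.normal(loc=[20, 2], scale=1.0, size=(30, 2)),
    rng.normal(loc=[60, 5], scale=1.0, size=(30, 2)),
    rng.normal(loc=[100, 9], scale=1.0, size=(30, 2)),
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)  # k = 3
labels = kmeans.fit_predict(data)

# One center per segment; describe each in business friendly terms,
# e.g. "low spend / rare visits" vs "high spend / frequent visits".
print(kmeans.cluster_centers_)
```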

5.7. Phase 2 checklist

You are ready for Phase 3 if:

  • You can explain the difference between regression and classification in your own words.
  • You have at least 3 GitHub repos or well documented notebooks with:
    • Problem statement.
    • Data description.
    • Model training code.
    • Metrics and simple discussion.
  • You can look at a confusion matrix and tell what is going on.
  • You are comfortable using scikit-learn for at least 3 different models.

6. Phase 3: Deep learning

6.1. Outcome

You can:

  • Build and train basic neural networks.
  • Use them for images or text.
  • Use pretrained models with transfer learning.

6.2. Timeline

  • 4 to 6 weeks.

6.3. Key concepts

Understand these intuitively:

  • Neuron and layer

    • A neuron takes numbers as input, multiplies by weights, adds bias, applies activation.
    • Layers are groups of neurons.
  • Activation functions

    • ReLU: passes positive values, zero otherwise.
    • Sigmoid: squashes values to 0 to 1.
    • Softmax: turns a vector into probabilities that sum to 1.
  • Loss functions

    • MSE for regression.
    • Cross entropy for classification.
  • Backpropagation and gradient descent

    • Neural net outputs are compared to true labels via loss.
    • Gradients tell how to adjust weights to reduce loss.
    • Optimizer (like SGD, Adam) updates weights step by step.
  • Architectures

    • Fully connected networks.
    • CNNs for images.
    • RNNs (or simple sequence models) for time series or sequences.
    • Transformers at a high level concept (will revisit in LLM phase).
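The three activation functions above are small enough to write out directly; a NumPy sketch:

```python
import numpy as np

def relu(x):
    # Passes positive values through, zeros out the rest.
    return np.maximum(0, x)

def sigmoid(x):
    # Squashes any real number into the range (0, 1).
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # Subtracting the max first keeps exp() numerically stable.
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([2.0, 1.0, -1.0])
print(relu(scores))      # [2. 1. 0.]
print(softmax(scores))   # probabilities that sum to 1
```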

6.4. Framework choice

Use PyTorch as your main deep learning framework.

Focus on:

  • Tensors and their shapes.
  • Building nn.Module based models.
  • Using DataLoader and Dataset.
  • Training loops:
    • Forward pass.
    • Compute loss.
    • Backward pass.
    • Optimizer step.
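The four training loop steps above can be sketched with a toy linear regression (assuming PyTorch is installed):

```python
import torch
from torch import nn

# Toy data: y = 2x + 1 with a little noise.
torch.manual_seed(0)
X = torch.linspace(-1, 1, 64).unsqueeze(1)     # 64 samples, 1 feature
y = 2 * X + 1 + 0.05 * torch.randn_like(X)

model = nn.Linear(1, 1)                        # one weight, one bias
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

for epoch in range(200):
    pred = model(X)                            # forward pass
    loss = loss_fn(pred, y)                    # compute loss
    optimizer.zero_grad()
    loss.backward()                            # backward pass
    optimizer.step()                           # optimizer step

print(model.weight.item(), model.bias.item())  # close to 2 and 1
```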

6.5. Mini projects

Project D: Image classification on CIFAR-10

  • Task: Classify images into classes (e.g., cat, dog, car).
  • Steps:
    • Use CIFAR-10 from torchvision datasets.
    • Build a small CNN from scratch.
    • Train for a few epochs and track training vs validation accuracy.
    • Then use a pretrained model (like ResNet) with transfer learning.
    • Compare results and training time.

Project E: Text sentiment classification

  • Task: Classify text as positive or negative.
  • Steps:
    • Use a public sentiment dataset (IMDB, tweets, etc.).
    • Preprocess text (basic tokenization).
    • First baseline:
      • Use bag of words (scikit-learn) + Logistic Regression.
    • Then:
      • Simple neural network (embedding + mean pooling + dense layers).
    • Compare the performance and complexity.

Project F: Tabular deep learning vs classic ML

  • Task: Use the same tabular dataset as in Phase 2.
  • Steps:
    • Implement a simple feedforward network in PyTorch.
    • Train and evaluate it.
    • Compare performance, training time, complexity with XGBoost or RandomForest.
    • Document when classical ML is better.

6.6. Phase 3 checklist

You are ready for Phase 4 if:

  • You can write a basic PyTorch training loop without heavily copy pasting.
  • You understand what a tensor shape is and can debug shape mismatches.
  • You have at least 2 deep learning projects (image, text, or tabular) in your portfolio.
  • You can explain in your own words why we do not always use deep learning for every problem.

7. Phase 4: LLMs and Generative AI applications

This is the most “visible” part of AI right now. Here you will learn to build real apps powered by large language models.

7.1. Outcome

You can:

  • Use LLM APIs to build chatbots and assistants.
  • Implement RAG (retrieval augmented generation).
  • Build simple tool using or agent style workflows with guardrails.
  • Ship small but real products or demos.

7.2. Timeline

  • 4 to 8 weeks to get solid foundations. Deeper mastery will continue beyond.

7.3. Concepts

Understand:

  • What is a large language model (LLM).
  • What is tokenization (text split into tokens).
  • What context window means.
  • Prompt and system prompt.
  • Temperature, top k, top p (controlling randomness).
  • Difference between:
    • Using pretrained LLM via API.
    • Fine tuning models.
    • Retrieval augmented generation (RAG) using external knowledge.
  • Basics of:
    • Latency.
    • Cost per request.
    • Rate limiting and quotas.

7.4. Tools and ecosystem

Learn to use:

  • An LLM API (choose any major one that is stable and documented).
  • A vector database:
    • Could be hosted like Pinecone or self hosted like Qdrant, Milvus, or pgvector.
  • An orchestration framework:
    • For example, LangChain or any similar library popular in the ecosystem.
  • Embedding models:
    • For converting text to vectors for similarity search.

7.5. LLM project ladder

Build these in order, each slightly harder than the previous one.

Project G: Simple chat interface

  • Task: A basic chat UI over an LLM.
  • Steps:
    • Create a small backend (FastAPI or similar).
    • Frontend:
      • Could be simple HTML or a React/Next.js app.
    • Implement:
      • Single text box.
      • Conversation history.
    • Add:
      • Basic error handling.
      • Logging of requests and responses to a file or database.

Goal: Understand API usage, latency, cost basics, and simple UX.
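The conversation history part of this project can be sketched without any framework. `fake_llm` below is a placeholder standing in for a real LLM API call, not a real SDK function; in a real app you would swap in your provider's client:

```python
# Minimal conversation-history loop. `fake_llm` is a stub for a real
# LLM API call; replace it with your provider's SDK in a real app.

def fake_llm(messages):
    last = messages[-1]["content"]
    return f"You said: {last}"

def chat_turn(history, user_text):
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)        # the full history gives the model context
    history.append({"role": "assistant", "content": reply})
    # Logging requests and responses, as suggested above
    print(f"user: {user_text!r} -> assistant: {reply!r}")
    return reply

history = []
chat_turn(history, "Hello!")
chat_turn(history, "What did I just say?")
```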

Project H: RAG based Q&A bot

  • Task: Bot that answers questions from a document set.
  • Steps:
    • Pick a knowledge base:
      • Your company docs (if allowed).
      • A technical book.
      • Any open documentation.
    • Pipeline:
      • Split documents into chunks.
      • Embed chunks using embedding model.
      • Store in vector DB.
      • At query time:
        • Embed the query.
        • Retrieve top k similar chunks.
        • Compose a prompt that includes those chunks as “context”.
        • Ask the LLM to answer using only that context.
    • Focus on:
      • Chunk size and overlap.
      • Good system prompt to reduce hallucinations.
      • Evaluating answer quality.

Goal: Learn a pattern that is heavily used in real world LLM products.
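The embed → store → retrieve pipeline above can be sketched end to end with a toy word-count "embedding". Real systems use a trained embedding model and a vector database; this only illustrates the retrieval mechanics:

```python
import math

chunks = [
    "Refunds are processed within 5 business days.",
    "Our support team is available on weekdays.",
    "Passwords can be reset from the account page.",
]

def embed(text, vocab):
    # Toy embedding: a word-count vector over a shared vocabulary.
    # Real systems use a trained embedding model instead.
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

vocab = sorted({w for c in chunks for w in c.lower().split()})
index = [(c, embed(c, vocab)) for c in chunks]   # stands in for the vector DB

def retrieve(query, k=1):
    q = embed(query, vocab)                      # embed the query
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]    # top k similar chunks

context = retrieve("how do refunds work")
print(context)  # these chunks would go into the LLM prompt as "context"
```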

Project I: Domain specific assistant

  • Task: Assistant specialized in one area.

    • Examples:
      • “AI coding helper for Python beginners.”
      • “AI assistant for email outreach.”
      • “AI note summarizer for students.”
  • Steps:

    • Reuse your RAG pattern where helpful.
    • Add tools:
      • For example, call an external API (weather, crypto, docs search).
    • Handle:
      • User authentication (even simple).
      • Rate limiting.
      • Basic usage analytics (how many calls, which features used).

Goal: Think like a product engineer, not just a demo maker.

Project J: Controlled agent workflow

  • Task: Build an agent style system with clear boundaries.
  • Steps:
    • Define a small set of tools:
      • Web search (if allowed).
      • File retrieval.
      • Simple calculator.
    • Implement:
      • An orchestrator that:
        • Allows the LLM to decide which tool to call.
        • Limits number of tool calls.
        • Times out gracefully.
    • Log:
      • Every step the agent takes.
      • Inputs and outputs of each tool.

Goal: Learn how to give models some autonomy while staying safe and predictable.
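A skeleton of such an orchestrator. The `decide` function here is a scripted stub standing in for "ask the LLM which tool to call next"; the point is the hard limit on tool calls and the per-step logging:

```python
# Tools the agent may call. In a real system the LLM chooses the tool;
# here a scripted `decide` stub keeps the control flow visible.

TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "lookup": lambda key: {"pi": "3.14159"}.get(key, "not found"),
}

MAX_TOOL_CALLS = 3

def decide(step):
    # Placeholder for "ask the LLM which tool to use next".
    plan = [("calculator", "2 + 3"), ("lookup", "pi"), (None, None)]
    return plan[min(step, len(plan) - 1)]

def run_agent():
    log = []
    for step in range(MAX_TOOL_CALLS):      # hard limit on tool calls
        tool, arg = decide(step)
        if tool is None:
            break                           # the model chose to stop
        result = TOOLS[tool](arg)
        log.append((tool, arg, result))     # log every step the agent takes
        print(f"step {step}: {tool}({arg!r}) -> {result!r}")
    return log

log = run_agent()
```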

7.6. Phase 4 checklist

You are ready to move to Phase 5 if:

  • You have at least:
    • 1 simple chat app.
    • 1 RAG system.
    • 1 domain specific assistant or agent style project.
  • You can explain in your own words:
    • What RAG is and why it is used.
    • How embeddings and vector search work conceptually.
    • What guardrails and constraints you added to your agent, and why.
  • You can roughly estimate cost and latency of your LLM features.

8. Phase 5: MLOps, deployment, and portfolio

Now you connect everything into real world style projects that you can show to others and use in interviews.

8.1. Outcome

You can:

  • Deploy AI services behind APIs.
  • Monitor their basic health.
  • Present your work as a coherent story: from problem to impact.

8.2. Timeline

  • Ongoing. Expect 1 to 3 months for first solid version.

8.3. Deployment basics

Learn to:

  • Containerize apps using Docker.
  • Use a simple cloud deployment (any platform you are comfortable with).
  • Serve models and LLM apps via REST APIs.
  • Store configuration and secrets safely (environment variables, secret managers).
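A minimal pattern for the last point, reading settings from environment variables instead of hardcoding them. The variable names here are illustrative, not a convention from any particular tool:

```python
import os

def load_config():
    """Read service settings from the environment instead of hardcoding them.
    Variable names are illustrative; pick your own convention."""
    api_key = os.environ.get("MYAPP_API_KEY")               # secret: never commit it
    model = os.environ.get("MYAPP_MODEL", "small-model")    # with a sensible default
    if api_key is None:
        raise RuntimeError("Set MYAPP_API_KEY before starting the service")
    return {"api_key": api_key, "model": model}

# Demo only; in real use, set the variable in your shell or deploy config.
os.environ.setdefault("MYAPP_API_KEY", "demo-key")
print(load_config())
```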

8.4. Monitoring and reliability

At minimum, track:

  • Latency of requests.
  • Error rates.
  • Usage metrics:
    • Number of requests.
    • Which endpoints are used.
    • Approximate token usage or compute cost.
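A minimal in-process version of these counters (a real service would export them to a proper monitoring system rather than keep them in memory):

```python
import time
from collections import defaultdict

class Metrics:
    """Tiny tracker for latency, errors, and per-endpoint request counts."""
    def __init__(self):
        self.latencies = []
        self.errors = 0
        self.requests = defaultdict(int)       # per-endpoint counts

    def record(self, endpoint, fn, *args):
        self.requests[endpoint] += 1
        start = time.perf_counter()
        try:
            return fn(*args)
        except Exception:
            self.errors += 1                   # count failures
            raise
        finally:
            self.latencies.append(time.perf_counter() - start)

metrics = Metrics()
result = metrics.record("/chat", lambda text: text.upper(), "hi")
print(result, metrics.requests["/chat"], metrics.errors)
```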

Later, you can add:

  • Model performance monitoring (for ML models).
  • Data drift detection (for production ML).

8.5. Portfolio building plan

Aim for 3 to 6 strong projects, not 50 tiny ones.

Recommended mix:

  1. End to end ML system

    • Data ingestion script.
    • Training pipeline.
    • Model artifact storage.
    • Serving API.
    • Monitoring dashboard (even a simple one with a charting tool).
  2. Productionized LLM app

    • Clear user facing interface (web app).
    • Backend with:
      • RAG or tools or both.
      • Auth and rate limiting.
    • Clear README that explains:
      • Problem.
      • Architecture.
      • Tech stack.
      • How to run locally.
  3. Vertical specific solution

    • Choose one domain:
      • Education, marketing, legal, sales, analytics, etc.
    • Build:
      • A focused assistant or automation that solves a real problem.
    • Show:
      • Before and after (what users did manually vs now with AI).
      • How you measured success (even qualitatively).

8.6. Portfolio checklist

Before you start applying seriously, aim for:

  • GitHub profile with:
    • 3 to 6 pinned repositories.
    • Each with clean structure and README.
  • At least 1 deployed, publicly reachable demo.
  • A short document (or personal site) with:
    • Who you are.
    • Your roadmap progress.
    • Links to projects.
    • One paragraph per project explaining:
      • The problem.
      • Your approach.
      • The outcome.

9. Daily and weekly routines that actually work

9.1. Daily 90 minute learning block

Template you can reuse:

  1. 20 minutes: read or watch theory (one topic).
  2. 40 minutes: code a minimal example related to that topic.
  3. 20 minutes: modify it in a new way (change parameters, architecture, dataset).
  4. 10 minutes: document what you did (markdown note or README update).

9.2. Weekly rhythm

For each week:

  • Pick 1 main concept (e.g., logistic regression, CNNs, RAG, Docker).
  • Pick 1 mini project or extension to implement that uses this concept.
  • By Sunday:
    • Push all code to GitHub.
    • Write a short reflection:
      • What you learned.
      • What confused you.
      • What you want to try next week.

This small loop keeps you moving forward even when life is busy.


10. Common pitfalls and how to avoid them

10.1. “I will learn all the math first”

  • Problem:
    • You disappear into textbooks for months and never write ML code.
  • Solution:
    • Limit pure math study to 20 to 30 percent of your time.
    • Tie math to specific ML concepts you are learning that week.

10.2. “Tutorial hell”

  • Problem:
    • You follow dozens of tutorials, copy paste code, and cannot build anything from scratch.
  • Solution:
    • For every tutorial:
      • Build a small variation without looking.
      • Change the dataset, model, or objective.

10.3. “I only care about cool models, not boring deployment”

  • Problem:
    • You know a lot about architectures but cannot ship a working app.
  • Solution:
    • For any non trivial model you train, ask:
      • How would I deploy this so a user can call it with an API?
      • How would I know if it is failing?

10.4. “I am waiting to feel ready before I apply”

  • Problem:
    • You delay interviews forever, thinking you need to know everything.
  • Solution:
    • Once you have:
      • A few ML projects.
      • A few LLM apps.
      • One deployed demo.
    • Start applying. You grow faster when you prepare for and take interviews.

11. If you feel like a “dummy”

Here is the honest truth:

  • Most people who are “good at AI” today also felt lost in the beginning.
  • The field looks intimidating from the outside because:
    • A lot of people flex heavy math.
    • There is constant hype and jargon.

You do not need to be a genius. You need:

  • Consistency.
  • Good direction.
  • Willingness to break things and learn from it.

If you follow this guide:

  • Move from Phase 0 to Phase 5.
  • Build the suggested projects.
  • Keep iterating for 6 to 12 months.

you will be far ahead of most people who only “watch AI content” but never build.


12. Your immediate next 7 days (starter plan)

Here is a concrete plan to start this week.

Day 1 to 2

  • Set up:
    • Python, VS Code, Git.
    • One AI coding assistant.
  • Write:
    • A “hello world” Python script.
    • A script that reads a CSV and prints the first 5 rows.

Day 3 to 4

  • Learn:
    • Basic NumPy and Pandas operations.
  • Do:
    • Load a small dataset.
    • Calculate mean, max, min of a numeric column.
    • Plot a histogram and a scatter plot.

Day 5 to 6

  • Learn:
    • What supervised learning is.
    • How to use train_test_split and a simple model like LinearRegression.
  • Do:
    • Train your first regression model.
    • Compute MAE and RMSE.
    • Try changing the train/test ratio and see what happens.
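The Day 5 to 6 tasks above can be sketched on synthetic data (assuming scikit-learn is installed), so you can focus on the workflow before worrying about a real dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: y = 3x + 5 plus noise, so the "right answer" is known.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X.ravel() + 5 + rng.normal(0, 1, 200)

# Try changing test_size and watch how the metrics move.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

mae = mean_absolute_error(y_test, pred)
rmse = mean_squared_error(y_test, pred) ** 0.5
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}")  # both near the noise level
```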

Day 7

  • Create:
    • A new GitHub repo called something like ml-journey-week-1.
  • Add:
    • All scripts and notebooks from this week.
    • A README explaining:
      • What you tried.
      • What you learned.
      • What you want to learn next week.

Repeat this weekly structure while following the phases above, and you will steadily transform from “dummy” to capable AI developer.


Ready to Crack the Interview?

Join our community of 5k+ engineers now.

Coding Adda © 2026 • Made with Precision