Gradient Boosting

Ankit Tomar, July 1, 2025

As we continue our journey into ML algorithms, in this post, we’ll go deeper into gradient boosting — how it works, what’s happening behind the scenes mathematically, and why it performs so well.


🌟 What is gradient boosting?

Gradient boosting is an ensemble method where multiple weak learners (usually shallow decision trees) are combined sequentially. Each new tree corrects the errors (residuals) of the combined previous trees.


🧠 How does it actually work?

  1. Initial prediction: Start with a simple model, like predicting the mean target value.
  2. Compute residuals: Find the difference between true values and current predictions.
  3. Fit a new tree: Train a tree to predict these residuals (i.e., the model’s mistakes).
  4. Update: Add this new tree’s output to the current prediction, scaled by a learning rate.
  5. Repeat: Build many such trees iteratively.

The final prediction is the sum of all trees.
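The five steps above can be sketched in a few lines of plain NumPy. This is an illustrative toy, not production code: the weak learner is a hand-rolled depth-1 "stump" (the helper names `fit_stump` and `predict_stump` are mine), and the loss is squared error.

```python
import numpy as np

# Toy 1-D regression problem: learn y = x^2.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = x ** 2

def fit_stump(x, r):
    """Depth-1 tree: best single threshold minimizing squared error on r."""
    best_sse, best = np.inf, None
    for t in np.unique(x)[1:]:
        left, right = r[x < t], r[x >= t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, (t, left.mean(), right.mean())
    return best

def predict_stump(stump, x):
    t, left_val, right_val = stump
    return np.where(x < t, left_val, right_val)

pred = np.full_like(y, y.mean())                     # step 1: start from the mean
learning_rate = 0.1
stumps = []
for _ in range(100):                                 # step 5: repeat
    residuals = y - pred                             # step 2: current errors
    stump = fit_stump(x, residuals)                  # step 3: weak learner on residuals
    pred += learning_rate * predict_stump(stump, x)  # step 4: scaled update
    stumps.append(stump)

mse_start = np.mean((y - y.mean()) ** 2)
mse_final = np.mean((y - pred) ** 2)
```

Each round nudges the ensemble toward the remaining errors, so `mse_final` ends up far below the mean-only baseline `mse_start`.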


🧮 Why is it called “gradient” boosting?

At each step, instead of just predicting residuals, the algorithm fits to the negative gradient of the loss function (how error changes as predictions change). This is a form of numerical optimization: we take steps in the direction that most quickly reduces error.

For example, with mean squared error (MSE):

  • The negative gradient is simply the residuals (actual – predicted).
  • But for log loss (classification), the gradient is different.

This makes gradient boosting very flexible — it can optimize almost any differentiable loss function.
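A quick numerical check of this point, assuming squared error and binary log loss (the arrays here are made-up toy values):

```python
import numpy as np

y = np.array([3.0, -1.0, 2.5])
f = np.array([2.0, 0.5, 2.0])            # current model predictions

# Squared-error loss per point: L(f) = 0.5 * (y - f)^2.
# Analytically dL/df = -(y - f), so the negative gradient IS the residual.
neg_grad = y - f

# Sanity check against a finite-difference gradient of the loss.
eps = 1e-6
loss = lambda f_: 0.5 * (y - f_) ** 2
num_grad = (loss(f + eps) - loss(f - eps)) / (2 * eps)
print(np.allclose(-num_grad, neg_grad))  # True

# For binary log loss the "residual" changes: with labels in {0, 1} and raw
# scores f, the negative gradient is y - sigmoid(f), not y - f.
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
labels = np.array([1.0, 0.0, 1.0])
pseudo_residuals = labels - sigmoid(f)
```

Swapping the loss function only changes how the pseudo-residuals are computed; the boosting loop itself stays identical.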


✏️ How does it pick the best split in each tree?

When building each tree:

  • For each feature and threshold, it computes how much splitting at that point reduces the chosen loss (e.g., MSE or log loss).
  • It picks the split with the highest improvement.

Efficient calculation: Libraries like XGBoost and LightGBM use clever tricks (histograms, sampling) to make this faster even with large datasets.
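As a rough sketch of the histogram idea (not the actual XGBoost/LightGBM implementation): bin the feature once, then score every bin boundary using only per-bin sums, so the scan is O(bins) rather than O(rows) per candidate threshold.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 1000)
r = (x > 4).astype(float) + rng.normal(0, 0.1, 1000)  # residuals with a step at x = 4

# Bucket the feature into quantile bins.
n_bins = 32
edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
bins = np.digitize(x, edges)                              # bin index 0..n_bins-1 per row

cnt = np.bincount(bins, minlength=n_bins).astype(float)
s1 = np.bincount(bins, weights=r, minlength=n_bins)       # per-bin sum of residuals
s2 = np.bincount(bins, weights=r ** 2, minlength=n_bins)  # per-bin sum of squares

def sse(sum_sq, sum_, n):
    # Sum of squared errors around the mean: sum r^2 - (sum r)^2 / n
    return sum_sq - sum_ ** 2 / n

# Prefix sums give left/right statistics for every boundary at once.
cl, sl1, sl2 = np.cumsum(cnt)[:-1], np.cumsum(s1)[:-1], np.cumsum(s2)[:-1]
gain = (sse(s2.sum(), s1.sum(), cnt.sum())
        - sse(sl2, sl1, cl)
        - sse(s2.sum() - sl2, s1.sum() - sl1, cnt.sum() - cl))

best = int(np.argmax(gain))
best_threshold = edges[best]   # should land near the true step at x = 4
```

The gain vector scores all 31 candidate boundaries in one vectorized pass, and the argmax recovers a threshold close to the step we planted at x = 4.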


📐 Formulas that help in interviews

Gini impurity (where pᵢ is the fraction of samples of class i at a node):

Gini = 1 − Σᵢ pᵢ²

Entropy:

H = −Σᵢ pᵢ log₂ pᵢ

In regression, the typical objective is to minimize mean squared error:

MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

And the negative gradient of MSE tells us how to adjust predictions to reduce this error: it is simply the residual yᵢ − ŷᵢ.
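Plugging a toy class distribution and a toy prediction pair into these formulas:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])          # class proportions at a node

gini = 1.0 - np.sum(p ** 2)            # Gini = 1 - sum(p_i^2)       -> 0.62
entropy = -np.sum(p * np.log2(p))      # H = -sum(p_i * log2(p_i))   -> ~1.485

# Regression objective on a toy prediction:
y_true = np.array([3.0, 1.0, 2.0])
y_pred = np.array([2.5, 1.5, 2.0])
mse = np.mean((y_true - y_pred) ** 2)  # ((0.5)^2 + (0.5)^2 + 0) / 3 -> ~0.167
```

A pure node (one class) scores 0 on both Gini and entropy; the values above reflect a fairly mixed node.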


⚙️ Why is gradient boosting powerful?

  • Focuses learning on hard-to-predict data.
  • Works with different loss functions.
  • Builds complex nonlinear models.
  • Can handle numerical and categorical data.

But it can overfit, so tuning is essential.


🛡️ How to control overfitting

  • Reduce tree depth.
  • Use lower learning rate.
  • Add subsampling (random rows or columns).
  • Add explicit regularization (e.g., L2 penalties on leaf weights, as in XGBoost).
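Putting these knobs together with scikit-learn's GradientBoostingRegressor (assuming scikit-learn is installed; the parameter values are illustrative starting points, not recommendations):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.1, 500)

model = GradientBoostingRegressor(
    max_depth=3,         # shallow trees keep each learner weak
    learning_rate=0.05,  # smaller steps need more trees but generalize better
    n_estimators=300,
    subsample=0.8,       # row subsampling ("stochastic" gradient boosting)
    max_features=0.8,    # column subsampling per split
    random_state=0,
)
model.fit(X, y)
```

In practice these hyperparameters interact (a lower learning rate usually wants more estimators), so tune them jointly with cross-validation rather than one at a time.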

We will discuss XGBoost, CatBoost, and LightGBM in upcoming blogs.
