3. Validating a Machine Learning Model: Why It Matters and How to Do It Right

Ankit Tomar, June 20, 2025

Validating a machine learning model is one of the most critical steps in the entire ML lifecycle. After all, you want to be sure your model is doing what it’s supposed to—performing well, generalizing to new data, and delivering real-world business impact.

In this post, let’s explore what model validation really means, the types of datasets involved, how to evaluate performance both statistically and in a business context, and the common pitfalls to avoid.


Three Key Data Types in ML Model Training

Before jumping into validation techniques, it’s important to understand how data is split for model development:

1. Training Set

This is the dataset used to train the model. The algorithm learns patterns from this data by adjusting its internal parameters (like weights). In supervised learning, this includes both features (input variables) and target labels (the outcome you want to predict).

2. Validation Set

This is typically a portion of the training data held out during development to tune hyperparameters or choose between candidate models. Techniques like k-fold cross-validation rotate which portion plays this role. The model has seen similar data, but it is never fitted on this exact subset.
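For instance, a minimal k-fold cross-validation sketch (assuming scikit-learn; the synthetic dataset, model, and scoring choice below are purely illustrative) might look like this:

```python
# A minimal k-fold cross-validation sketch, assuming scikit-learn.
# The dataset, model, and scoring choices are illustrative.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=1000, n_features=20, noise=0.3, random_state=42)
model = Ridge(alpha=1.0)

# 5-fold CV: each fold takes a turn as the validation set
# while the model is trained on the remaining four folds.
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("R² per fold:", scores)
print("Mean R²:", scores.mean())
```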

3. Test Set

This dataset is completely new to the model: it is never seen during training or validation. It simulates how the model will perform in the real world and is used to evaluate final performance. The numbers you measure here are the ones to report as the model’s true accuracy and generalization.
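As a concrete sketch of the three splits (assuming scikit-learn and a synthetic classification dataset; the split ratios are just an example):

```python
# A rough sketch of a train/validation/test split, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Hold out a test set (20%) that the model never sees during development.
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Split the remainder into training and validation sets
# (0.25 of the remaining 80% = 20% of the original data).
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```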


How to Evaluate a Model’s Performance

There are two primary lenses through which to assess a machine learning model:

1. Statistical Performance

This focuses on metrics that depend on the type of ML problem you’re solving.

  • For regression tasks (predicting a continuous value):
    Use metrics like R² score, Mean Absolute Error (MAE), or Mean Absolute Percentage Error (MAPE).
  • For classification tasks (predicting categories):
    Use metrics like Accuracy, Precision, Recall, F1-Score, or AUC-ROC.

These metrics help you understand how well the model fits the training data and how it might generalize to unseen data.
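To make these concrete, here is a small sketch (assuming scikit-learn; the true and predicted values are placeholders, not real model output):

```python
# Illustrative computation of the metrics above, assuming scikit-learn.
# The true/predicted values are placeholders, not real model output.
from sklearn.metrics import (
    r2_score, mean_absolute_error, mean_absolute_percentage_error,
    accuracy_score, precision_score, recall_score, f1_score, roc_auc_score,
)

# Regression metrics
y_true_reg = [3.0, 5.0, 2.5, 7.0]
y_pred_reg = [2.8, 5.4, 2.9, 6.1]
print("R²:  ", r2_score(y_true_reg, y_pred_reg))
print("MAE: ", mean_absolute_error(y_true_reg, y_pred_reg))
print("MAPE:", mean_absolute_percentage_error(y_true_reg, y_pred_reg))

# Classification metrics
y_true_clf = [0, 1, 1, 0, 1, 0]
y_pred_clf = [0, 1, 0, 0, 1, 1]
y_prob_clf = [0.2, 0.9, 0.4, 0.3, 0.8, 0.6]  # predicted probability of class 1
print("Accuracy: ", accuracy_score(y_true_clf, y_pred_clf))
print("Precision:", precision_score(y_true_clf, y_pred_clf))
print("Recall:   ", recall_score(y_true_clf, y_pred_clf))
print("F1:       ", f1_score(y_true_clf, y_pred_clf))
print("AUC-ROC:  ", roc_auc_score(y_true_clf, y_prob_clf))
```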

2. Business Performance

This is where many data scientists miss the mark.

Statistical performance is important, but business performance is what truly matters. Your model should improve something that the business cares about—be it revenue, conversion rate, delivery time, or customer churn.

If your model improves R² score but doesn’t lead to better business outcomes, then it’s just good math—not a good product.


Aligning Metrics with Business Goals

It’s crucial to have this discussion early in the project, ideally during the scoping phase:

  • What is the goal of this model?
  • How will we measure success?
  • What are the key metrics that both the data science team and the business team can align on?

Example:
If you’re building a model to increase customer retention, then a good statistical metric might be ROC-AUC, but a strong business metric would be an actual increase in retention rate over X months.


Real-World Tip: Metrics Can Evolve

Sometimes, the metric you start with isn’t the right one after all.

During model development, you might discover that the selected metric doesn’t reflect the business needs or the actual model behavior. It’s perfectly okay to refine or update your metrics—just make sure you’re not doing it to force a better result, but to reflect better understanding.

Example from my own experience:
We once deployed a model with a high R² score. Everything looked great on paper. But six months later, business KPIs started to decline. Upon closer inspection, we realized we were optimizing R² but ignoring error variance, so we added MAPE as an additional metric. This helped surface hidden issues and bring the model back in line with business goals.
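Here is a toy illustration of how that can happen (fabricated numbers, not the actual production data): when a few large targets dominate, R² can stay near 1 while MAPE exposes big relative errors on the small values.

```python
# Toy illustration (fabricated numbers, not the production model):
# R² looks excellent, yet MAPE reveals large relative errors on small targets.
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_percentage_error

y_true = np.array([2.0, 4.0, 6.0, 100.0, 200.0, 300.0])
y_pred = np.array([4.0, 8.0, 3.0, 102.0, 198.0, 305.0])  # small values far off, large ones close

print("R²:  ", round(r2_score(y_true, y_pred), 3))                        # ~0.999
print("MAPE:", round(mean_absolute_percentage_error(y_true, y_pred), 3))  # ~0.42 (42%)
```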


Post-Deployment: Keep Tracking

The work doesn’t end when the model goes live.

Models in production face data drift and concept drift. The input data may change, or the target variable may evolve over time. It’s important to continue monitoring metrics—both statistical and business—on an ongoing basis.
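One lightweight way to watch for data drift is to compare the distribution of each input feature in production against the training data. A sketch of this idea, assuming SciPy (the feature samples and the significance threshold below are illustrative):

```python
# Sketch of a simple data-drift check, assuming SciPy.
# The feature samples and the threshold below are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)  # distribution at training time
live_feature = rng.normal(loc=0.4, scale=1.2, size=5000)   # recent production data (shifted)

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests the
# production distribution differs from the training distribution.
stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:
    print(f"Possible data drift (KS statistic={stat:.3f}, p={p_value:.2e})")
else:
    print("No significant drift detected")
```

Concept drift, where the relationship between inputs and the target changes, is harder to detect directly and usually shows up as a decline in the statistical and business metrics you are already tracking.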


In Summary

  • Understand your data splits: training, validation, test.
  • Evaluate from two angles: statistical accuracy and business impact.
  • Choose the right metrics: align them with your project’s goal.
  • Be flexible: refine metrics as your understanding improves.
  • Keep monitoring: post-deployment tracking is a must.

Machine learning is not just about building great models—it’s about building models that make a difference.

Next time you’re working on a project, ask yourself:
What business value will this model create, and how will I measure it?
