Solving a Puzzle - Machine Learning Algorithm Selection
A helpful note on choosing the right learning algorithm!
There are many types of machine learning models, from linear and tree-based models to ensembles and neural networks. Knowing which model to pick when approaching a given problem can be a battle.
In this write-up, I want to share some takeaways that can hopefully shorten your learning curve in modeling. While I will list different factors to consider, model selection is a "no free lunch" scenario: no model is guaranteed to solve a problem until you have tried and evaluated different models.
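To make "try and evaluate" concrete, here is a minimal sketch in plain Python. The toy data, the two candidate models, and the helper names (`fit_mean`, `fit_line`, `mean_abs_error`) are all made up for illustration. The point is only the loop: fit each candidate on training data, score it on held-out data, and compare.

```python
import random

# Toy data (hypothetical): y is roughly 3*x + 2 with a little noise.
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [3 * x + 2 + random.gauss(0, 0.5) for x in xs]

# Hold out every fifth point for evaluation.
pairs = list(zip(xs, ys))
train = [p for i, p in enumerate(pairs) if i % 5 != 0]
test = [p for i, p in enumerate(pairs) if i % 5 == 0]


def fit_mean(data):
    """Baseline: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_line(data):
    """Least-squares line y = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_abs_error(model, data):
    """Average absolute gap between predictions and true targets."""
    return sum(abs(model(x) - y) for x, y in data) / len(data)


candidates = {"mean baseline": fit_mean, "linear fit": fit_line}
scores = {name: mean_abs_error(fit(train), test) for name, fit in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: held-out MAE = {score:.3f}")
print("best on held-out data:", best)
```

In a real project the candidates would be actual learning algorithms from a library, but the shape of the comparison stays the same.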
The first thing to consider is the scope of the project. Let's unpack it and we will see other considerations along the way.
1. The scope of the project at hand
This is the first thing to think about when choosing a learning algorithm, and some of it may be obvious. For example, if you are going to build a face recognition application that can detect and recognize the people entering a building, you are not going to look into logistic regression. Neural networks, and Convolutional Neural Networks specifically, have proven effective on vision problems and will probably be your primary choice for a detection task like this.
On the other hand, if a local housing agency asks you to build a model that can predict the price of a house given its properties (size, region, number of bedrooms, etc.) for their 300 units, you would probably take a step back and look into linear models or perhaps other heuristic methods.
2. The size of the dataset
The bigger the dataset, the more likely you are to turn to complex models such as ensembles or neural networks. With a small dataset, on the other hand, linear models can give you good results without having to spend time juggling complex algorithms.
To understand the size of your dataset, you may consider the number of examples or data points, and the number of features.
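As a small illustration, the sketch below counts both quantities for a tiny table of housing rows. The rows and the helper name `dataset_shape` are hypothetical, and the examples-per-feature ratio is only a rough signal, not a rule.

```python
# Hypothetical housing rows: (size_m2, bedrooms, region_code, price)
rows = [
    (70, 2, 1, 150_000),
    (120, 4, 2, 310_000),
    (55, 1, 1, 98_000),
    (95, 3, 3, 240_000),
]


def dataset_shape(rows, n_targets=1):
    """Return (number of examples, number of input features)."""
    n_examples = len(rows)
    # Every column except the target column(s) counts as a feature.
    n_features = len(rows[0]) - n_targets if rows else 0
    return n_examples, n_features


n_examples, n_features = dataset_shape(rows)
print(f"{n_examples} examples, {n_features} features")
print(f"examples per feature: {n_examples / n_features:.1f}")
```

A handful of examples per feature, as in this toy table, is the kind of situation where a simple model is usually the safer starting point.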
3. Beyond size, the type of the dataset
If you are dealing with a linear dataset, that is, a dataset in which there is a linear relationship between the predictors (input features) and the output, then linear models such as linear or logistic regression can do the trick. Otherwise, if you have a nonlinear dataset, neural networks can be a good choice.
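One rough way to check for a linear relationship is to fit a straight line and see how much of the variation it explains. The sketch below does this in plain Python on two synthetic datasets; the data and the helper names (`fit_line`, `r_squared`) are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Share of the variance in ys explained by the best straight line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [x / 10 for x in range(-50, 51)]   # -5.0 ... 5.0
linear_ys = [2 * x + 1 for x in xs]     # straight-line relationship
curved_ys = [x ** 2 for x in xs]        # symmetric curve, no linear trend

print(f"R^2 on linear data: {r_squared(xs, linear_ys):.3f}")
print(f"R^2 on curved data: {r_squared(xs, curved_ys):.3f}")
```

The line explains essentially all of the first dataset and essentially none of the second, which is the signal that a linear model is the wrong tool for the curved one.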
4. The level of model interpretability
If you want the results of your learning algorithm to be explainable, neural networks may not be part of your selection. This is not to say that neural nets are not explainable, but the deeper you go, the harder it becomes to explain how they do what they do.
5. Training time
You guessed it. Complex models, like neural networks and ensemble models, can take a long time to train. It can be a fraction of an hour, multiple hours, days, weeks, months, or even years. How long it takes will also depend on the capacity of your machine.
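Before committing to a model, it can help to simply measure how long training takes on a sample of your data. The sketch below times two hypothetical "training" routines in plain Python: a direct formula, and an iterative stand-in for a more expensive learner. The function names and the toy data are assumptions made for the example.

```python
import time


def timed(train_fn, *args):
    """Run train_fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = train_fn(*args)
    return result, time.perf_counter() - start


def train_closed_form(xs, ys):
    """'Simple' model: slope of a line through the origin, solved directly."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def train_gradient_descent(xs, ys, steps=2000, lr=0.01):
    """Stand-in for an iterative learner: same slope, found by many passes."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


xs = [i / 1000 for i in range(1, 1001)]   # 0.001 ... 1.0
ys = [4 * x for x in xs]                  # true slope is 4

w_fast, t_fast = timed(train_closed_form, xs, ys)
w_slow, t_slow = timed(train_gradient_descent, xs, ys)
print(f"closed form:      w={w_fast:.3f} in {t_fast:.4f}s")
print(f"gradient descent: w={w_slow:.3f} in {t_slow:.4f}s")
```

Both routines land on nearly the same answer, but the iterative one makes 2,000 passes over the data to get there. Exact timings will vary from machine to machine.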
To give you an example of how shocking this can be, GPT-3, a recent state-of-the-art model in Natural Language Processing, would take an estimated 355 years to train on a single Tesla V100, one of the fastest GPUs on the market. It would also cost about $4.6M to train using the cheapest cloud provider. If you are interested in learning more about GPT-3, read its technical review here.
What that says is that the deeper you go, the longer and more expensive training your model becomes.
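A quick back-of-the-envelope calculation puts the GPT-3 figures quoted above in perspective. It assumes that the 355 years and the $4.6M describe the same training run, and that work splits perfectly across GPUs, which real training does not quite achieve. The 30-day target is an arbitrary number picked for illustration.

```python
# Figures quoted above: 355 years on one GPU, $4.6M total cost.
gpu_years = 355
total_cost_usd = 4_600_000

HOURS_PER_YEAR = 365 * 24  # ignoring leap years

gpu_hours = gpu_years * HOURS_PER_YEAR
implied_rate = total_cost_usd / gpu_hours  # dollars per GPU-hour


def gpus_needed(target_days, gpu_years=gpu_years):
    """GPUs required to finish in target_days, assuming perfect scaling."""
    return gpu_years * 365 / target_days


print(f"total GPU-hours: {gpu_hours:,}")
print(f"implied price: ${implied_rate:.2f} per GPU-hour")
print(f"GPUs to finish in 30 days: {gpus_needed(30):,.0f}")
```

Under these assumptions, finishing in a month would take several thousand GPUs running in parallel, which is why models at this scale are out of reach for most teams.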
Resources to Learn more
I would like to end this article by sharing some great, free resources that I found helpful for understanding the machine learning project cycle.
- Machine Learning Engineering Book - Andriy Burkov
- Machine Learning Yearning Book - Andrew Ng
- The People + AI Guidebook for AI and Interpretability
I actively share some ML tips and best practices on my social channels, Twitter and LinkedIn. If you would like to connect/chat, you're welcome. Every day, I share one or two things that you may find insightful.
Thank you. Talk to you next week!