I like this roadmap from Daniel Bourke’s YouTube video.
The roadmap can be accessed here.

I think this roadmap provides a good overview of how to learn ML: a one-stop shop for everything. It covers a basic explanation of ML, the available tools and resources, and links to the maths behind them if required.

I also like how the video encourages you to approach the learning as a “cook” rather than a “chemist”. Start small. Go step by step. Learn by doing rather than trying to understand everything in detail first.

Quick overview

Basically, machine learning gives a machine the ability to learn without being explicitly programmed. It is useful for:

  1. Problems with long lists of rules
    Eg: it would be too complex to hand-code every rule for a self-driving car with traditional programming

  2. Continually changing environment
    Eg: a self-driving car can adapt when there is a new road or traffic sign

  3. Discovering insights within large collections of data
    Eg: if you are Amazon, it would be too much to go through every single transaction manually

ML problems

Types of learning for ML:

  1. Supervised learning
    You have data and labels. The model tries to learn the relationship between the data and the labels.

  2. Unsupervised learning
    You have data but no labels. The model tries to find patterns in the data without any labels to refer to.

  3. Transfer learning
    Take an existing ML model, adjust it using your own data, and use it for your own problem.

  4. Reinforcement learning
    An agent performs actions and is rewarded or penalised depending on whether each action is favourable.
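
To make the supervised case concrete, here is a minimal sketch of "data and labels" using a 1-nearest-neighbour classifier in plain Python. The features (number of links, number of exclamation marks) and the toy emails are made up for illustration; they are not from the roadmap.

```python
import math

# Hypothetical toy data: each email is (number of links, number of exclamation marks).
train_data = [(0, 1), (1, 0), (8, 6), (7, 9)]
train_labels = ["not spam", "not spam", "spam", "spam"]


def predict(point, data, labels):
    """Return the label of the training example closest to `point`."""
    distances = [math.dist(point, example) for example in data]
    nearest = distances.index(min(distances))
    return labels[nearest]


print(predict((6, 7), train_data, train_labels))  # an email that looks like the spam examples
print(predict((1, 1), train_data, train_labels))  # an email that looks like the normal examples
```

The "learning" here is simply remembering the labelled examples. Real models generalise in more sophisticated ways, but the inputs are the same: data plus labels.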

ML problem domains:

  1. Classification
    The model learns from a training dataset how to best map each input to one of a fixed set of outputs/labels.
    Eg: classify a mail as spam or not spam

  2. Regression
    The model identifies the relationship between the dependent and independent variables in order to predict a number.
    Eg: predict the price of a stock over time

  3. Sequence-to-sequence
    The model turns one sequence into another. Commonly used for language translation.
    Eg: Given a sequence in English, translate it to Spanish.

  4. Clustering
    Typically an unsupervised problem, where the model groups data points based on similarity.
    Eg: Group soccer players based on their attributes (striker/defender/goalkeeper etc)

  5. Dimensionality reduction
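    If you have many input variables (say 100), reduce them to a much smaller set (say 10) that keeps most of the information.
    Eg: by using PCA (principal component analysis), which combines the original variables into a few new components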
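
As a rough illustration of the clustering domain above, here is a small k-means sketch in plain Python. The player numbers (goals per season, tackles per game) are invented, and starting from the first k points is my own simplification to keep the run repeatable. Note that no labels are given; the model only sees the attributes.

```python
import math


def k_means(points, k, iterations=10):
    """Group `points` into `k` clusters; returns one cluster index per point."""
    # Assumption: use the first k points as starting centroids (keeps the run deterministic).
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments


# Hypothetical players: (goals per season, tackles per game).
players = [(20, 1), (2, 6), (18, 2), (1, 7), (22, 1), (3, 5)]
print(k_means(players, k=2))  # → [0, 1, 0, 1, 0, 1]
```

The two groups happen to line up with "scores a lot" and "tackles a lot", but it is up to us to interpret them as strikers and defenders; the algorithm only knows which points sit close together.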

ML process

First, you need to collect some data. It is important here to recognise the type of data you need. Data can be structured data in a table (eg: categorical, numerical, ordinal data etc) or unstructured data (eg: images, speech etc). Remember, rubbish in, rubbish out.

Second, you need to prepare the data. Typical data preparation steps:

The third step is to train the model.

Next, we evaluate the model based on available metrics. There are several considerations here:

Once we are confident, we can serve the model. We will not know how it really performs until we put it out in the real world. The tools differ depending on whether the final goal is a mobile app or a web-based application.

Finally, we need to continue evaluating the model and retrain it if needed. The model may need to change if the data source has changed (such as a new road) or has been upgraded (such as new hardware being used).
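
The collect, prepare, train and evaluate steps above can be sketched end to end in a few lines of plain Python. This is only a toy under my own assumptions: the house data is invented, the "preparation" is just holding back two rows for testing, the model is a least-squares straight line, and the metric is mean squared error.

```python
# Collect: hypothetical (house size in m^2, price in $1000s) pairs.
data = [(50, 152), (60, 178), (70, 213), (80, 238), (90, 272), (100, 299)]

# Prepare: hold back the last two rows as a test set (an assumed, very simple split).
train, test = data[:4], data[4:]


def fit_line(rows):
    """Train: ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in rows) / sum(
        (x - mean_x) ** 2 for x, _ in rows
    )
    return slope, mean_y - slope * mean_x


def mean_squared_error(rows, slope, intercept):
    """Evaluate: average squared gap between predictions and true values."""
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)


slope, intercept = fit_line(train)
mse = mean_squared_error(test, slope, intercept)
print(f"price = {slope:.2f} * size + {intercept:.2f}, test MSE = {mse:.2f}")
```

The same `mean_squared_error` check can be rerun on fresh data after the model is served; if the error creeps up, that is a sign it is time to retrain.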

What have I learnt from this roadmap?

Good news: I know the basics. I have used Python before and am familiar with the libraries. I guess I am closer to intermediate in ML. I have watched, read, or done tutorials covering some of these ML algorithms.

I guess I will start here:

To do lists (probably in this order):