I like this roadmap from Daniel Bourke’s YouTube video.
The roadmap can be accessed here.
I think this roadmap provides a good overview of how to learn ML. It is a one-stop shop: it covers a basic explanation of ML, the available tools and resources, and links to the maths behind them if required.
I also like how the video encourages you to approach the learning as a “cook” rather than a “chemist”. Start small. Step-by-step. Learn by doing rather than understanding everything in detail.
Quick overview
Basically, machine learning gives a machine the ability to learn without being explicitly programmed.
Problems with long lists of rules
Eg: programming a self-driving car with traditional, hand-written rules would be extremely complex
Continually changing environment
Eg: a self-driving car can adapt if there is a new road or traffic sign
Discovering insights within large collections of data
Eg: it would be too much work to go through every single transaction manually if you are Amazon
ML problems
Types of learning for ML:
Supervised learning
You have data and labels. The model tries to learn the relationship between the data and the labels.
Unsupervised learning
You have data but no labels. The model tries to find patterns in the data without any labels to reference.
Transfer learning
Take an existing ML model, adjust (fine-tune) it on your own data, and use it for your own problem.
Reinforcement learning
An agent performs an action and is rewarded or penalised depending on whether the action is favourable or not.
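To make the "data and labels" idea concrete, here is a minimal sketch of supervised learning using a 1-nearest-neighbour classifier in plain Python. The toy data (hours studied, hours slept) and the function name `predict_1nn` are my own made-up illustration, not something from the roadmap.

```python
import math

# Hypothetical toy data: (hours_studied, hours_slept) with a pass/fail label.
train_data = [(1.0, 4.0), (2.0, 5.0), (8.0, 7.0), (9.0, 8.0)]
train_labels = ["fail", "fail", "pass", "pass"]


def predict_1nn(point, data, labels):
    """Return the label of the training example closest to `point`."""
    # Euclidean distance from the new point to every labelled example.
    distances = [math.dist(point, example) for example in data]
    nearest_index = distances.index(min(distances))
    return labels[nearest_index]


# The "learning" here is simply remembering the labelled examples.
print(predict_1nn((1.5, 4.5), train_data, train_labels))
print(predict_1nn((8.5, 7.5), train_data, train_labels))
```

If you removed `train_labels`, you would be in unsupervised territory: the best the model could do is notice that the points fall into two groups.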
ML problem domains:
Classification
The model learns from a training dataset and then uses what it has learnt to best map an input to an output label.
Eg: classify a mail as spam or not spam
Regression
The model will identify the relationship between the dependent and independent variables in order to predict a continuous value.
Eg: predict the price of a stock over time
Sequence-to-sequence
Usually used for language problems such as translation.
Eg: Given a sequence in English, translate it to Spanish.
Clustering
Typically an unsupervised problem, where the model groups data points based on similarity.
Eg: Group soccer players based on their attributes (striker/defender/goalkeeper etc)
Dimensionality reduction
If you have too many inputs (say 100 variables), reduce them to the 10 most important variables or components.
Eg: by using PCA (principal component analysis)
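To make one of the problem domains above concrete, here is a small sketch of regression: fitting a straight line with ordinary least squares in plain Python. The prices are invented numbers and `fit_line` is a hypothetical helper I wrote for illustration; a real project would more likely use a library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: day number (independent) and closing price (dependent).
days = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(days, prices)
predicted_day_6 = slope * 6 + intercept
print(slope, intercept, predicted_day_6)
```

Real stock prices are nowhere near this tidy; the point is only that regression outputs a number rather than a category.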
ML process
First, you need to collect some data. It is important here to recognise the type of data you need. Data can be structured data in a table (eg: categorical, numerical, ordinal data etc) or unstructured data (eg: images, speech etc). Remember, rubbish in, rubbish out.
Second, you need to prepare the data. Typical data preparation steps:
Exploratory data analysis (EDA)
This process involves understanding your data, including exploring whether there are outliers and missing data.
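As a rough sketch of what checking for missing data and outliers could look like, the snippet below counts missing values and flags outliers with the common 1.5 × IQR rule. The `ages` column is made up, and the IQR rule is just one convention I picked for the example.

```python
import statistics

# Hypothetical column of ages; None marks a missing value.
ages = [23, 25, None, 31, 28, 27, None, 95, 26, 24]

# Missing data: separate the values we have from the gaps.
present = [age for age in ages if age is not None]
missing_count = len(ages) - len(present)

# Outliers: anything further than 1.5 * IQR outside the quartiles.
q1, _median, q3 = statistics.quantiles(present, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [age for age in present if age < low or age > high]

print("missing:", missing_count)
print("outliers:", outliers)
```

Whether an age of 95 is an error or a genuine value is a judgement call, which is why EDA is about understanding the data and not just cleaning it.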
The third step is to train the model.
Next, we evaluate the model based on available metrics. There are several considerations here:
Once we are confident, we can serve the model. We will not know how it performs until we put it out for real. The tools differ depending on whether the final goal is a mobile app or a web-based application.
Finally, we need to keep evaluating the model and retrain it if needed. The model may need to change if the data has changed (such as a new road) or the data source has been upgraded (such as new hardware being used).
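The train and evaluate steps above can be sketched in a few lines. This is a toy illustration under my own assumptions: the data is synthetic, the "model" is a single learnt threshold, and the helper names `train_threshold` and `evaluate` are hypothetical.

```python
# Hypothetical labelled data: feature x with label 1 when x >= 50.
examples = [(x, 1 if x >= 50 else 0) for x in range(100)]

# Hold out every 5th example for testing. A real project would shuffle
# first; a fixed split just keeps this example reproducible.
test = [row for row in examples if row[0] % 5 == 0]
train = [row for row in examples if row[0] % 5 != 0]


def train_threshold(rows):
    """Pick the threshold with the best accuracy on the training rows."""
    best_threshold, best_correct = None, -1
    for threshold, _label in rows:
        correct = sum((x >= threshold) == bool(y) for x, y in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def evaluate(rows, threshold):
    """Compute accuracy, precision and recall for predictions x >= threshold."""
    tp = sum(1 for x, y in rows if x >= threshold and y == 1)
    fp = sum(1 for x, y in rows if x >= threshold and y == 0)
    fn = sum(1 for x, y in rows if x < threshold and y == 1)
    tn = sum(1 for x, y in rows if x < threshold and y == 0)
    return {
        "accuracy": (tp + tn) / len(rows),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


threshold = train_threshold(train)
train_metrics = evaluate(train, threshold)
test_metrics = evaluate(test, threshold)
print(threshold, train_metrics, test_metrics)
```

The model never saw x = 50 during training, so it learns a threshold of 51 and misses that one test example. That small gap between training and test scores is exactly why evaluation has to happen on data the model has not seen.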
What have I learnt from this roadmap?
Good news. I know the basics. I have used Python before and am familiar with the libraries. I guess I am closer to intermediate in ML. I have watched, read or done tutorials covering some of these ML algorithms.
I guess I will start here:
To-do list (probably in this order):