Solution of the exercises[Chapter 1: The Machine Learning Landscape]: Hands-On-Machine-Learning-with-Scikit-Learn-Keras-and-Tensorflow

7 min read · Nov 17, 2021

Chapter 1: The Machine Learning Landscape

I am a new learner of the book Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow. After exploring chapter 1, I have tried to solve the exercises.

The exercises are solved in a detailed way so anyone can get help from any point of the exercise. I will try to complete the book to get deep knowledge on ML and Deep Learning.

1. How would you define Machine Learning?

Machine Learning is the science (and art) of programming computers so they can learn from data, which gives them the ability to learn without being explicitly programmed.
In the traditional approach, you would write a detection rule for each spam pattern that you noticed, and your program would flag an email as spam if a number of these patterns were detected.
In the ML approach, the program automatically learns which words and phrases are good predictors of spam by detecting unusually frequent patterns of words in the spam examples.
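
To make the ML approach more concrete, here is a minimal sketch of a program that learns which words are unusually frequent in spam. The function name `learn_spam_words`, the `min_ratio` threshold, and the tiny example emails are all made up for illustration; a real spam filter would be far more elaborate.

```python
from collections import Counter


def learn_spam_words(spam_emails, ham_emails, min_ratio=2.0):
    """Return the words that appear much more often in spam than in ham."""
    spam_counts = Counter(w for email in spam_emails for w in email.lower().split())
    ham_counts = Counter(w for email in ham_emails for w in email.lower().split())
    # Add 1 to the ham count so words never seen in ham do not divide by zero.
    return {w for w, c in spam_counts.items() if c / (ham_counts[w] + 1) >= min_ratio}


# Made-up training examples (hypothetical, for illustration only).
spam = ["free money now", "free offer for you", "claim your free prize"]
ham = ["meeting moved to noon", "see you at lunch", "your report is ready"]

spam_words = learn_spam_words(spam, ham)
print(spam_words)
```

Nobody wrote a rule for any particular word here: the list of suspicious words comes from the examples, so it would change automatically if the spam examples changed.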

2. Can you name four types of problems where it shines?

Problems for which a traditional approach would require a long list of complex rules. Ex: email spam detection
Problems in fluctuating environments, where the patterns keep changing and hand-written rules are hard to maintain. Ex: changing spam patterns
Problems that are either too complex for traditional approaches or have no known algorithm. Ex: speech recognition
Getting insights about complex problems and large amounts of data. Digging into large amounts of data can help discover patterns that were not immediately apparent. Ex: data mining

3. What is a labeled training set?

A labeled training set is a training set that contains the desired solution for each instance. In supervised learning, the training data you feed to the algorithm includes these desired solutions, called labels.

4. What is a feature/predictor?

A feature (or predictor) is an input attribute that the model uses to make its prediction. For example, to predict a target numeric value such as the price of a car, you give the model a set of features (mileage, age, brand, etc.), called predictors.
This sort of task is called regression. To train the system, you need to give it many examples of cars, including both their predictors and their labels (i.e., their prices).
In Machine Learning an attribute is a data type (e.g., “Mileage”), while a feature has several meanings depending on the context, but generally means an attribute plus its value (e.g., “Mileage = 15,000”).

5. What are the two most common supervised tasks?

Regression (predicting a numeric value) and classification (predicting a class).

6. Can you name four common unsupervised tasks?

Clustering: Detect groups of similar visitors among the visitors of your website. For example, the algorithm might notice that 40% of your visitors are males who love comic books and generally read your blog in the evening, while 20% are young sci-fi lovers who visit during the weekends, and so on.
Visualization: You feed the algorithm a lot of complex and unlabeled data, and it outputs a 2D or 3D representation of your data that can easily be plotted. For example, in such a plot of image data, animals may end up well separated from vehicles, with horses close to deer but far from birds, and so on.
Dimensionality reduction: The goal is to simplify the data set without losing too much information. One way to do this is to merge several correlated features into one. For example, a car’s “mileage” may be very correlated with its “age”, so the dimensionality reduction algorithm will merge them into one feature that represents the car’s wear and tear. This is called feature extraction.
Anomaly detection: Catching manufacturing defects, or automatically removing outliers from a dataset before feeding it to another learning algorithm.
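
As a small illustration of the last task, here is a hedged sketch of removing outliers before feeding data to another algorithm. It assumes a very simple rule of my own choosing (distance from the median, measured in units of the median absolute deviation); the function name, the threshold of 5, and the measurements are made up for illustration.

```python
import statistics


def find_anomalies(values, threshold=5.0):
    """Flag values that lie far from the median, in units of the
    median absolute deviation (MAD). Assumes the MAD is not zero."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    return [v for v in values if abs(v - median) > threshold * mad]


# Made-up sensor readings with one obvious outlier.
measurements = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 25.0]

anomalies = find_anomalies(measurements)
cleaned = [v for v in measurements if v not in anomalies]
print(anomalies, cleaned)
```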

7. What type of Machine Learning algorithm would you use to allow a robot to walk in various unknown terrains?

Reinforcement Learning: the learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards). It must then learn by itself what is the best strategy, called a policy, to get the most reward over time.
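
A walking robot is far too complex for a short example, so here is a toy sketch of the same idea under strong assumptions: the "terrain" is a line of five positions, the agent gets a reward only when it reaches the last one, and it learns with tabular Q-learning (one possible Reinforcement Learning algorithm, chosen by me for illustration). The environment, the constants, and the function names are all hypothetical.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(n_episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn the value of each (state, action) pair from rewards alone."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(n_episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore at random while learning
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
# The policy picks, in each non-goal state, the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The agent is never told that moving right is correct. It only observes rewards, and the learned policy ends up choosing `+1` in every position.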

8. What type of algorithm would you use to segment your customers into multiple groups?

Unsupervised learning, using a clustering algorithm.
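
Here is a minimal sketch of one clustering algorithm, k-means, written in plain Python. The customer data (age, visits per month) and the choice of starting centroids are made up for illustration; in practice you would normally use a library implementation and scale the features first.

```python
import math


def k_means(points, centroids, n_iterations=10):
    """Very small k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(n_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    # Once the centroids stop moving, these clusters match the final centroids.
    return centroids, clusters


# Hypothetical customers: (age, visits per month). No labels are given.
customers = [(22, 30), (25, 28), (24, 35), (55, 4), (60, 6), (58, 3)]

centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])
print(centroids)
print(clusters)
```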

9. Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem?

A supervised learning problem, because the algorithm is trained on emails that are labeled as spam or not spam.

10. What is an online learning system?

An online learning system is trained incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches. Online learning is great for systems that receive data as a continuous flow (e.g., stock prices) and need to adapt to change rapidly or autonomously. It is also a good option if you have limited computing resources: once an online learning system has learned about new data instances, it does not need them anymore, so you can discard them (unless you want to be able to roll back to a previous state and “replay” the data).
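
The incremental idea can be sketched as follows. This is a minimal illustration, assuming a linear model trained with mini-batch gradient descent; the `data_stream` generator is a hypothetical stand-in for a real data flow, and it produces noise-free points from y = 1 + 2x so the result is easy to check.

```python
def sgd_step(theta0, theta1, batch, learning_rate=0.1):
    """Update the two parameters of a linear model using one mini-batch."""
    n = len(batch)
    # Gradients of the mean squared error with respect to each parameter.
    grad0 = sum((theta0 + theta1 * x) - y for x, y in batch) * 2 / n
    grad1 = sum(((theta0 + theta1 * x) - y) * x for x, y in batch) * 2 / n
    return theta0 - learning_rate * grad0, theta1 - learning_rate * grad1


def data_stream(n_batches, batch_size=4):
    """Hypothetical stream: yields mini-batches from y = 1 + 2x, one at a time."""
    for i in range(n_batches):
        xs = [((i * batch_size + j) % 10) / 10 for j in range(batch_size)]
        yield [(x, 1.0 + 2.0 * x) for x in xs]


theta0, theta1 = 0.0, 0.0
for batch in data_stream(n_batches=5000):
    theta0, theta1 = sgd_step(theta0, theta1, batch)
    # The batch is not stored anywhere: once used, it can be discarded.

print(round(theta0, 3), round(theta1, 3))
```

Only the two parameters are kept between steps, which is why memory use stays constant no matter how much data flows through.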

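11. What is out-of-core learning?

Out-of-core learning is used to train a system on huge datasets that cannot fit in one machine’s main memory. The algorithm loads part of the data, runs a training step on that data, and repeats the process until it has run on all of the data. It relies on online learning techniques, although it is usually done offline.

12. What does it mean to generalize?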

One more way to categorize Machine Learning systems is by how they generalize. Most Machine Learning tasks are about making predictions. This means that given a number of training examples, the system needs to be able to generalize to examples it has never seen before.
Having a good performance measure on the training data is good, but insufficient; the true goal is to perform well on new instances.

13. What type of learning algorithm relies on a similarity measure to make predictions?

An instance-based learning algorithm. It simply learns the training examples by heart, then makes predictions for new cases by using a similarity measure to compare them with the learned examples.
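
One well-known instance-based algorithm is k-nearest neighbors. Here is a minimal sketch, assuming Euclidean distance as the similarity measure and a majority vote among the k closest training examples; the two-feature training set is made up for illustration.

```python
import math


def knn_predict(training_set, new_point, k=3):
    """Predict a label by majority vote among the k most similar training instances."""
    # Similarity measure: a smaller Euclidean distance means more similar.
    by_distance = sorted(training_set, key=lambda item: math.dist(item[0], new_point))
    nearest_labels = [label for _, label in by_distance[:k]]
    return max(set(nearest_labels), key=nearest_labels.count)


# Made-up training examples: (features, label).
training_set = [
    ((1.0, 1.0), "ham"), ((1.2, 0.8), "ham"), ((0.9, 1.1), "ham"),
    ((5.0, 5.0), "spam"), ((5.2, 4.9), "spam"), ((4.8, 5.1), "spam"),
]

prediction = knn_predict(training_set, (4.5, 4.7))
print(prediction)
```

There is no training step at all: the "model" is the stored training set itself.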

14. What is the difference between a model parameter and a learning algorithm’s hyperparameter?

In model-based learning, to predict something from the training examples (e.g., does money make people happier?), you need to build a model. A linear model is a reasonable choice when the data show a roughly linear trend (here, between GDP per capita and life satisfaction).
A linear model can be written as a linear equation:

life_satisfaction = θ0 + θ1 × GDP_per_capita

This model has two model parameters, θ0 and θ1.

Before using the model, you need a performance measure that specifies how good or bad your model is. By tweaking the parameters, you can make your model fit your data as well as possible.
This is where the Linear Regression algorithm comes in: you feed it your training examples and it finds the parameters that make the linear model fit best to your data. This is called training the model. So, training a model means running an ML algorithm on your training set to find the parameter values that best fit the training data, in the hope that the model will then also make good predictions on new data.
For linear regression problems, people typically use a cost function that measures the distance between the linear model’s predictions and the training examples; the objective is to minimize this distance. For example, suppose the linear model predicts a life satisfaction of 5.7 for the USA from its GDP per capita, while the value in the training example is 7.2. The gap between the two shows that the model does not fit this example well, and the cost function combines such gaps over the whole training set into a single performance measure.
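
Here is a minimal sketch of what training such a linear model means, assuming the mean squared error as the cost function and the standard closed-form least-squares solution for one feature. The numbers are made up for illustration and are not real GDP or survey data.

```python
def fit_linear(xs, ys):
    """Return (theta0, theta1) minimizing the mean squared error
    of the model y = theta0 + theta1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


def mse(theta0, theta1, xs, ys):
    """Cost function: average squared gap between predictions and labels."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up illustrative numbers (GDP per capita in thousands of USD).
gdp_per_capita = [10.0, 20.0, 30.0, 40.0, 50.0]
life_satisfaction = [5.0, 5.6, 6.1, 6.4, 7.0]

theta0, theta1 = fit_linear(gdp_per_capita, life_satisfaction)
cost = mse(theta0, theta1, gdp_per_capita, life_satisfaction)
print(theta0, theta1, cost)
```

Any other choice of θ0 and θ1 gives a higher cost on this training set, which is exactly what "best fit" means here.
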
Constraining a model to make it simpler and reduce the risk of overfitting is called regularization.
This technique lets you keep all the variables (features) in the model while reducing the magnitude of their weights. Regularization helps the model generalize better to new data. The amount of regularization to apply during learning can be controlled by a hyperparameter.
A hyperparameter is a parameter of a learning algorithm (e.g., a regression algorithm), not of the model. As such, it is not affected by the learning algorithm itself; it must be set prior to training and remains constant during training. If you set the regularization hyperparameter to a very large value, you will get an almost flat model (a slope close to zero); the learning algorithm will almost certainly not overfit the training data, but it will be less likely to find a good solution. Tuning hyperparameters is an important part of building a Machine Learning system.
So, the basic difference is that model parameters are learned from the training data by the learning algorithm so that the model fits the data, while hyperparameters belong to the learning algorithm itself: they are set before training (for example, to control the amount of regularization and limit the risk of overfitting) and stay constant during training.
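
The effect of a regularization hyperparameter can be sketched with a one-feature linear model. This is a minimal illustration, assuming an L2 (ridge-style) penalty of alpha × θ1² added to the sum of squared errors, with the intercept left unpenalized; the name `alpha` and the data are my own choices for illustration.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Fit y = theta0 + theta1 * x by minimizing
    sum of squared errors + alpha * theta1**2.
    alpha is the hyperparameter: it is fixed before training."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / (sxx + alpha)      # a larger alpha shrinks the slope
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


# Made-up illustrative numbers, not real data.
xs = [10.0, 20.0, 30.0, 40.0, 50.0]
ys = [5.0, 5.6, 6.1, 6.4, 7.0]

# theta0 and theta1 are model parameters (learned); alpha is set by hand.
slopes = {alpha: fit_ridge_1d(xs, ys, alpha)[1] for alpha in (0.0, 1000.0, 1e9)}
print(slopes)
```

With `alpha=0.0` this is plain least squares, and with a huge `alpha` the slope is pushed close to zero, giving the almost flat model described above.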

15. What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?

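Model-based learning algorithms search for the values of the model parameters that make the model generalize best to new instances. The most common strategy is to minimize a cost function that measures how badly the model fits the training data, plus a penalty for model complexity if the model is regularized. To make predictions, they feed the new instance’s features into the model’s prediction function, using the parameter values found during training.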

16. Can you name four of the main challenges in Machine Learning?

Insufficient quantity of data

Poor quality of data

Nonrepresentative training data

Irrelevant features

17. If your model performs great on the training data but generalizes poorly to new instances, what is happening? Can you name three possible solutions?

This is called overfitting. Possible solutions are:

Regularize (constrain) the model

Gather more training data

Simplify the model by using fewer parameters or fewer attributes

18. What is a test set and why would you want to use it?

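A test set is a held-out part of the data that is not used for training. You use it to estimate the generalization error, that is, how well the model will perform on new cases, before you launch it in production.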

19. What is the purpose of a validation set?

Normally the generalization error is estimated on the test set. But if you compare different models (e.g., a linear model and a polynomial model) and hyperparameter values by measuring them repeatedly on the same test set, you end up picking the model that is best adapted to that particular set, so it is unlikely to perform as well on new data in production. To overcome this problem, a second hold-out set, called the validation set, is set aside from the training data: you use it to compare models and tune hyperparameters, and keep the test set for the final evaluation only.
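
Here is a minimal sketch of carving a dataset into the three sets. The function name, the 20% ratios, and the fixed seed are assumptions made for illustration.

```python
import random


def train_val_test_split(instances, val_ratio=0.2, test_ratio=0.2, seed=42):
    """Shuffle the data once, then cut it into three disjoint sets."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    n_val = int(len(shuffled) * val_ratio)
    test_set = shuffled[:n_test]
    val_set = shuffled[n_test:n_test + n_val]
    train_set = shuffled[n_test + n_val:]
    return train_set, val_set, test_set


# Stand-in dataset: 100 instance ids.
train_set, val_set, test_set = train_val_test_split(range(100))
print(len(train_set), len(val_set), len(test_set))
```

The intended workflow is to train candidate models on `train_set`, pick the best one using `val_set`, and touch `test_set` only once at the end.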

20. What can go wrong if you tune hyperparameters using the test set?

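If you tune hyperparameters using the test set, you risk overfitting the test set: the selected model and hyperparameters are the ones that happen to work best on that particular set, so the generalization error you measure is too optimistic and the model may perform worse than expected on new data.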

21. What is cross-validation and why would you prefer it to a validation set?

To avoid “wasting” too much training data in validation sets, a common technique is to use cross-validation: the training set is split into complementary subsets, and each model is trained against a different combination of these subsets and validated against the remaining parts.
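
The splitting part of cross-validation can be sketched as follows. This is a minimal illustration of k-fold splitting, assuming contiguous folds without shuffling; the function name is my own.

```python
def k_fold_splits(n_instances, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_instances))
    fold_size, remainder = divmod(n_instances, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra instance.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop


splits = list(k_fold_splits(10, 3))
for train, validation in splits:
    print(train, validation)
```

Every instance is used for validation exactly once and for training in all the other folds, so no data is permanently set aside.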


Written by Anjan Debnath