How to solve Multi-Class Classification Problems in Deep Learning with Tensorflow & Keras?

Murat Karakaya Akademi

3 years ago

2,270 views

Access all tutorials at https://www.muratkarakaya.net
Code: https://colab.research.google.com/drive/1KsNfjXyR6A_8pN4PCc0fVhLiZoY1XJPY?usp=sharing
Classification Tutorials: https://youtube.com/playlist?list=PLQflnv_s49v-RGv6jb_sFOb_of9VOEpKG
Keras Tutorials: https://youtube.com/playlist?list=PLQflnv_s49v9EcZVg2EShHbZ_49cvnWu0

How to solve Multi-Class Classification Problems in Deep Learning with Tensorflow & Keras?
Which Activation & Loss functions should be used in Multi-Class Classification Problems?
This is the third part of the "How to solve Classification Problems in Keras?" series.
If you have not gone over Part A and Part B, please review them before continuing with this tutorial.

The link to all parts is provided in the video description.

In this tutorial, we will focus on how to solve Multi-Class Classification Problems in Deep Learning with Tensorflow & Keras.

First, we will download the MNIST dataset.

In multi-class classification problems, we have two options to encode the true labels, using either:

integer numbers, or
one-hot vectors

We will experiment with both encodings (sketched below) to observe how combinations of various last-layer activation functions and loss functions affect a Keras CNN model's performance.
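
Here is a minimal sketch of the two encodings for a 10-class problem like MNIST, using tf.keras.utils.to_categorical (the sample labels are made up for illustration):

import numpy as np
from tensorflow.keras.utils import to_categorical

# Integer encoding: one class id (0-9) per sample.
int_labels = np.array([3, 0, 7])

# One-hot encoding: a 10-dimensional vector with a single 1 per sample.
one_hot_labels = to_categorical(int_labels, num_classes=10)
print(one_hot_labels[0])  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]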

In both experiments, we will discuss the relationship between Activation & Loss functions, label encodings, and accuracy metrics in detail.

We will understand why we can sometimes get surprising results when using parameter settings other than the generally recommended ones.

As a result, we will gain insight into activation and loss functions and their interactions.

In the end, we will summarize the experiment results in a cheat table.

If you would like to learn more about Deep Learning with practical coding examples, please subscribe to my YouTube Channel or follow my blog on Medium. Do not forget to turn on notifications so that you will be notified when new parts are uploaded.

You can access this Colab Notebook using the link given in the video description below.

If you are ready, let's get started!

Load a Multi-Class Dataset
I pick the MNIST dataset, a famous multi-class dataset.
First, let's load the MNIST dataset from TensorFlow Datasets.
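
A minimal loading sketch with tensorflow_datasets (the variable names are my own; the notebook may organize this differently):

import tensorflow_datasets as tfds

# Load MNIST as (image, label) pairs; labels arrive as single integers (0-9).
(train_ds, test_ds), info = tfds.load(
    "mnist",
    split=["train", "test"],
    as_supervised=True,  # yield (image, label) tuples instead of dicts
    with_info=True,
)
print(info.features["label"].num_classes)  # 10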
Transfer Learning by importing VGG16
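Below is a sketch of how VGG16 can serve as a frozen feature extractor. Note that VGG16 expects 3-channel inputs of at least 32x32 pixels, so MNIST's 28x28 grayscale images must be resized and stacked to 3 channels first; the exact preprocessing and classification head here are my assumptions, not necessarily what the notebook does:

import tensorflow as tf
from tensorflow.keras.applications import VGG16

# Pretrained convolutional base, without the ImageNet classification head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(32, 32, 3))
base.trainable = False  # freeze the pretrained weights

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),  # last layer; its activation varies per experiment
])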
True (Actual) Labels are encoded with a single integer number
When no activation function is used in the model's last layer, we need to set from_logits=True in the cross-entropy loss function, as we discussed above. The cross-entropy loss function will then apply a softmax transformation to the predicted logits itself.
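A compile sketch for this setting (assuming the model defined above, whose last Dense layer has no activation):

import tensorflow as tf

model.compile(
    optimizer="adam",
    # The model outputs raw logits, so the loss must be told via from_logits=True.
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    # Labels are single integers, so we use the sparse accuracy metric.
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)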
IMPORTANT: We need to use keras.metrics.SparseCategoricalAccuracy() to measure the accuracy, since it calculates how often predictions match integer labels.

As we mentioned above, Keras does not define a single accuracy metric, but several different ones, among them: accuracy, binary_accuracy and categorical_accuracy.

What happens under the hood is that if you mistakenly select categorical cross-entropy as your loss function in binary classification, and you do not specify a particular accuracy metric but just write

metrics="Accuracy"

then Keras (wrongly...) infers that you are interested in categorical_accuracy, and this is what it returns, while in fact you are interested in binary_accuracy, since the problem is binary classification.
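
To illustrate, here is a hypothetical binary classifier (one sigmoid unit, made up for this example); only the metric specification matters here:

import tensorflow as tf

# A hypothetical binary classifier, for illustration only.
model = tf.keras.Sequential(
    [tf.keras.layers.Dense(1, activation="sigmoid", input_shape=(4,))]
)

# Risky: the generic string lets Keras infer the accuracy variant from the
# loss function, which picks the wrong one if the loss does not match the problem.
model.compile(loss="categorical_crossentropy", metrics=["accuracy"])

# Safer: name the metric explicitly so it matches the problem at hand.
model.compile(loss="binary_crossentropy", metrics=[tf.keras.metrics.BinaryAccuracy()])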

In summary:

to get model.fit() and model.evaluate() to run correctly (without mismatching the loss function and the classification problem at hand), we need to specify the actual accuracy metric!
if the true (actual) labels are encoded with integer numbers, you need to use keras.metrics.SparseCategoricalAccuracy() to measure the accuracy, since it calculates how often predictions match integer labels.
Why do BinaryCrossentropy & CategoricalCrossentropy loss functions generate errors?
Why do softmax & sigmoid activation functions with SparseCategoricalCrossentropy loss lead to the same accuracy?

Why does the SparseCategoricalCrossentropy loss function with from_logits=True lead to good accuracy without any activation function?
Notice that if we do not apply any activation function at the last layer, we need to inform the cross-entropy loss function by setting the parameter from_logits=True, so that it will apply a softmax transformation to the given logits by itself!
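A quick sanity check of this behavior (the sample logits and label are made up): feeding raw logits with from_logits=True yields the same loss as applying softmax first and using the default from_logits=False:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])  # raw last-layer outputs (no activation)
labels = tf.constant([0])                # integer-encoded true label

loss_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_probs = tf.keras.losses.SparseCategoricalCrossentropy()  # expects probabilities

print(loss_logits(labels, logits).numpy())                # ~0.417
print(loss_probs(labels, tf.nn.softmax(logits)).numpy())  # same value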
In summary:
We can conclude that, if the task is multi-class classification and the true (actual) labels are encoded as single integer numbers, we have two options:

Option 1:

activation = sigmoid or softmax

loss = SparseCategoricalCrossentropy()

accuracy metric = SparseCategoricalAccuracy()

Option 2:

activation = None

loss = SparseCategoricalCrossentropy(from_logits=True)

accuracy metric = SparseCategoricalAccuracy()
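
A minimal side-by-side sketch of the two options (the flat 784-feature input is a placeholder assumption; the actual model in the tutorial is a CNN built on VGG16):

import tensorflow as tf

# Option 1: softmax (or sigmoid) on the last layer; the loss consumes probabilities.
model_1 = tf.keras.Sequential(
    [tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,))]
)
model_1.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)

# Option 2: no activation on the last layer; the loss consumes raw logits.
model_2 = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(784,))])
model_2.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)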

Tags:

#activation_function #loss_function #keras #metrics #encoding #tensorflow #binary_classification #keras_activation_function #keras_loss_function #multi_class #multi-class #multi-class_classification #multi_label_keras #multi_class_image_classification_keras #multi_class_image_classification_tensorflow #multiclass_image_classification #multiclass_classification #softmax_keras #sigmoid_keras #sparse_categorical_cross_entropy #keras_example #sparse_categorical_cross_entropy_explained


Comments:

@tak68tak - 30.03.2021 06:17

This is a really amazing video. The official Google tutorial pages should have links to his videos.
Your tutorials are really awesome; nothing personal, but if you could improve the accent you would get more views.
Once again, your videos are the best on the planet.

@rachadlakis1 - 25.05.2023 16:47

Thank you for all this information <3

@vemulasuman6995 - 20.04.2024 13:07

Excellent explanation sir, thank you very much. I am from India, Bangalore city.
