"2022-09-09 11:53:44.744528: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA\n",
"To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
"2022-09-09 11:53:44.949703: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n",
"2022-09-09 11:53:44.949739: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.\n",
"2022-09-09 11:53:44.987045: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
"2022-09-09 11:53:46.016855: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory\n",
"2022-09-09 11:53:46.016974: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory\n",
"2022-09-09 11:53:46.016987: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n"
]
},
{
"ename": "AssertionError",
"evalue": "Wrong Tensorflow version: expected 2.3.1, available 2.10.0",
"\u001b[0;31mAssertionError\u001b[0m: Wrong Tensorflow version: expected 2.3.1, available 2.10.0"
]
}
],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
...
@@ -487,7 +513,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
...
@@ -501,7 +527,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
"version": "3.9.12"
}
},
"nbformat": 4,
...
%% Cell type:markdown id: tags:
# Convolutional Neural Networks (CNN)
Convolutional neural networks (CNNs) are a type of artificial neural network especially designed for processing images, i.e. matrices of color intensity values. There are various kinds of layers in a CNN, but a typical architecture alternates *convolutional* layers, which find patterns in individual areas of the input matrix, with *pooling* layers, which aggregate these patterns. The final layers *flatten* the matrix data in order to perform classification with a standard *dense* or *fully connected* layer.
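To make the roles of these layers concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The toy image, the kernel values and the helper names `conv2d_valid` and `max_pool` are made up for illustration; a real CNN learns its kernels during training. As in Keras' `Conv2D`, the kernel is slid over the image without being flipped.

``` python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping `size` x `size` window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 5x5 "image": dark on the left (0), bright on the right (1)
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright vertical edges
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)           # convolution: 5x5 -> 4x4
pooled = max_pool(feature_map)                      # pooling: 4x4 -> 2x2
flat = [value for row in pooled for value in row]   # flatten: 2x2 -> 4 values

print(feature_map)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)       # [[2, 0], [2, 0]]
print(flat)         # [2, 0, 2, 0]
```

The convolution responds only where the edge is, pooling keeps that response while shrinking the map, and the flattened vector is what a dense layer would receive.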
This exercise applies CNNs to a very small dataset in order to avoid heavy calculations that would require a GPU on your laptop. The dataset is called [messy vs clean](https://www.kaggle.com/cdawn1/messy-vs-clean-room?) and contains images of either messy or clean rooms. Our job is to predict whether a given image shows a messy or a clean room. If only my bedroom would look as clean as the following ...


%% Cell type:markdown id: tags:
### Import Libraries
If an error message says that a particular library is not available, you can install it using `pip install [library]`. This notebook was written for TensorFlow 2.3.1, and the import cell below asserts exactly that version, so the cell fails with an `AssertionError` for any other installed version, older or newer.
%% Cell type:code id: tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
# Auxiliary libraries for system paths, time measurement, ...
import os
import time
from glob import glob
from collections import Counter
# Libraries for interactive elements in Jupyter notebooks
from ipywidgets import interact
# Libraries for deep learning
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.optimizers import SGD, Adam
assert tf.__version__ == '2.3.1', 'Wrong Tensorflow version: expected {}, available {}'.format('2.3.1', tf.__version__)
```
%% Output
2022-09-09 11:53:44.744528: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-09 11:53:44.949703: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-09-09 11:53:44.949739: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-09-09 11:53:44.987045: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-09 11:53:46.016855: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2022-09-09 11:53:46.016974: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2022-09-09 11:53:46.016987: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
We set the values for the training parameters, i.e. for how many epochs the model is trained (1 epoch = 1 iteration over the dataset) and the batch size (number of samples used for one gradient computation), together with the initial learning rate, the image size and the dataset paths.
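As a quick worked example of how epochs and batch size interact, the following sketch counts the gradient updates of a training run. The dataset size `num_train_images` is a hypothetical number chosen for illustration, not the size of the actual dataset.

``` python
import math

epochs = 30
batch_size = 16
num_train_images = 100  # hypothetical dataset size, for illustration only

# One epoch visits every image once, in batches of `batch_size`
steps_per_epoch = math.ceil(num_train_images / batch_size)
# Each batch triggers one gradient computation and one weight update
total_updates = epochs * steps_per_epoch
# The last batch of an epoch is smaller if the sizes do not divide evenly
last_batch_size = num_train_images - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, total_updates, last_batch_size)  # 7 210 4
```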
%% Cell type:code id: tags:
``` python
epochs = 30
batch_size = 16
initial_lr = 0.001
img_height = 128
img_width = 128
train_path = os.getcwd() + "/data/train"
test_path = os.getcwd() + "/data/test"
num_classes = 2
```
%% Cell type:markdown id: tags:
## Load the Dataset
For loading the dataset we can directly use TensorFlow's `tf.data.Dataset` API, which provides many utility methods. To load an image dataset from a directory, we use the `image_dataset_from_directory()` function. Its `image_size` argument automatically resizes the loaded images. Read more about the function and its parameters [here](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory). We load the training dataset (`train_ds`) first, by setting `subset` to `training`. Afterwards, we load the validation dataset (`val_ds`) by setting `subset` to `validation`. Both calls need the `validation_split` parameter, which states the fraction of the data that is held out for validation; the rest is used for training.
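One possible way to write the two calls described above is sketched here. The helper name `load_split`, the split fraction `0.2` and the seed `42` are assumptions for illustration, not values taken from this notebook. Passing the same `seed` to both calls matters: it keeps the training and validation subsets from overlapping when the files are shuffled.

``` python
import os


def load_split(directory, subset, validation_split=0.2, seed=42,
               image_size=(128, 128), batch_size=16):
    """Load one split ("training" or "validation") from a folder-per-class directory."""
    # Imported lazily so that the helper can be defined without TensorFlow installed
    import tensorflow as tf

    return tf.keras.preprocessing.image_dataset_from_directory(
        directory,
        validation_split=validation_split,  # assumed fraction held out for validation
        subset=subset,
        seed=seed,                          # identical seed for both subsets
        image_size=image_size,              # images are resized while loading
        batch_size=batch_size,
    )


train_path = os.path.join(os.getcwd(), "data", "train")
# Both calls read the same directory and differ only in `subset`:
# train_ds = load_split(train_path, "training")
# val_ds = load_split(train_path, "validation")
```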
Let us next examine the loaded data. We start by checking whether the expected classes are in the dataset by printing its `class_names` attribute.
%% Cell type:code id: tags:
``` python
class_names = train_ds.class_names
assert num_classes == len(class_names)
class_names
```
%% Cell type:markdown id: tags:
Let us examine the distribution of labels in both the training and the validation set. Remember that *accuracy* is only a meaningful quality metric when the data is balanced.
``` python
# ...
_ = plt.title("Distribution of classes in the VALIDATION dataset")
```
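To see why accuracy needs balanced data, consider a "classifier" that always predicts the most frequent class. The sketch below uses made-up label lists (the class names and counts are illustrative, not taken from the dataset) and a hypothetical helper `majority_baseline_accuracy`.

``` python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


balanced = ["clean"] * 50 + ["messy"] * 50
imbalanced = ["clean"] * 90 + ["messy"] * 10

print(majority_baseline_accuracy(balanced))    # 0.5
print(majority_baseline_accuracy(imbalanced))  # 0.9
```

On the imbalanced list, a model that has learned nothing already reaches 90% accuracy, so the number alone says little about quality. On balanced data the same baseline stays at 50%.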
%% Cell type:markdown id: tags:
The datasets are nicely balanced. We next plot a few images with their corresponding labels. To do this, we call `.take(1)` on the dataset, which yields $1$ batch of images and labels; the individual images can then be read from that batch.
%% Cell type:code id: tags:
``` python
# Specify the figure size of the individual plot
plt.figure(figsize=(10, 10))
# Loop over the dataset by taking one image and label at a time
# ...
```
Now we are ready to create our model and print a summary of it. We start by creating a sequential model, i.e. a plain stack of layers in which each layer has exactly one input and one output. We first add a rescaling operation, which ensures that all intensity values passed to the model are rescaled to [0, 1]. This is also called a preprocessing layer. After that, we add our first convolution block, which consists of a `Conv2D` layer (convolutional operation) followed by a `MaxPool2D` layer (max-pooling operation). The model consists of three such convolution blocks in total. We then flatten the output to 1D and add two fully connected layers for classification. The last layer produces a probability distribution over the two classes using a softmax. After we have finished creating our CNN, we need to compile it with an optimizer, a loss function and metrics.
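The architecture described above could look roughly like the following sketch. The filter counts (16, 32, 64), the 3x3 kernels, the hidden layer size of 64, the loss and the helper names are assumptions for illustration; they are not necessarily the choices made in this notebook. The small helper `spatial_size_after` shows how each block of unpadded convolution and 2x2 pooling shrinks the image.

``` python
def spatial_size_after(size, num_blocks, kernel=3, pool=2):
    """Side length after `num_blocks` of (unpadded convolution, max-pooling)."""
    for _ in range(num_blocks):
        size = size - kernel + 1  # convolution without padding
        size = size // pool       # non-overlapping max-pooling
    return size


def build_model(img_height=128, img_width=128, num_classes=2, initial_lr=0.001):
    """Sketch of a small CNN: rescaling, three conv blocks, two dense layers."""
    # Imported lazily so that the helpers can be defined without TensorFlow installed
    import tensorflow as tf
    from tensorflow.keras import layers, models

    # Newer TensorFlow releases expose Rescaling directly under `layers`;
    # TensorFlow 2.3 has it under `layers.experimental.preprocessing`
    rescaling = getattr(layers, "Rescaling", None)
    if rescaling is None:
        rescaling = layers.experimental.preprocessing.Rescaling

    model = models.Sequential([
        rescaling(1.0 / 255, input_shape=(img_height, img_width, 3)),
        layers.Conv2D(16, 3, activation="relu"),  # assumed filter counts
        layers.MaxPool2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPool2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPool2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=initial_lr),
        # Integer class labels (the loader's default) pair with the sparse loss
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


print(spatial_size_after(128, 3))  # 14
```

Under these assumptions a 128x128 input shrinks to 126, 63, 61, 30, 28 and finally 14 pixels per side, so the flatten step would hand 14 * 14 * 64 values to the first dense layer.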
%% Cell type:code id: tags:
``` python
# Create a sequential model, which we will use to add the different layers
model = models.Sequential()
# Rescale the image to range [0, 1] instead of [0, 255]
# ...
```
The model can be trained using `fit`, where we specify the training dataset (`train_ds`) as well as the validation dataset (`val_ds`) along with the number of epochs for training.
%% Cell type:code id: tags:
``` python
start = time.time()
# Fit the model on the training dataset and validate it on the validation set
history = # ...
end = time.time()
print("Training took {0:.2f} seconds".format(end - start))
```
%% Cell type:code id: tags:
``` python
start = time.time()
# Fit the model on the training dataset and validate it on the validation set
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs,
    shuffle=True  # ignored by Keras when the input is a tf.data.Dataset
)
end = time.time()
print("Training took {0:.2f} seconds".format(end - start))
```
%% Cell type:markdown id: tags:
We plot accuracy and loss of the model on both datasets (training and validation).
An important aspect of machine learning is to interpret these so-called learning curves. They show clearly that the model overfits the training data: the training accuracy (in the plot on the left) rises steeply, while the validation accuracy oscillates around $0.7$. In other words, performance on already seen data improves, but there is no improvement on fresh data. This typically happens when a machine learning model memorizes data instead of learning patterns and structures, i.e. it learns the training data by heart. This problem is known as **overfitting** and is one of the most common issues to tackle in machine learning. There are many ways to address overfitting; one could:
- add dropout to the model
- constrain the weights of the model
- use data augmentation
- add regularization (e.g. L1, L2)
- reduce model complexity
- ...
However, in this case, we should simply **acquire more data**. The model mostly overfits because there is not enough data to train a reasonable classifier.
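The gap between training and validation accuracy can also be read off numerically. The sketch below assumes a dictionary shaped like Keras' `history.history` with the keys `accuracy` and `val_accuracy`; the numbers in `example_history` are invented for illustration and are not results from this notebook.

``` python
def generalization_gap(history_dict):
    """Per-epoch difference between training and validation accuracy."""
    return [
        round(train - val, 4)
        for train, val in zip(history_dict["accuracy"], history_dict["val_accuracy"])
    ]


# Invented values that mimic the pattern described above
example_history = {
    "accuracy": [0.60, 0.80, 0.95, 0.99],
    "val_accuracy": [0.62, 0.70, 0.68, 0.71],
}

gaps = generalization_gap(example_history)
print(gaps)
```

A gap that keeps growing from epoch to epoch while the validation accuracy stays flat is the numeric counterpart of the diverging curves in the plot.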
%% Cell type:markdown id: tags:
# Test the Model on Unseen Data
Despite the expected low performance on unseen data, we can now use the trained model and feed it images from the test set.
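Since the last layer outputs one probability per class, turning a prediction into a label means picking the most probable entry. A minimal sketch follows; the helper `predicted_label`, the class order `["clean", "messy"]` and the probability values are assumptions for illustration.

``` python
def predicted_label(probabilities, class_names):
    """Return the most probable class name together with its probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


class_names = ["clean", "messy"]  # assumed class order
# Invented softmax output for a single image
label, confidence = predicted_label([0.31, 0.69], class_names)
print(label, confidence)  # messy 0.69
```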
Convolutional neural networks are very data hungry and require a lot of computational resources for training. Hence, we cannot train any deeper models on larger datasets without giving you access to GPU resources in the cloud. Still, we hope to have convinced you that there is no magic behind image classification with deep learning. Cheers!