Commit b967346e authored by Mirko Birbaumer

O-rings completed with more explanations

parent 0ea2175a
%% Cell type:markdown id: tags:
# Exercise 1 : Conversion from Celsius to Fahrenheit (Simple Regression Analysis)
%% Cell type:markdown id: tags:
The problem we will solve is to convert from Celsius to Fahrenheit, where the conversion formula is:
$$ f = c \times 1.8 + 32 $$
Of course, it would be simple enough to create a conventional Python function that directly performs this calculation (see the sketch below), but that wouldn't be machine learning.
Instead, we will give TensorFlow some sample Celsius values (0, 8, 15, 22, 38) and their corresponding Fahrenheit values (32, 46, 59, 72, 100).
Then, we will train a model that figures out the above formula through the training process. This is a _simple regression analysis_ problem.
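%% Cell type:markdown id: tags:
For comparison, here is that conventional Python function (a sketch of the non-ML approach; `celsius_to_fahrenheit` is our own illustrative name):
``` python
# Direct implementation of the conversion formula, for comparison only
def celsius_to_fahrenheit(c):
    return c * 1.8 + 32

print(celsius_to_fahrenheit(100.0))  # 212.0
```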
%% Cell type:markdown id: tags:
## Import dependencies
First, import TensorFlow. Here, we're calling it `tf` for ease of use. We also tell it to only display errors.
Next, import [NumPy](http://www.numpy.org/) as `np`. NumPy helps us represent our data as highly performant arrays.
%% Cell type:code id: tags:
``` python
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import numpy as np
```
%% Cell type:code id: tags:
``` python
import logging
logger = tf.get_logger()
logger.setLevel(logging.ERROR)
```
%% Cell type:markdown id: tags:
## Set up training data
As we saw before, supervised machine learning is all about figuring out an algorithm given a set of inputs and outputs. Since the task in this Codelab is to create a model that can give the temperature in Fahrenheit when given the degrees in Celsius, we create two lists `celsius_q` and `fahrenheit_a` that we can use to train our model.
%% Cell type:code id: tags:
``` python
celsius_q    = np.array([-40, -10,  0,  8, 15, 22,  38], dtype=float)
fahrenheit_a = np.array([-40,  14, 32, 46, 59, 72, 100], dtype=float)
for i, c in enumerate(celsius_q):
    print("{} degrees Celsius = {} degrees Fahrenheit".format(c, fahrenheit_a[i]))
```
%% Output
-40.0 degrees Celsius = -40.0 degrees Fahrenheit
-10.0 degrees Celsius = 14.0 degrees Fahrenheit
0.0 degrees Celsius = 32.0 degrees Fahrenheit
8.0 degrees Celsius = 46.0 degrees Fahrenheit
15.0 degrees Celsius = 59.0 degrees Fahrenheit
22.0 degrees Celsius = 72.0 degrees Fahrenheit
38.0 degrees Celsius = 100.0 degrees Fahrenheit
%% Cell type:markdown id: tags:
### Some Machine Learning terminology
- **Feature** — The input(s) to our model. In this case, a single value — the degrees in Celsius.
- **Label/response variable** — The output our model predicts. In this case, a single value — the degrees in Fahrenheit. In a classification setting we predict labels (discrete classes); in a regression setting we predict a continuous response variable, such as degrees Fahrenheit.
- **Example** — A pair of inputs/outputs used during training. In our case, a pair of values from `celsius_q` and `fahrenheit_a` at a specific index, such as `(22, 72)`.
%% Cell type:markdown id: tags:
## 1. Define the Network
Next, create the model. We will use the simplest possible model, a Dense network. Since the problem is straightforward, this network requires only a single layer, with a single neuron.
### Build a layer
We'll call the layer `l0` and create it by instantiating `tf.keras.layers.Dense` with the following configuration:
* `input_shape=[1]` — This specifies that the input to this layer is a single value. That is, the shape is a one-dimensional array with one member. Since this is the first (and only) layer, that input shape is the input shape of the entire model. The single value is a floating point number, representing degrees Celsius.
* `units=1` — This specifies the number of neurons in the layer. The number of neurons defines how many internal variables the layer has to try to learn how to solve the problem (more on this later). Since this is the final layer, it is also the size of the model's output — a single float value representing degrees Fahrenheit. (In a multi-layered network, the size and shape of the layer would need to match the `input_shape` of the next layer.)
%% Cell type:code id: tags:
``` python
l0 = tf.keras.layers.Dense(units=1, input_shape=[1])
```
%% Cell type:markdown id: tags:
### Assemble layers into the model
Once layers are defined, they need to be assembled into a model. The Sequential model definition takes a list of layers as its argument, specifying the calculation order from the input to the output.
This model has just a single layer, `l0`.
%% Cell type:code id: tags:
``` python
model = <--------- Your Code here -------------->
```
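%% Cell type:markdown id: tags:
One possible completion (a sketch): pass the single layer `l0` to `tf.keras.Sequential` in a list, just as the multi-layer experiment later in this notebook does.
``` python
model = tf.keras.Sequential([l0])
```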
%% Cell type:markdown id: tags:
## 2. Compile the network, with loss and optimizer functions
Before training, the model has to be compiled. When compiled for training, the model is given:
- **Loss function** — A way of measuring how far off predictions are from the desired outcome. (The measured difference is called the "loss".)
- **Optimizer function** — A way of adjusting internal values in order to reduce the loss.
%% Cell type:code id: tags:
``` python
model.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(0.1))
```
%% Cell type:markdown id: tags:
These are used during training (`model.fit()`, below) to first calculate the loss at each point, and then improve it. In fact, the act of calculating the current loss of a model and then improving it is precisely what training is.
During training, the optimizer function is used to calculate adjustments to the model's internal variables. The goal is to adjust the internal variables until the model (which is really a math function) mirrors the actual equation for converting Celsius to Fahrenheit.
TensorFlow uses numerical analysis to perform this tuning, and all this complexity is hidden from you, so we will not go into the details here. What is useful to know about these parameters is:
The loss function ([mean squared error](https://en.wikipedia.org/wiki/Mean_squared_error)) and the optimizer ([Adam](https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/)) used here are standard for simple models like this one, but many others are available. It is not important to know how these specific functions work at this point.
One part of the optimizer you may need to think about when building your own models is the learning rate (`0.1` in the code above). This is the step size taken when adjusting values in the model. If the value is too small, it will take too many iterations to train the model; if it is too large, accuracy goes down. Finding a good value often involves some trial and error, but it usually lies between 0.001 (the default) and 0.1, as the sketch below illustrates.
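%% Cell type:markdown id: tags:
For example, to try a smaller step size, you would simply recompile with a different learning rate (a sketch; `0.001` is the Adam default mentioned above):
``` python
model.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(0.001))
```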
%% Cell type:markdown id: tags:
## 3. Fit the model
Train the model by calling the `fit` method.
During training, the model takes in Celsius values, performs a calculation using the current internal variables (called "weights"), and outputs values which are meant to be the Fahrenheit equivalent. Since the weights are initially set randomly, the output will not be close to the correct value. The difference between the actual output and the desired output is calculated using the loss function, and the optimizer function directs how the weights should be adjusted.
This cycle of calculate, compare, adjust is controlled by the `fit` method. The first argument is the inputs, the second argument is the desired outputs. The `epochs` argument specifies how many times this cycle should be run, and the `verbose` argument controls how much output the method produces.
%% Cell type:code id: tags:
``` python
history = model.fit(<--- your code here --->, <--- your code here --->, epochs=500, verbose=False)
print("Finished training the model")
```
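%% Cell type:markdown id: tags:
One possible completion (a sketch): the inputs are `celsius_q`, the desired outputs are `fahrenheit_a`.
``` python
history = model.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)
```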
%% Cell type:markdown id: tags:
## 4. Evaluate the Model - Display training statistics
The `fit` method returns a history object. We can use this object to plot how the loss of our model goes down after each training epoch. A high loss means that the Fahrenheit degrees the model predicts are far from the corresponding values in `fahrenheit_a`.
We'll use [Matplotlib](https://matplotlib.org/) to visualize this (you could use another tool). As you can see, our model improves very quickly at first, and then shows a steady, slow improvement until it is very near "perfect" towards the end.
%% Cell type:code id: tags:
``` python
import matplotlib.pyplot as plt
plt.xlabel('Epoch Number')
plt.ylabel("Loss Magnitude")
plt.plot(history.history['loss'])
```
%% Cell type:markdown id: tags:
## 5. Use the model to predict values
Now you have a model that has been trained to learn the relationship between `celsius_q` and `fahrenheit_a`. You can use the `predict` method to have it calculate the Fahrenheit degrees for a previously unseen Celsius value.
So, for example, if the Celsius value is 100, what do you think the Fahrenheit result will be? Take a guess before you run this code.
%% Cell type:code id: tags:
``` python
print(model.predict(<---- your code here ---->))
```
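%% Cell type:markdown id: tags:
One possible completion (a sketch), matching the prediction made in the experiment later in this notebook:
``` python
print(model.predict([100.0]))
```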
%% Cell type:markdown id: tags:
The correct answer is $100 \times 1.8 + 32 = 212$, so our model is doing really well.
### To review
* We created a model with a Dense layer
* We trained it with 3500 examples (7 pairs, over 500 epochs).
Our model tuned the variables (weights) in the Dense layer until it was able to return the correct Fahrenheit value for any Celsius value. (Remember, 100 Celsius was not part of our training data.)
%% Cell type:markdown id: tags:
## Looking at the layer weights
Finally, let's print the internal variables of the Dense layer.
%% Cell type:code id: tags:
``` python
print("These are the layer variables: {}".format(l0.get_weights()))
```
%% Cell type:markdown id: tags:
The first variable is close to 1.8 and the second to 32. These values (1.8 and 32) are the actual coefficients in the real conversion formula.
We'll explain this in an upcoming video where we show how a Dense layer works, but for a single neuron with a single input and a single output, the internal math looks the same as [the equation for a line](https://en.wikipedia.org/wiki/Linear_equation#Slope%E2%80%93intercept_form), $y = mx + b$, which has the same form as the conversion equation, $f = 1.8c + 32$.
Since the form is the same, the variables should converge on the standard values of 1.8 and 32, which is exactly what happened.
With additional neurons, additional inputs, and additional outputs, the formula becomes much more complex, but the idea is the same.
### A little experiment
Just for fun, what if we created more Dense layers with more units, and therefore more internal variables?
%% Cell type:code id: tags:
``` python
l0 = tf.keras.layers.Dense(units=4, input_shape=[1])
l1 = tf.keras.layers.Dense(units=4)
l2 = tf.keras.layers.Dense(units=1)
model = tf.keras.Sequential([l0, l1, l2])
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))
model.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)
print("Finished training the model")
print(model.predict([100.0]))
print("Model predicts that 100 degrees Celsius is: {} degrees Fahrenheit".format(model.predict([100.0])))
print("These are the l0 variables: {}".format(l0.get_weights()))
print("These are the l1 variables: {}".format(l1.get_weights()))
print("These are the l2 variables: {}".format(l2.get_weights()))
```
%% Cell type:markdown id: tags:
As you can see, this model is also able to predict the corresponding Fahrenheit value really well. But when you look at the variables (weights) in the `l0` and `l1` layers, they are nothing even close to 1.8 and 32. The added complexity hides the "simple" form of the conversion equation.
%% Cell type:markdown id: tags:
# Exercise 2 : O-Rings seen with Logistic Regression
%% Cell type:markdown id: tags:
This notebook fits a logistic regression using Keras. It's basically meant to show the principles of Keras.
### Dataset
We investigate the data set of the Challenger flights: broken O-rings (`Y=1`) versus launch temperature.
%% Cell type:code id: tags:
``` python
%matplotlib inline
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd

data = np.asarray(pd.read_csv('./challenger.txt', sep=','), dtype='float32')
plt.plot(data[:,0], data[:,1], 'o')
plt.axis([40, 85, -0.1, 1.2])
plt.xlabel('Temperature [F]')
plt.ylabel('Broken O-rings')
```
%% Output
Text(0, 0.5, 'Broken O-rings')
%% Cell type:code id: tags:
``` python
y_values = data[:,1]
print(y_values)
```
%% Output
[0. 1. 0. 0. 0. 0. 0. 0. 1. 1. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 1.]
%% Cell type:markdown id: tags:
## Mathematical Notes
We are considering the probability $P(y_i=1|x_i)$ for the class $y_i=1$ given the $i$-th data point $x_i$ ($x_i$ could be a vector). This is given by:
$$
P(y_i=1 | x_i) = \frac{e^{b + x_i w}}{1 + e^{b + x_i w}} = \left[1 + e^{-(b + x_i w)}\right]^{-1}
$$
If we have more than one data point, which we usually do, we have to apply the equation above to each of the $N$ data points. In this case we can use a vectorized version with $x=(x_1,x_2,\ldots,x_N)$ and $y=(y_1,y_2,\ldots,y_N)$.
%% Cell type:markdown id: tags:
### Numpy code
This NumPy code shows the calculation for one set of parameter values (like a single forward pass).
%% Cell type:code id: tags:
``` python
# Data
N = len(data)
x = data[:,0]
y = data[:,1]
# Initial values for the weights
w = -0.20
b = 20.0
# predicted probabilities
p_1 = 1 / (1 + np.exp(-x*w - b))
# cross-entropy loss function
cross_entropy = y * np.log(p_1) + (1-y) * np.log(1-p_1)
print(-np.mean(cross_entropy))
print(np.round(p_1,3))
```
%% Output
3.882916
[0.999 0.998 0.998 0.998 0.999 0.996 0.996 0.998 1.    0.999 0.998 0.988
 0.999 1.    0.999 0.993 0.998 0.978 0.992 0.985 0.993 0.992 1.   ]
%% Cell type:markdown id: tags:
## Better values from intuition
Now let's try to find better values for $w$ and $b$. Let's assume $w$ is given as $-1$. We want the probability
of damage $P(y_i=1 | x_i)$ to be $0.5$.
Determine an appropriate value for $b$.
Hint: at which $x$ value should $P(y_i=1 | x_i)$ be $0.5$? Look at the data. At this $x$ value the term $1 + e^{-(b + w x_i)}$ must equal $2$, i.e. the exponent must be $0$.
**Solution**
$P(y=1 | x) = 0.5$ at $x \approx 65$
$-(b + (-1) \cdot 65) = 0 \rightarrow b = 65$
%% Cell type:code id: tags:
``` python
w_val = -1
b_val = 65
plt.plot(data[:,0], data[:,1], 'o')
plt.axis([40, 85, -0.1, 1.2])
x_pred = np.linspace(40,85)
x_pred = np.resize(x_pred,[len(x_pred),1])
y_pred = 1 / (1 + np.exp(-x_pred*w_val - b_val))
plt.plot(x_pred, y_pred)
# predicted probabilities
p_1 = 1 / (1 + np.exp(-x*w_val - b_val))
# cross-entropy loss function
cross_entropy = -np.mean(y * np.log(p_1) + (1-y) * np.log(1-p_1))
print(cross_entropy)
print(np.round(p_1,3))
```
%% Output
0.9094435
[0.269 0.007 0.018 0.047 0.119 0.001 0.    0.007 1.    0.881 0.007 0.
 0.119 1.    0.119 0.    0.007 0.    0.    0.    0.    0.    0.999]
%% Cell type:markdown id: tags:
We can see that the value of the cross-entropy has decreased from 3.882916 to 0.9094435.
%% Cell type:markdown id: tags:
## TODO : determine the accuracy of this logistic regression model and the value of the cross-entropy function
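%% Cell type:markdown id: tags:
One possible approach (a sketch): classify a flight as having a broken O-ring whenever the predicted probability exceeds 0.5, then compare with the true labels. This reuses `p_1`, `y`, and `cross_entropy` from the cell above.
%% Cell type:code id: tags:
``` python
# Threshold the predicted probabilities at 0.5 to get hard class predictions
y_hat = (p_1 > 0.5).astype(np.float32)
accuracy = np.mean(y_hat == y)
print("Accuracy: {:.3f}".format(accuracy))
print("Cross-entropy: {:.4f}".format(cross_entropy))
```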
%% Cell type:markdown id: tags:
## TODO : set up a Keras model
If there are two labels, we use `binary_crossentropy` as the loss function. In this case, we use a `sigmoid` activation in the output layer.
%% Cell type:code id: tags:
``` python
l0 = tf.keras.layers.Dense(<--- your code here ---->)
model = tf.keras.Sequential([l0])
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.Adam(0.01))
model.fit(x, y, epochs=10000, verbose=False)
```
%% Output
Using TensorFlow backend.
<tensorflow.python.keras.callbacks.History at 0x14081c048>
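%% Cell type:markdown id: tags:
One possible completion of the Dense layer above (a sketch): a single unit with a `sigmoid` activation turns the layer into exactly the logistic function $\left[1 + e^{-(b + x w)}\right]^{-1}$ from the mathematical notes.
``` python
l0 = tf.keras.layers.Dense(units=1, input_shape=[1], activation='sigmoid')
```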
%% Cell type:code id: tags:
``` python
plt.plot(data[:,0], data[:,1], 'o')
plt.axis([40, 85, -0.1, 1.2])
x_pred = np.linspace(40,85)
x_pred = np.resize(x_pred,[len(x_pred),1])
y_pred = <---- your code here ---->
plt.plot(x_pred, y_pred)
```
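%% Cell type:markdown id: tags:
One possible completion (a sketch): let the trained model predict the probability of a broken O-ring over the temperature grid.
``` python
y_pred = model.predict(x_pred)
```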
%% Cell type:markdown id: tags:
# Exercise 3 : MNIST and Multinomial Logistic Regression
%% Cell type:markdown id: tags:
In this exercise we use multinomial logistic regression to predict the handwritten digits of the MNIST dataset.
%% Cell type:markdown id: tags:
## TODO : read MNIST data and compute validation accuracy for a multinomial logistic regression model, see [Multinomial Logistic Regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression)
If there are several labels, we use `categorical_crossentropy` as the loss function and the output layer should be a `softmax` layer.
%% Cell type:code id: tags:
``` python
from __future__ import absolute_import, division, print_function, unicode_literals
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
# Import TensorFlow
import tensorflow as tf
# Helper libraries
import math
import numpy as np
import matplotlib.pyplot as plt

# Load MNIST data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# One-hot-encoded label vectors
y_train_cat = to_categorical(y_train, 10)
y_test_cat = to_categorical(y_test, 10)

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(<---- your code here ---->))
model.add(tf.keras.layers.Dense(<---- your code here ---->))
model.compile(<---- your code here ---->)
history = model.fit(<---- your code here ---->)
```
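%% Cell type:markdown id: tags:
One possible completion (a sketch): flatten the 28x28 images, use a single 10-unit `softmax` layer, and track validation accuracy via `validation_data`. Scaling the pixel values to $[0, 1]$ first is an assumed but common preprocessing choice.
``` python
# Assumed preprocessing: scale pixel values from [0, 255] to [0, 1]
X_train_s = X_train / 255.0
X_test_s = X_test / 255.0

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train_s, y_train_cat, validation_data=(X_test_s, y_test_cat), epochs=10)
```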
%% Cell type:markdown id: tags:
## TODO : use different regularization terms, see [Keras Regularizer](https://keras.io/regularizers/)
%% Cell type:code id: tags:
``` python
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
# Import TensorFlow
import tensorflow as tf
# Helper libraries
import math
import numpy as np
import matplotlib.pyplot as plt

# Load MNIST data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# One-hot-encode label vectors
y_train_cat = to_categorical(y_train, 10)
y_test_cat = to_categorical(y_test, 10)

# Define Network
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(<---- your code here ---->))
model.add(tf.keras.layers.Dense(<---- your code here ---->,
                                kernel_regularizer=<---- your code here ---->))
# Compile Network
model.compile(<---- your code here ---->)
# Fit Network
history = model.fit(<---- your code here ---->)
```
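%% Cell type:markdown id: tags:
One possible completion (a sketch): penalize large weights with an L2 regularizer via `kernel_regularizer`; the strength `0.01` is an arbitrary starting point to experiment with.
``` python
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model.add(tf.keras.layers.Dense(10, activation='softmax',
                                kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train / 255.0, y_train_cat, validation_data=(X_test / 255.0, y_test_cat), epochs=10)
```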