Commit 5329f8eb authored by Mirko Birbaumer

Added chapter on geometric interpretation of matrix multiplication

parent 1dc63594
%% Cell type:markdown id: tags:
## Functions, Conditionals, and Iteration in Python
Let us create a Python function and call it.
%% Cell type:code id: tags:
``` python
def HelloWorldXY(x, y):
    if (x < 10):
        print("Hello World, x was < 10")
    elif (x < 20):
        print("Hello World, x was >= 10 but < 20")
    else:
        print("Hello World, x was >= 20")
    return x + y
print(HelloWorldXY(1,2))
```
%% Output
Hello World, x was < 10
3
%% Cell type:markdown id: tags:
Now let us call the function `HelloWorldXY()` from a loop:
%% Cell type:code id: tags:
``` python
for i in range(8, 25, 5): # i=8, 13, 18, 23 (start, stop, step)
    print("\n--- Now running with i: {}".format(i))
    r = HelloWorldXY(i,i)
    print("Result from HelloWorld: {}".format(r))
```
%% Output
--- Now running with i: 8
Hello World, x was < 10
Result from HelloWorld: 16
--- Now running with i: 13
Hello World, x was >= 10 but < 20
Result from HelloWorld: 26
--- Now running with i: 18
Hello World, x was >= 10 but < 20
Result from HelloWorld: 36
--- Now running with i: 23
Hello World, x was >= 20
Result from HelloWorld: 46
%% Cell type:markdown id: tags:
If you want a loop that runs from 0 up to, but not including, 2, you could do any of the following:
%% Cell type:code id: tags:
``` python
print("Iterate over the items. `range(2)` is like a list [0,1].")
for i in range(2):
    print(i)
print("Iterate over an actual list.")
for i in [0,1]:
    print(i)
```
%% Output
Iterate over the items. `range(2)` is like a list [0,1].
0
1
Iterate over an actual list.
0
1
%% Cell type:code id: tags:
``` python
print("While works")
i = 0
while i < 2:
    print(i)
    i += 1
print("Python supports standard keywords like continue and break")
while True:
    print("Entered while")
    break
print("while broken")
```
%% Output
While works
0
1
Python supports standard keywords like continue and break
Entered while
while broken
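%% Cell type:markdown id: tags:
The cell above only exercises `break`. As a minimal sketch of `continue` (the list name `odds` is our own and does not appear in the original notebook), the loop below skips even numbers and collects the odd ones:

```python
# Collect the odd numbers below 6, skipping the even ones with `continue`.
odds = []
for i in range(6):
    if i % 2 == 0:
        continue  # jump straight to the next iteration
    odds.append(i)
print(odds)  # [1, 3, 5]
```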
%% Cell type:markdown id: tags:
## NumPy
### Introducing NumPy
Python is convenient, but it can also be slow. However, it does
allow you to access libraries that execute faster code written in
languages like C. NumPy is one such library: it provides fast alternatives
to math operations in Python and is designed to work efficiently with
groups of numbers - like matrices.

NumPy is a large library and we are only going to scratch the surface
of it here. If you plan on doing much math with Python, you should
definitely spend some time exploring its documentation to learn more.
### Importing NumPy
When importing the NumPy library, the convention you will see
used most often - including here - is to name it `np`, like so:
%% Cell type:code id: tags:
``` python
import numpy as np
```
%% Cell type:markdown id: tags:
Now you can use the library by prefixing the names of functions and
types with `np`, which you will see in the following examples.
### Data Types and Shapes
The most common way to work with numbers in NumPy is through `ndarray`
objects. They are similar to Python lists, but can have any number of
dimensions. Also, `ndarray` supports fast math operations, which
is just what we want.

Since it can store any number of dimensions, you can use `ndarrays`
to represent any of these data types: scalars, vectors,
matrices, or tensors.
%% Cell type:markdown id: tags:
### Scalars
Scalars in NumPy are a bit more involved than in Python. Instead of
Python's basic types like `int`, `float`, etc., NumPy lets
you specify signed and unsigned types, as well as different sizes.
So instead of Python's `int`, you have access to types
like `uint8`, `int8`, `uint16`, `int16`, and so on.

These types are important because every object you make
(vectors, matrices, tensors) eventually stores scalars. And when you
create a NumPy array, you can specify the type - _but every item in the
array must have the same type_. In this regard, NumPy arrays are more
like C arrays than Python lists.
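
To make the point about element types concrete, here is a small sketch using the `dtype` argument of `np.array`. The variable names `small` and `mixed` are illustrative and not part of the original notebook:

```python
import numpy as np

# Request an explicit 8-bit unsigned type instead of the default integer type.
small = np.array([1, 2, 3], dtype=np.uint8)
print(small.dtype)     # uint8
print(small.itemsize)  # 1 (byte per element)

# Mixed inputs are converted to one common type; here every item becomes a float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)     # float64
```
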
If you want to create a NumPy array that holds a scalar, you do so
by passing the value to NumPy's `array` function, as follows:
%% Cell type:code id: tags:
``` python
# Scalar
s = np.array(5)
s
```
%% Output
array(5)
%% Cell type:markdown id: tags:
You can display the number of axes of a NumPy `array` via the `ndim` attribute;
a scalar array has $0$ axes (`ndim` == 0). The number of axes of an array is also
called its _rank_.
%% Cell type:code id: tags:
``` python
s.ndim
```
%% Output
0
%% Cell type:markdown id: tags:
You can see the shape of your arrays by checking their `shape` attribute. So if
you executed this code:
%% Cell type:code id: tags:
``` python
s.shape
```
%% Output
()
%% Cell type:markdown id: tags:
it would print out the result, an empty pair of parentheses, `()`. This
indicates that it has zero dimensions.

Even though scalars are inside arrays, you still use them like a normal scalar.
So you could type:
%% Cell type:code id: tags:
``` python
x = s + 3
type(x)
```
%% Output
numpy.int64
%% Cell type:markdown id: tags:
and $x$ would now equal $8$. If you were to check the type of
$x$, you would find it is probably `numpy.int64`, because
it is working with NumPy types, not Python types.

By the way, even scalar types support most of the array functions.
So you can call `x.shape` and it would return `()` because
it has zero dimensions, even though it is not an array. If you tried
that with a normal Python scalar, you would get an error.
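
The difference between a NumPy scalar and a plain Python number can be checked directly. The following is only a sketch, and the exact wording of the error message depends on the Python version:

```python
import numpy as np

x = np.array(5) + 3        # a NumPy scalar, not a Python int
print(x.shape, x.ndim)     # () 0

try:
    (8).shape              # plain Python ints have no `shape` attribute
except AttributeError as err:
    print("Python int:", err)
```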
%% Cell type:markdown id: tags:
### Vectors
To create a vector, you would pass a Python list to the `array` function, like this:
%% Cell type:code id: tags:
``` python
v = np.array([1,2,3])
v
```
%% Output
array([1, 2, 3])
%% Cell type:markdown id: tags:
This vector has three entries and so is called a 3-dimensional vector. If you check a vector's `shape` attribute, it will return a single number representing the
vector's one-dimensional length. In the above example,
%% Cell type:code id: tags:
``` python
v.shape
```
%% Output
(3,)
%% Cell type:markdown id: tags:
Don’t confuse a 3D vector with a 3D array. A 3D vector has only one axis and has three dimensions along its axis, whereas a 3D array has three axes (and may have any number of dimensions along each axis).
%% Cell type:code id: tags:
``` python
v.ndim
```
%% Output
1
%% Cell type:markdown id: tags:
Dimensionality can denote either the number of entries along a specific
axis (as in the case of our 3D vector) or the number of axes in an array (such as a
3D array), which can be confusing at times.
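
A short side-by-side comparison may help keep the two meanings apart. The names `vec3` and `arr3`, and the shape `(2, 3, 4)`, are arbitrary choices for this sketch:

```python
import numpy as np

vec3 = np.array([1, 2, 3])    # a 3D vector: one axis holding three entries
arr3 = np.zeros((2, 3, 4))    # a 3D array: three axes
print(vec3.ndim, vec3.shape)  # 1 (3,)
print(arr3.ndim, arr3.shape)  # 3 (2, 3, 4)
```
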
You can access an element within the vector using indices, like this:
%% Cell type:code id: tags:
``` python
v[1]
```
%% Output
2
%% Cell type:markdown id: tags:
NumPy also supports advanced indexing techniques. For example, to access the items from the
second element onward, you would say:
%% Cell type:code id: tags:
``` python
v[1:]
```
%% Output
array([2, 3])
%% Cell type:markdown id: tags:
NumPy slicing is quite powerful,
allowing you to access any combination of items in an `ndarray`. But it can also be a bit complicated,
so you should read up on it in the documentation.
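
As a taste of what slicing can do, here are a few more forms applied to the same three-element vector. This is a small sample rather than a complete tour:

```python
import numpy as np

v = np.array([1, 2, 3])
print(v[:2])     # first two items: [1 2]
print(v[-1])     # last item: 3
print(v[::2])    # every second item: [1 3]
print(v[::-1])   # reversed: [3 2 1]
print(v[v > 1])  # boolean mask, items greater than 1: [2 3]
```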
%% Cell type:markdown id: tags:
### Matrices
You create matrices using NumPy's `array` function, just as you did for vectors. However, instead
of just passing in a list, you need to supply a list of lists, where each list represents
a row. So to create a $3\times 3$ matrix containing the numbers one through nine, you could
do this:
%% Cell type:code id: tags:
``` python
m = np.array([[1,2,3], [4,5,6], [7,8,9]])
m
```
%% Output
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
%% Cell type:markdown id: tags:
The entries from the first axis are called the _rows_, and the entries from
the second axis are called the _columns_. A matrix thus has two axes or _rank_ 2:
%% Cell type:code id: tags:
``` python
m.ndim
```
%% Output
2
%% Cell type:markdown id: tags:
Checking its `shape` attribute would return the tuple `(3, 3)` to indicate it has two dimensions, each of length $3$:
%% Cell type:code id: tags:
``` python
m.shape
```
%% Output
(3, 3)
%% Cell type:markdown id: tags:
You can access elements of matrices just like vectors, but using additional index values. So to find
the number $6$ in the above matrix, you would write:
%% Cell type:code id: tags:
``` python
m[1][2]
```
%% Output
6
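%% Cell type:markdown id: tags:
NumPy also accepts all indices inside a single pair of brackets, which combines naturally with slices. The sketch below rebuilds the same $3\times 3$ matrix so that it runs on its own:

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(m[1, 2])   # same element as m[1][2]: 6
print(m[1])      # second row: [4 5 6]
print(m[:, 2])   # third column: [3 6 9]
```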
%% Cell type:markdown id: tags:
### Tensors
Tensors are just like vectors and matrices, but they can have more dimensions. For example, to
create a $3\times 3\times 2\times 1$ tensor, you could do the following:
%% Cell type:code id: tags:
``` python
t = np.array([[[[1],[2]],[[3],[4]],[[5],[6]]],[[[7],[8]],\
    [[9],[10]],[[11],[12]]],[[[13],[14]],[[15],[16]],[[17],[17]]]])
t
```
%% Output
array([[[[ 1],
         [ 2]],
        [[ 3],
         [ 4]],
        [[ 5],
         [ 6]]],
       [[[ 7],
         [ 8]],
        [[ 9],
         [10]],
        [[11],
         [12]]],
       [[[13],
         [14]],
        [[15],
         [16]],
        [[17],
         [17]]]])
%% Cell type:markdown id: tags:
Here `t.shape` returns `(3, 3, 2, 1)`, and `t.ndim` indicates that we are dealing with a rank 4 tensor.
%% Cell type:code id: tags:
``` python
t.shape
```
%% Output
(3, 3, 2, 1)
%% Cell type:code id: tags:
``` python
t.ndim
```
%% Output
4
%% Cell type:markdown id: tags:
You can access items just like with matrices, but with more indices. For example:
%% Cell type:code id: tags:
``` python
t[2][1][1][0]
```
%% Output
16
%% Cell type:markdown id: tags:
### Changing Shapes
Sometimes you will need to change the shape of your data without actually changing
its contents. For example, you may have a vector, which is one-dimensional, but need
a matrix, which is two-dimensional. There are two ways you can do that.

Let's say you have the following vector:
%% Cell type:code id: tags:
``` python
v = np.array([1,2,3,4])
v.shape
```
%% Output
(4,)
%% Cell type:markdown id: tags:
Calling `v.shape` would return `(4,)`. But what if you want a $1\times 4$ matrix?
You can accomplish that with the `reshape` function, like so:
%% Cell type:code id: tags:
``` python
x = v.reshape(1,4)
x
```
%% Output
array([[1, 2, 3, 4]])
%% Cell type:markdown id: tags:
Calling `x.shape` would return `(1, 4)`. If you wanted a $4\times 1$ matrix, you
could do this:
%% Cell type:code id: tags:
``` python
x = v.reshape(4,1)
x
```
%% Output
array([[1],
       [2],
       [3],
       [4]])
%% Cell type:markdown id: tags:
The `reshape` function works for more than just adding a dimension of size $1$. Check out its
documentation for more examples.
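
For instance, the same four-element vector can be rearranged into a $2\times 2$ matrix, and passing `-1` asks NumPy to infer one axis length from the others. The names `square` and `column` are illustrative only:

```python
import numpy as np

v = np.array([1, 2, 3, 4])
square = v.reshape(2, 2)     # four items rearranged into two rows of two
column = v.reshape(-1, 1)    # -1 lets NumPy infer this axis: shape (4, 1)
print(square.shape)          # (2, 2)
print(column.shape)          # (4, 1)
```
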
One more thing about reshaping NumPy arrays: if you see code from experienced NumPy users, you
will often see them use a special slicing syntax instead of calling `reshape`. Using this
syntax, the previous two examples would look like this:
%% Cell type:code id: tags:
``` python
x = v[None, :]
x
```
%% Output
array([[1, 2, 3, 4]])
%% Cell type:code id: tags:
``` python
x.shape
```
%% Output
(1, 4)
%% Cell type:code id: tags:
``` python
x = v[:, None]
x
```
%% Output
array([[1],
       [2],
       [3],
       [4]])
%% Cell type:code id: tags:
``` python
x.shape
```
%% Output
(4, 1)
%% Cell type:markdown id: tags:
Those lines create a slice that looks at all of the items of `v` but asks NumPy to add a new dimension
of size $1$ for the associated axis. It may look strange to you now, but it's a common technique so
it's good to be aware of it.
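
You may also meet the same idiom written with `np.newaxis`, which is simply an alias for `None`. A brief sketch, with `row` and `col` as illustrative names:

```python
import numpy as np

v = np.array([1, 2, 3, 4])
print(np.newaxis is None)    # True
row = v[np.newaxis, :]       # same as v[None, :]
col = v[:, np.newaxis]       # same as v[:, None]
print(row.shape, col.shape)  # (1, 4) (4, 1)
```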
%% Cell type:markdown id: tags:
### Element-wise Operations
#### The Python Way
Suppose you had a list of numbers, and you wanted to add $5$ to every item in the list.
Without NumPy, you might do something like this:
%% Cell type:code id: tags:
``` python
import time
start = time.time()
values = [1,2,3,4,5]
for i in range(len(values)):
    values[i] += 5
end = time.time()
print(end - start)
print(values)
```
%% Output
0.0002567768096923828
[6, 7, 8, 9, 10]
%% Cell type:markdown id: tags:
That makes sense, but it's a lot of code to write, and on large lists it runs slowly because
every iteration is executed in pure Python.

__Note:__ Just in case you aren't used to using operators like `+=`, that just
means _add these two items and then store the result in the left item._ It is a more
succinct way of writing `values[i] = values[i] + 5`. The code you see in these examples
makes use of such operators whenever possible.
%% Cell type:markdown id: tags:
#### The NumPy Way
In NumPy, we could do the following:
%% Cell type:code id: tags:
``` python
start = time.time()
values = [1,2,3,4,5]
values = np.array(values) + 5
end = time.time()
print(end - start)
print(values)
```
%% Output
0.0005545616149902344
[ 6  7  8  9 10]
%% Cell type:markdown id: tags:
Creating that array may seem odd, but normally you'll be storing your data in `ndarrays`
anyway. So if you already had an `ndarray` named `values`, you could have just done:
%% Cell type:code id: tags:
``` python
start = time.time()
values = [1,2,3,4,5]
values = np.array(values)
values += 5
end = time.time()
print(end - start)
print(values)
```
%% Output
0.00030875205993652344
[ 6  7  8  9 10]
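%% Cell type:markdown id: tags:
With only five items, the timings above mostly measure fixed overhead, so they say little about which approach is faster. A rough sketch of a fairer comparison follows; it assumes a size of 100,000 items and uses the standard `timeit` module, and the helper names `python_way` and `numpy_way` are our own. No timings are quoted here because they vary from machine to machine.

```python
import timeit
import numpy as np

n = 100_000  # assumed size, large enough for a difference to show

def python_way():
    # Add 5 to every item with an explicit Python loop.
    values = list(range(n))
    for i in range(len(values)):
        values[i] += 5
    return values

def numpy_way():
    # Add 5 to every item with one vectorized operation.
    values = np.arange(n)
    values += 5
    return values

# Both approaches must produce the same numbers.
assert python_way() == numpy_way().tolist()

print("Python loop:", timeit.timeit(python_way, number=3))
print("NumPy:      ", timeit.timeit(numpy_way, number=3))
```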
%% Cell type:markdown id: tags:
We should point out that NumPy actually has functions for things like adding, multiplying, etc.
But it also supports using the standard math operators. So the following two lines are equivalent:
%% Cell type:code id: tags:
``` python
x = np.multiply(v, 5)
x
```
%% Output
array([ 5, 10, 15, 20])
%% Cell type:code id: tags:
``` python
x = v * 5
x
```
%% Output
array([ 5, 10, 15, 20])
%% Cell type:markdown id: tags:
We will usually use the operators instead of the functions because they are more convenient to
type and easier to read, but it's really just personal preference.

One more example of operating with scalars and `ndarrays`. Let's say you have a matrix `m` and you want
to reuse it, but first you need to set all its values to zero. Easy, just multiply by zero and assign
the result back to the matrix, like this:
%% Cell type:code id: tags:
``` python
m *= 0
m
```
%% Output
array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])
%% Cell type:markdown id: tags:
### Element-wise Matrix Operations
The same functions and operators that work with scalars and matrices also work with
other dimensions. You just need to make sure that the items you perform the operation
on have compatible shapes.

Let's say you want to get the squared values of a matrix. That's simply `x = m * m` (or, if you
want to assign the value back to `m`, just `m *= m`).

This works because it's an element-wise multiplication between two identically shaped matrices.
(In this case, they are shaped the same because they are actually the same object.)

Here's another example:
%% Cell type:code id: tags:
``` python
a = np.array([[1,3],[5,7]])
a
```
%% Output
array([[1, 3],
       [5, 7]])
%% Cell type:code id: tags:
``` python
b = np.array([[2,4],[6,8]])
b
```
%% Output
array([[2, 4],
       [6, 8]])
%% Cell type:code id: tags:
``` python
a + b
```
%% Output
array([[ 3,  7],
       [11, 15]])
%% Cell type:markdown id: tags:
And if you try working with incompatible shapes, you will get an error:
%% Cell type:code id: tags:
``` python
a = np.array([[1,3],[5,7]])
a
```
%% Output
array([[1, 3],
       [5, 7]])
%% Cell type:code id: tags:
``` python
c = np.array([[2,3,6],[4,5,9],[1,8,7]])
c
```
%% Output
array([[2, 3, 6],
       [4, 5, 9],
       [1, 8, 7]])
%% Cell type:code id: tags:
``` python
a.shape
```
%% Output
(2, 2)
%% Cell type:code id: tags:
``` python
c.shape
```
%% Output
(3, 3)
%% Cell type:code id: tags:
``` python
a + c
```
%% Output
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-224-e81e582b6fa9> in <module>
----> 1 a + c
ValueError: operands could not be broadcast together with shapes (2,2) (3,3)
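%% Cell type:markdown id: tags:
The error message mentions _broadcasting_, the rule NumPy uses to decide which shapes are compatible. Shapes do not have to be identical: an axis of length $1$, or a missing leading axis, is stretched to match the other operand. The sketch below illustrates this with two hypothetical arrays, `row` and `col`, which are not part of the original notebook:

```python
import numpy as np

a = np.array([[1, 3], [5, 7]])   # shape (2, 2)
row = np.array([10, 20])         # shape (2,): added to each row of a
col = np.array([[10], [20]])     # shape (2, 1): added to each column of a

print(a + row)   # rows [11 23] and [15 27]
print(a + col)   # rows [11 13] and [25 27]
```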
%% Cell type:code id: tags:
``` python
m = np.array([[1,2,3],[4,5,6]])
m
```
%% Output
array([[1, 2, 3],
       [4, 5, 6]])
%% Cell type:code id: tags:
``` python
n = m * 0.25
n
```
%% Output
array([[0.25, 0.5 , 0.75],
       [1.  , 1.25, 1.5 ]])
%% Cell type:code id: tags:
``` python
m * n
```
%% Output
array([[0.25, 1.  , 2.25],
       [4.  , 6.25, 9.  ]])
%% Cell type:code id: tags:
``` python
np.multiply(m, n)
```
%% Output
array([[0.25, 1.  , 2.25],
       [4.  , 6.25, 9.  ]])
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
To find the matrix product, you use NumPy's `matmul` function. To find the matrix product, you use NumPy's `matmul` function.
If you have compatible shapes, then it's as simple as this: If you have compatible shapes, then it's as simple as this:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
a = np.array([[1,2,3,4],[5,6,7,8]]) a = np.array([[1,2,3,4],[5,6,7,8]])
a a
``` ```
%% Output %% Output
array([[1, 2, 3, 4], array([[1, 2, 3, 4],
[5, 6, 7, 8]]) [5, 6, 7, 8]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
a.shape a.shape
``` ```
%% Output %% Output
(2, 4) (2, 4)
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
b = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]]) b = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
b b
``` ```
%% Output %% Output
array([[ 1, 2, 3], array([[ 1, 2, 3],
[ 4, 5, 6], [ 4, 5, 6],
[ 7, 8, 9], [ 7, 8, 9],
[10, 11, 12]]) [10, 11, 12]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
b.shape b.shape
``` ```
%% Output %% Output
(4, 3) (4, 3)
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
c = np.matmul(a, b) c = np.matmul(a, b)
c c
``` ```
%% Output %% Output
array([[ 70, 80, 90], array([[ 70, 80, 90],
[158, 184, 210]]) [158, 184, 210]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
c.shape c.shape
``` ```
%% Output %% Output
(2, 3) (2, 3)
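%% Cell type:markdown id: tags:
The shape rule behind this result can be sketched as follows: multiplying an `(n, k)` array by a `(k, m)` array gives an `(n, m)` array, so the inner dimensions must agree. The helper `can_matmul` is a hypothetical name introduced only for this illustration, and it covers only the two-dimensional case. The `@` operator is the infix form of `np.matmul`.
%% Cell type:code id: tags:
```python
import numpy as np

# Same values as the arrays a and b used above
a = np.arange(1, 9).reshape(2, 4)     # shape (2, 4)
b = np.arange(1, 13).reshape(4, 3)    # shape (4, 3)

def can_matmul(x, y):
    """Hypothetical helper: True if two 2-D arrays have matching inner dimensions."""
    return x.shape[1] == y.shape[0]

print(can_matmul(a, b))   # inner dimensions 4 and 4 agree
print(can_matmul(b, a))   # inner dimensions 3 and 2 do not agree

c = a @ b                 # same result as np.matmul(a, b)
print(c.shape)
```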
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
If your matrices have incompatible shapes, you'll get an error, like the following: If your matrices have incompatible shapes, you'll get an error, like the following:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
np.matmul(b, a) np.matmul(b, a)
``` ```
%% Output %% Output
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
ValueError Traceback (most recent call last) ValueError Traceback (most recent call last)
<ipython-input-235-af3b88aa2232> in <module> <ipython-input-235-af3b88aa2232> in <module>
----> 1 np.matmul(b, a) ----> 1 np.matmul(b, a)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3) ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy's `dot` function ### NumPy's `dot` function
You may sometimes see NumPy's `dot` function in places where you would expect a `matmul`. You may sometimes see NumPy's `dot` function in places where you would expect a `matmul`.
It turns out that the results of `dot` and `matmul` are the same if the matrices are two dimensional. It turns out that the results of `dot` and `matmul` are the same if the matrices are two dimensional.
So these two results are equivalent: So these two results are equivalent:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
a = np.array([[1,2],[3,4]]) a = np.array([[1,2],[3,4]])
a a
``` ```
%% Output %% Output
array([[1, 2], array([[1, 2],
[3, 4]]) [3, 4]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
np.dot(a,a) np.dot(a,a)
``` ```
%% Output %% Output
array([[ 7, 10], array([[ 7, 10],
[15, 22]]) [15, 22]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
a.dot(a) a.dot(a)
``` ```
%% Output %% Output
array([[ 7, 10], array([[ 7, 10],
[15, 22]]) [15, 22]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
np.matmul(a,a) np.matmul(a,a)
``` ```
%% Output %% Output
array([[ 7, 10], array([[ 7, 10],
[15, 22]]) [15, 22]])
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
While these functions return the same results for two dimensional data, you should be careful While these functions return the same results for two dimensional data, you should be careful
about which you choose when working with other data shapes. You can read more about the about which you choose when working with other data shapes. You can read more about the
differences, and find links to other NumPy functions, in the `matmul` and `dot` documentation. differences, and find links to other NumPy functions, in the `matmul` and `dot` documentation.
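As a small sketch of where the two functions diverge, consider three-dimensional inputs (the arrays `x` and `y` below are made up for this illustration). `matmul` treats the leading axis as a batch of matrices, while `dot` sums over the last axis of the first argument and the second-to-last axis of the second, which yields a result of a different shape:

```python
import numpy as np

x = np.ones((2, 3, 4))   # two 3x4 matrices
y = np.ones((2, 4, 5))   # two 4x5 matrices

batched = np.matmul(x, y)   # multiplies x[0] with y[0] and x[1] with y[1]
paired = np.dot(x, y)       # combines every matrix in x with every matrix in y

print(batched.shape)
print(paired.shape)
```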
### Transpose ### Transpose
Getting the transpose of a matrix is really easy in NumPy. Simply access Getting the transpose of a matrix is really easy in NumPy. Simply access
its `T` attribute. There is also a `transpose()` function which its `T` attribute. There is also a `transpose()` function which
returns the same thing, but you will rarely see that used anywhere because returns the same thing, but you will rarely see that used anywhere because
typing `T` is so much easier. typing `T` is so much easier.
For example: For example:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
m = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]]) m = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
m m
``` ```
%% Output %% Output
array([[ 1, 2, 3, 4], array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8], [ 5, 6, 7, 8],
[ 9, 10, 11, 12]]) [ 9, 10, 11, 12]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
m.T m.T
``` ```
%% Output %% Output
array([[ 1, 5, 9], array([[ 1, 5, 9],
[ 2, 6, 10], [ 2, 6, 10],
[ 3, 7, 11], [ 3, 7, 11],
[ 4, 8, 12]]) [ 4, 8, 12]])
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
NumPy does this without actually moving any data in memory - NumPy does this without actually moving any data in memory -
it simply changes the way it indexes the original matrix - it simply changes the way it indexes the original matrix -
so it's quite efficient. so it's quite efficient.
However, that also means you need to be careful with how you modify objects, However, that also means you need to be careful with how you modify objects,
because they are sharing the same data. For example, with the same matrix `m` because they are sharing the same data. For example, with the same matrix `m`
from above, let us make a new variable `m_t` that stores `m`'s transpose. from above, let us make a new variable `m_t` that stores `m`'s transpose.
Then look what happens if we modify a value in `m_t`: Then look what happens if we modify a value in `m_t`:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
m_t = m.T m_t = m.T
m_t[3][1] = 200 m_t[3][1] = 200
m_t m_t
``` ```
%% Output %% Output
array([[ 1, 5, 9], array([[ 1, 5, 9],
[ 2, 6, 10], [ 2, 6, 10],
[ 3, 7, 11], [ 3, 7, 11],
[ 4, 200, 12]]) [ 4, 200, 12]])
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
m m
``` ```
%% Output %% Output
array([[ 1, 2, 3, 4], array([[ 1, 2, 3, 4],
[ 5, 6, 7, 200], [ 5, 6, 7, 200],
[ 9, 10, 11, 12]]) [ 9, 10, 11, 12]])
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Notice how it modified both the transpose and the original matrix, too. Notice how it modified both the transpose and the original matrix, too.
That's because they are sharing the same copy of data. So remember to That's because they are sharing the same copy of data. So remember to
consider the transpose just as a different view of your matrix, rather consider the transpose just as a different view of your matrix, rather
than a different matrix entirely. than a different matrix entirely.
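If an independent transpose is needed, one option is to copy it explicitly. The following is a minimal sketch; the variable names `view` and `independent` are chosen only for this illustration, and `np.shares_memory` is used to check whether two arrays overlap in memory.

```python
import numpy as np

m = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = m.T                  # shares memory with m
independent = m.T.copy()    # owns its own data

independent[3, 1] = 200     # does not touch m
print(m[1, 3])              # still the original value

print(np.shares_memory(m, view))
print(np.shares_memory(m, independent))
```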
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Geometric Interpretation of Matrix Operations
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
To demonstrate how matrix operations can be interpreted geometrically, we will apply a couple of transformations to the $x$ and $y$ coordinates of the following points. Each point has three components: $x$, $y$, and the index of its letter label (0 for a, 1 for b, and so on), much as an image pixel has the three components $x$, $y$, and frequency (or intensity).
$$ $$
\begin{align} \begin{align}
a & = (0, 1, 0) \\ a & = (0, 1, 0) \\
b & = (1, 0, 1) \\ b & = (1, 0, 1) \\
c & = (0, -1, 2) \\ c & = (0, -1, 2) \\
d & = (-1, 0, 3) d & = (-1, 0, 3)
\end{align} \end{align}
$$ $$
The transformations for this example are scaling by $2$ in the $x$ and $y$ directions and a rotation of $90$ degrees clockwise. First we perform the transformations individually to show the direct effect each has on the points, then we combine them and apply them in one step.
To begin, we build a NumPy array (some may call this a matrix) in which each row represents a point: the first column is $x$, the second is $y$, and the third is the index of the point's letter in `string.ascii_letters`. Next we use Matplotlib to plot the points after applying the _identity transformation_, which leaves every point unchanged,
$$I= $$I=
\left(\begin{array}{rrr} \left(\begin{array}{rrr}
1 & 0 & 0 \\ 1 & 0 & 0 \\
0 & 1 & 0 \\ 0 & 1 & 0 \\
0 & 0 & 1 \\ 0 & 0 & 1 \\
\end{array}\right) \end{array}\right)
$$ $$
to give a baseline visual of where we stand. to give a baseline visual of where we stand.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import numpy as np import numpy as np
import string import string
# points a, b, c, and d
a, b, c, d = (0, 1, 0), (1, 0, 1), (0, -1, 2), (-1, 0, 3) a, b, c, d = (0, 1, 0), (1, 0, 1), (0, -1, 2), (-1, 0, 3)
# matrix with row vectors of points # matrix with row vectors of points
A = np.array([a, b, c, d]) A = np.array([a, b, c, d])
print(A) print(A)
# 3x3 Identity transformation matrix # 3x3 Identity transformation matrix
I = np.eye(3) I = np.eye(3)
print(I) print(I)
``` ```
%% Output %% Output
[[ 0 1 0] [[ 0 1 0]
[ 1 0 1] [ 1 0 1]
[ 0 -1 2] [ 0 -1 2]
[-1 0 3]] [-1 0 3]]
[[1. 0. 0.] [[1. 0. 0.]
[0. 1. 0.] [0. 1. 0.]
[0. 0. 1.]] [0. 0. 1.]]
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
color_lut = 'rgbc'
fig = plt.figure()
ax = plt.gca()
xs = []
ys = []
for row in A:
    output_row = np.matmul(I, row)
    x, y, i = output_row
    xs.append(x)
    ys.append(y)
    i = int(i)  # convert float to int for indexing
    c = color_lut[i]
    plt.scatter(x, y, color=c)
    plt.text(x + 0.15, y, f"{string.ascii_letters[i]}")
# repeat the first point so the dotted outline is closed
xs.append(xs[0])
ys.append(ys[0])
plt.plot(xs, ys, color="gray", linestyle='dotted')
ax.set_xticks(np.arange(-2.5, 3, 0.5))
ax.set_yticks(np.arange(-2.5, 3, 0.5))
plt.grid()
plt.show()
fig.savefig('identity_transformation.eps')
``` ```
%% Output %% Output
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
The four points $a$, $b$, $c$, and $d$ plotted on a grid after applying the identity transformation to each of them via a matrix-vector product, which leaves them unchanged.
We will now move on to creating a _scaling transformation_ matrix We will now move on to creating a _scaling transformation_ matrix
$T_s$, as shown below, which scales the placement of the points in all directions. $T_s$, as shown below, which scales the placement of the points in all directions.
$$T_s= $$T_s=
\left(\begin{array}{rrr} \left(\begin{array}{rrr}
2 & 0 & 0 \\ 2 & 0 & 0 \\
0 & 2 & 0 \\ 0 & 2 & 0 \\
0 & 0 & 1 \\ 0 & 0 & 1 \\
\end{array}\right) \end{array}\right)
$$ $$
Now we plot the transformed points alongside the originals, as we did for the identity transformation, but this time we apply the _scaling transformation_ matrix defined above. For a better visualization, we plot a dotted line connecting the points.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# create the scaling transformation matrix
T_s = np.array([[2, 0, 0], [0, 2, 0], [0, 0, 1]])
fig = plt.figure()
ax = plt.gca()
xs_s = []
ys_s = []
for row in A:
    output_row = np.matmul(T_s, row)
    x, y, i = row
    x_s, y_s, i_s = output_row
    xs_s.append(x_s)
    ys_s.append(y_s)
    i, i_s = int(i), int(i_s)  # make sure the indices are plain ints
    c, c_s = color_lut[i], color_lut[i_s]  # these are the same, but it's good to be explicit
    plt.scatter(x, y, color=c)
    plt.scatter(x_s, y_s, color=c_s)
    plt.text(x + 0.15, y, f"{string.ascii_letters[i]}")
    plt.text(x_s + 0.15, y_s, f"{string.ascii_letters[i_s]}'")
# repeat the first point so the dotted outline is closed
xs_s.append(xs_s[0])
ys_s.append(ys_s[0])
plt.plot(xs, ys, color="gray", linestyle='dotted')
plt.plot(xs_s, ys_s, color="gray", linestyle='dotted')
ax.set_xticks(np.arange(-2.5, 3, 0.5))
ax.set_yticks(np.arange(-2.5, 3, 0.5))
plt.grid()
plt.show()  # show the figure instead of echoing the list of tick objects
```
%% Output
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
From the plot above it should be clear that the $x$ and $y$ coordinates were scaled up by a factor of two, while the third component, which holds the letter index, was left unchanged. In fact, those familiar with matrix algebra will have noticed that every transformation matrix used here leaves the third component unaltered: its third row and third column are all zeros except for a single one on the diagonal.
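The loop above transforms one point at a time. As a sketch, the same scaling can be applied to all points in a single product: because the points are stored as rows of `A`, multiplying by the transpose `T_s.T` on the right is equivalent to applying `T_s` to each row. `np.diag` is used here only as a compact way to write the same matrix, and the name `scaled` is introduced for this illustration.

```python
import numpy as np

# Points stored as rows: x, y, letter index
A = np.array([(0, 1, 0), (1, 0, 1), (0, -1, 2), (-1, 0, 3)])

# Same matrix as T_s above, written via its diagonal
T_s = np.diag([2, 2, 1])

# Transform every row at once instead of looping
scaled = A @ T_s.T
print(scaled)

# The letter index in the third column is untouched
print(np.array_equal(scaled[:, 2], A[:, 2]))
```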
Now let us describe how to interpret the _rotation transformation_. A clockwise rotation by an angle $\theta$ about the origin is represented by the matrix
$$T_r(\theta)=
\left(\begin{array}{rrr}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1 \\
\end{array}\right)
$$
We evaluate the two trigonometric functions at the desired angle of rotation of $90$ degrees and plug the values into this matrix.
$$
\begin{align}
\sin(90^\circ) & = 1 \\
\cos(90^\circ) & = 0 \\
\end{align}
$$
$$T_r= $$T_r=
\left(\begin{array}{rrr} \left(\begin{array}{rrr}
0 & 1 & 0 \\ 0 & 1 & 0 \\
-1 & 0 & 0 \\ -1 & 0 & 0 \\
0 & 0 & 1 \\ 0 & 0 & 1 \\
\end{array}\right) \end{array}\right)
$$ $$
Now all we need to do is apply the same logic to transform and plot the points, like so: Now all we need to do is apply the same logic to transform and plot the points, like so:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# create the rotation transformation matrix
T_r = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 1]])
fig = plt.figure()
ax = plt.gca()
for row in A:
    output_row = np.matmul(T_r, row)
    x_r, y_r, i_r = output_row
    i_r = int(i_r)  # make sure the index is a plain int
    c_r = color_lut[i_r]  # the letter index, and so the color, is unchanged by the rotation
    letter_r = string.ascii_letters[i_r]
    plt.scatter(x_r, y_r, color=c_r)
    plt.text(x_r + 0.15, y_r, f"{letter_r}'")
# dotted outline of the original points (xs, ys come from the identity plot)
plt.plot(xs, ys, color="gray", linestyle='dotted')
ax.set_xticks(np.arange(-2.5, 3, 0.5))
ax.set_yticks(np.arange(-2.5, 3, 0.5))
plt.grid()
plt.show()
``` ```
%% Output %% Output
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
As the plot shows, all points were rotated 90 degrees clockwise about the origin.
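The matrix $T_r$ is the 90-degree case of a clockwise rotation. The following is a minimal sketch that builds such a matrix for an arbitrary angle; `rotation_cw` is a hypothetical helper name introduced here, not a NumPy function. Rounding is applied only to remove the tiny floating-point residue of $\cos(90^\circ)$.

```python
import numpy as np

def rotation_cw(angle_deg):
    """Hypothetical helper: 3x3 matrix for a clockwise rotation about the origin."""
    theta = np.radians(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0],
                     [-s, c, 0],
                     [0, 0, 1]])

# For 90 degrees this reproduces T_r up to floating-point rounding
T_r = np.round(rotation_cw(90)).astype(int)
print(T_r)

# Point a = (0, 1) with letter index 0 is moved onto the positive x-axis
print(T_r @ np.array([0, 1, 0]))
```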
The neat thing about these transformations being linear is that you can _combine_ them and apply them in one step. To demonstrate this, we compute the matrix product of our two transformation matrices:
$$ $$
\begin{align*} \begin{align*}
T_{\text{comb}}= T_{\text{comb}}=
\left(\begin{array}{rrr} \left(\begin{array}{rrr}
2 & 0 & 0 \\ 2 & 0 & 0 \\
0 & 2 & 0 \\ 0 & 2 & 0 \\
0 & 0 & 1 \\ 0 & 0 & 1 \\
\end{array}\right) \end{array}\right)
\cdot
\left(\begin{array}{rrr} \left(\begin{array}{rrr}
0 & 1 & 0 \\ 0 & 1 & 0 \\
-1 & 0 & 0 \\ -1 & 0 & 0 \\
0 & 0 & 1 \\ 0 & 0 & 1 \\
\end{array}\right) = \left(\begin{array}{rrr} \end{array}\right) = \left(\begin{array}{rrr}
0 & 2 & 0 \\ 0 & 2 & 0 \\
-2 & 0 & 0 \\ -2 & 0 & 0 \\
0 & 0 & 1 \\ 0 & 0 & 1 \\
\end{array}\right) \end{array}\right)
\end{align*} \end{align*}
$$ $$
Now we can apply this combined transformation matrix to the points and replot them to show a clockwise rotation by 90 degrees followed by scaling by a factor of two.
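As a quick check of this claim, the sketch below compares applying the two matrices one after the other with applying their product once. The variable names `step_by_step` and `combined` are chosen only for this illustration. In general the order of the factors matters, but a uniform scaling of $x$ and $y$ happens to commute with a rotation, which the last line confirms for these two matrices.

```python
import numpy as np

A = np.array([(0, 1, 0), (1, 0, 1), (0, -1, 2), (-1, 0, 3)])
T_s = np.array([[2, 0, 0], [0, 2, 0], [0, 0, 1]])
T_r = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 1]])

# Rotate first, then scale (points are columns of A.T)
step_by_step = (T_s @ (T_r @ A.T)).T

# Apply the combined matrix in one step
T = T_s @ T_r
combined = (T @ A.T).T

print(combined)
print(np.array_equal(step_by_step, combined))
print(np.array_equal(T_s @ T_r, T_r @ T_s))
```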
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# create combined transformation matrix
T = np.matmul(T_s, T_r)
fig = plt.figure()
ax = plt.gca()
xs_comb = []
ys_comb = []
for row in A:
    output_row = np.matmul(T, row)
    x, y, i = row
    x_comb, y_comb, i_comb = output_row
    xs_comb.append(x_comb)
    ys_comb.append(y_comb)
    i, i_comb = int(i), int(i_comb)  # make sure the indices are plain ints
    c, c_comb = color_lut[i], color_lut[i_comb]  # these are the same, but it's good to be explicit
    letter, letter_comb = string.ascii_letters[i], string.ascii_letters[i_comb]
    plt.scatter(x, y, color=c)
    plt.scatter(x_comb, y_comb, color=c_comb)
    plt.text(x + 0.15, y, f"{letter}")
    plt.text(x_comb + 0.15, y_comb, f"{letter_comb}'")
# repeat the first point so the dotted outline is closed
xs_comb.append(xs_comb[0])
ys_comb.append(ys_comb[0])
plt.plot(xs, ys, color="gray", linestyle='dotted')
plt.plot(xs_comb, ys_comb, color="gray", linestyle='dotted')
ax.set_xticks(np.arange(-2.5, 3, 0.5))
ax.set_yticks(np.arange(-2.5, 3, 0.5))
plt.grid()
plt.show()
fig.savefig('combined_transformation.eps')
``` ```
%% Output %% Output
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## A Real Use Case ## A Real Use Case
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
inputs = np.array([[-0.27, 0.45, 0.64, 0.31]]) inputs = np.array([[-0.27, 0.45, 0.64, 0.31]])
print(inputs, inputs.shape) print(inputs, inputs.shape)
weights = np.array([[0.02, 0.001, -0.03, 0.036], weights = np.array([[0.02, 0.001, -0.03, 0.036],
[0.04, -0.003, 0.025, 0.009], [0.04, -0.003, 0.025, 0.009],
[0.012, -0.045, 0.28, -0.067]]) [0.012, -0.045, 0.28, -0.067]])
print(weights, weights.shape) print(weights, weights.shape)
print("Matrix multiplication gives:\n", np.matmul(inputs, weights.T), "\nor, equivalently:\n", np.matmul(weights, inputs.T)) print("Matrix multiplication gives:\n", np.matmul(inputs, weights.T), "\nor, equivalently:\n", np.matmul(weights, inputs.T))
``` ```
%% Output %% Output
[[-0.27 0.45 0.64 0.31]] (1, 4) [[-0.27 0.45 0.64 0.31]] (1, 4)
[[ 0.02 0.001 -0.03 0.036] [[ 0.02 0.001 -0.03 0.036]
[ 0.04 -0.003 0.025 0.009] [ 0.04 -0.003 0.025 0.009]
[ 0.012 -0.045 0.28 -0.067]] (3, 4) [ 0.012 -0.045 0.28 -0.067]] (3, 4)
Matrix multiplication gives: Matrix multiplication gives:
[[-0.01299 0.00664 0.13494]] [[-0.01299 0.00664 0.13494]]
or, equivalently: or, equivalently:
[[-0.01299] [[-0.01299]
[ 0.00664] [ 0.00664]
[ 0.13494]] [ 0.13494]]
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Some More Useful Numpy Methods ## Some More Useful Numpy Methods
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
print("\nShowing some basic math on arrays") print("\nShowing some basic math on arrays")
b = np.array([0,1,4,3,2]) b = np.array([0,1,4,3,2])
print("Max: {}".format(np.max(b))) print("Max: {}".format(np.max(b)))
print("Average: {}".format(np.average(b))) print("Average: {}".format(np.average(b)))
print("Max index: {}".format(np.argmax(b))) print("Max index: {}".format(np.argmax(b)))
print("\nUse numpy to create a [3,3] dimension array with random number") print("\nUse numpy to create a [3,3] dimension array with random number")
c = np.random.rand(3, 3) c = np.random.rand(3, 3)
print(c) print(c)
``` ```
%% Output %% Output
Showing some basic math on arrays Showing some basic math on arrays
Max: 4 Max: 4
Average: 2.0 Average: 2.0
Max index: 2 Max index: 2
Use numpy to create a [3,3] dimension array with random number Use numpy to create a [3,3] dimension array with random number
[[0.55382541 0.36160037 0.68662274] [[0.55382541 0.36160037 0.68662274]
[0.35028249 0.09885074 0.38463975] [0.35028249 0.09885074 0.38463975]
[0.10113939 0.74698869 0.02460112]] [0.10113939 0.74698869 0.02460112]]
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Pandas ## Pandas
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Pandas is frequently used for statistical data analysis. It provides two important data structures, `Series` and `DataFrame`, which are discussed in more detail in the next section.
### Series and DataFrame
A `Series` is a one-dimensional vector accompanied by an index; all of its entries have the same datatype. A `DataFrame` is a two-dimensional table in which each column can have a different _datatype_.
In the next example, we create a `Series` containing fictitious temperatures in Luzern for each month from January to July:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import pandas as pd import pandas as pd
from pandas import Series, DataFrame from pandas import Series, DataFrame
temp_luz = Series([1, 5, 9, 15, 20, 25, 25]) temp_luz = Series([1, 5, 9, 15, 20, 25, 25])
temp_luz temp_luz
print(temp_luz) print(temp_luz)
``` ```
%% Output %% Output
0 1 0 1
1 5 1 5
2 9 2 9
3 15 3 15
4 20 4 20
5 25 5 25
6 25 6 25
dtype: int64 dtype: int64
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
The left column is not part of the data; it is the default integer `Index`, which we can use to access the entries.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import pandas as pd import pandas as pd
from pandas import Series, DataFrame from pandas import Series, DataFrame
temp_luz = Series([1, 5, 9, 15, 20, 25, 25]) temp_luz = Series([1, 5, 9, 15, 20, 25, 25])
temp_luz[2] temp_luz[2]
print(temp_luz[2]) print(temp_luz[2])
``` ```
%% Output %% Output
9 9
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Note that the index begins at 0, not at 1. The index has no meaning in itself, and we can replace it with something else. For example:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import pandas as pd import pandas as pd
from pandas import Series, DataFrame from pandas import Series, DataFrame
temp_luz = Series( temp_luz = Series(
[1, 5, 9, 15, 20, 25, 25], [1, 5, 9, 15, 20, 25, 25],
index=("jan", "feb", "mar", "apr", "mai", "jun", "jul") index=("jan", "feb", "mar", "apr", "mai", "jun", "jul")
) )
temp_luz temp_luz
print(temp_luz) print(temp_luz)
``` ```
%% Output %% Output
jan 1 jan 1
feb 5 feb 5
mar 9 mar 9
apr 15 apr 15
mai 20 mai 20
jun 25 jun 25
jul 25 jul 25
dtype: int64 dtype: int64
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Now we can look up the temperature in March by its label:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import pandas as pd import pandas as pd
from pandas import Series, DataFrame from pandas import Series, DataFrame
temp_luz = Series([1,5,9,15,20,25,25], temp_luz = Series([1,5,9,15,20,25,25],
index=("jan","feb","mar","apr","mai","jun","jul") index=("jan","feb","mar","apr","mai","jun","jul")
) )
temp_luz["mar"] temp_luz["mar"]
print(temp_luz["mar"]) print(temp_luz["mar"])
``` ```
%% Output %% Output
9 9
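%% Cell type:markdown id: tags:
Once the index consists of labels, it can help to state explicitly whether an entry is selected by label or by position. The following is a minimal sketch using the `.loc` (label-based) and `.iloc` (position-based) accessors; the variable names `by_label`, `by_position`, and `spring` are chosen only for this illustration. Note that a label slice with `.loc` includes both endpoints.
%% Cell type:code id: tags:
```python
import pandas as pd

temp_luz = pd.Series(
    [1, 5, 9, 15, 20, 25, 25],
    index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"],
)

by_label = temp_luz.loc["mar"]        # select by index label
by_position = temp_luz.iloc[2]        # select by integer position
spring = temp_luz.loc["mar":"mai"]    # label slice, both endpoints included

print(by_label, by_position)
print(spring)
```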
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
A `DataFrame` is a two-dimensional dataset, comparable to a matrix. In the following example, we create a table containing fictitious temperatures from January to July in Luzern, Basel, and Zürich. In this module, we will usually load data from a given dataset rather than create the data ourselves.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import pandas as pd import pandas as pd
from pandas import Series, DataFrame from pandas import Series, DataFrame
temp = DataFrame({ temp = DataFrame({
"Luzern": ([1,5,9,15,20,25,25]), "Luzern": ([1,5,9,15,20,25,25]),
"Basel": ([3,4,12,16,18,23,32]), "Basel": ([3,4,12,16,18,23,32]),
"Zuerich": ([8,6,10,17,23,22,24])}, "Zuerich": ([8,6,10,17,23,22,24])},
index=["jan","feb","mar","apr","mai","jun","jul"] index=["jan","feb","mar","apr","mai","jun","jul"]
) )
temp temp
print(temp) print(temp)
``` ```
%% Output %% Output
Luzern Basel Zuerich Luzern Basel Zuerich
jan 1 3 8 jan 1 3 8
feb 5 4 6 feb 5 4 6
mar 9 12 10 mar 9 12 10
apr 15 16 17 apr 15 16 17
mai 20 18 23 mai 20 18 23
jun 25 23 22 jun 25 23 22
jul 25 32 24 jul 25 32 24
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
The columns appear in the order in which they were given in the dictionary. The top row and leftmost column are again _not_ part of the dataset, but just the indices. These can be used to access entries, on which we will now elaborate.
With the attribute `.columns` we get the column names:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp.columns
print(temp.columns)
```
%% Output
    Index(['Luzern', 'Basel', 'Zuerich'], dtype='object')
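%% Cell type:markdown id: tags:
If an alphabetical column order is wanted, it can be requested explicitly. The following is a minimal sketch using `sort_index(axis=1)`; the name `temp_sorted` is chosen here for illustration only and is not used elsewhere in this notebook.
%% Cell type:code id: tags:
``` python
import pandas as pd

temp = pd.DataFrame({
    "Luzern": [1, 5, 9, 15, 20, 25, 25],
    "Basel": [3, 4, 12, 16, 18, 23, 32],
    "Zuerich": [8, 6, 10, 17, 23, 22, 24]},
    index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"]
)

# Sort the column labels alphabetically; each column's data moves with its label.
# temp_sorted is an illustrative name; temp itself is left unchanged.
temp_sorted = temp.sort_index(axis=1)

print(list(temp.columns))
print(list(temp_sorted.columns))
```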
%% Cell type:markdown id: tags:
When working with larger datasets, an important task is to select a subset of rows and columns from the full dataset. Pandas provides several ways to do this, of which we will cover only a few.
We can select a single column by its name, which results in a `Series`:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp["Luzern"]
print(temp["Luzern"])
```
%% Output
    jan     1
    feb     5
    mar     9
    apr    15
    mai    20
    jun    25
    jul    25
    Name: Luzern, dtype: int64
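%% Cell type:markdown id: tags:
Passing a list of column names instead of a single name returns a `DataFrame` rather than a `Series`. A small sketch of the difference; the name `two_cities` is only illustrative.
%% Cell type:code id: tags:
``` python
import pandas as pd

temp = pd.DataFrame({
    "Luzern": [1, 5, 9, 15, 20, 25, 25],
    "Basel": [3, 4, 12, 16, 18, 23, 32],
    "Zuerich": [8, 6, 10, 17, 23, 22, 24]},
    index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"]
)

one_city = temp["Luzern"]               # single label -> Series
two_cities = temp[["Luzern", "Basel"]]  # list of labels -> DataFrame

print(type(one_city).__name__)
print(type(two_cities).__name__)
print(two_cities.shape)
```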
%% Cell type:markdown id: tags:
Another option is the `.loc[]` indexer, which selects rows and columns by their labels. Note that it is used with square brackets, not parentheses. Selecting a single column this way again yields a `Series`:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp.loc[:,"Luzern"]
print(temp.loc[:,"Luzern"])
```
%% Output
    jan     1
    feb     5
    mar     9
    apr    15
    mai    20
    jun    25
    jul    25
    Name: Luzern, dtype: int64
%% Cell type:markdown id: tags:
The single `:` as the first argument signifies that all rows should be selected. Instead, we could also select only a few rows:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp.loc["mai":"jul","Luzern"]
print(temp.loc["mai":"jul","Luzern"])
```
%% Output
    mai    20
    jun    25
    jul    25
    Name: Luzern, dtype: int64
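%% Cell type:markdown id: tags:
Unlike ordinary Python slices, a label slice in `.loc[]` includes its end label. The sketch below contrasts the two; `months`, `by_label` and `plain` are illustrative names.
%% Cell type:code id: tags:
``` python
import pandas as pd

months = ["jan", "feb", "mar", "apr", "mai", "jun", "jul"]
temp = pd.DataFrame({
    "Luzern": [1, 5, 9, 15, 20, 25, 25],
    "Basel": [3, 4, 12, 16, 18, 23, 32],
    "Zuerich": [8, 6, 10, 17, 23, 22, 24]},
    index=months
)

# Label slice: both "mai" and "jul" are part of the result.
by_label = temp.loc["mai":"jul", "Luzern"]

# Plain Python list slice: the stop position 6 ("jul") is excluded.
plain = months[4:6]

print(list(by_label.index))
print(plain)
```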
%% Cell type:markdown id: tags:
Here the `:` means _from ... to ..._, with both end labels included. To pick several specific rows and columns instead of a range, we pass the labels as lists in square brackets:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp.loc[["mai","jul"],["Basel","Zuerich"]]
print(temp.loc[["mai","jul"],["Basel","Zuerich"]])
```
%% Output
         Basel  Zuerich
    mai     18       23
    jul     32       24
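%% Cell type:markdown id: tags:
A list of labels selects exactly the listed entries, whereas a slice selects everything between its two end labels. A brief sketch comparing both forms on the same data; `picked` and `ranged` are illustrative names.
%% Cell type:code id: tags:
``` python
import pandas as pd

temp = pd.DataFrame({
    "Luzern": [1, 5, 9, 15, 20, 25, 25],
    "Basel": [3, 4, 12, 16, 18, 23, 32],
    "Zuerich": [8, 6, 10, 17, 23, 22, 24]},
    index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"]
)

# List of labels: only "mai" and "jul" are returned.
picked = temp.loc[["mai", "jul"], ["Basel", "Zuerich"]]

# Slice of labels: "jun" lies between "mai" and "jul" and is returned as well.
ranged = temp.loc["mai":"jul", "Basel":"Zuerich"]

print(picked.shape)
print(ranged.shape)
```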
%% Cell type:markdown id: tags:
When selecting a single entry, the inner list brackets can be left out:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp.loc["mai","Zuerich"]
print(temp.loc["mai","Zuerich"])
```
%% Output
    23
%% Cell type:markdown id: tags:
Another option is to use zero-based integer positions, as is usual in Python, via the `.iloc[]` indexer. For example, `temp.iloc[3, 1]` selects row 4 and column 2. Here we will usually use the named, meaningful indices.
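%% Cell type:markdown id: tags:
Positional access uses the `.iloc[]` indexer of pandas. A short sketch showing that position `[3, 1]` and the labels `"apr"`, `"Basel"` refer to the same entry; the names `by_position` and `by_label` are illustrative.
%% Cell type:code id: tags:
``` python
import pandas as pd

temp = pd.DataFrame({
    "Luzern": [1, 5, 9, 15, 20, 25, 25],
    "Basel": [3, 4, 12, 16, 18, 23, 32],
    "Zuerich": [8, 6, 10, 17, 23, 22, 24]},
    index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"]
)

# Positions are zero-based: row 3 is "apr", column 1 is "Basel".
by_position = temp.iloc[3, 1]
by_label = temp.loc["apr", "Basel"]

print(by_position, by_label)
```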
%% Cell type:markdown id: tags:
### Basic Operations
On `Series` and `DataFrame` objects we can perform many different _operations_. These are either _attributes_ or _methods_. Put simply, attributes take no arguments, whereas methods can. An attribute is thus more like a variable attached to an object, while a method is a function. Both attributes and methods follow the Series or DataFrame, separated by a dot.
For example, we can find the index of a Series using `.index`:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp_luz = Series([1,5,9,15,20,25,25],
    index=("jan","feb","mar","apr","mai","jun","jul")
)
temp_luz.index
print(temp_luz.index)
```
%% Output
    Index(['jan', 'feb', 'mar', 'apr', 'mai', 'jun', 'jul'], dtype='object')
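%% Cell type:markdown id: tags:
The distinction between attributes and methods can be checked directly: a method is callable, an attribute value generally is not. A small sketch using the `.size` attribute and the `.max()` method, neither of which appears elsewhere in this section.
%% Cell type:code id: tags:
``` python
import pandas as pd

temp_luz = pd.Series([1, 5, 9, 15, 20, 25, 25],
                     index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"])

n_values = temp_luz.size   # attribute: no parentheses, no arguments
warmest = temp_luz.max()   # method: called with parentheses

print(n_values, warmest)
print(callable(temp_luz.max))   # the method itself is a function object
print(callable(temp_luz.size))  # the attribute is a plain integer
```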
%% Cell type:markdown id: tags:
With the method `.mean()` we can find the average of a `Series` or `DataFrame`:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp_luz = Series([1,5,9,15,20,25,25],
    index=("jan","feb","mar","apr","mai","jun","jul")
)
temp_luz.mean()
print(temp_luz.mean())
```
%% Output
    14.285714285714286
%% Cell type:markdown id: tags:
Inside the parentheses we can pass options or parameters. This is rarely needed for a Series, but it can be very useful for a DataFrame.
%% Cell type:markdown id: tags:
The method `.mean()` on the DataFrame returns a Series with the mean temperature of each city. If we want the mean temperature of each month instead, we have to pass the option `axis=1` (the default is `axis=0`).
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp.mean(axis=1)
print(temp.mean(axis=1))
```
%% Output
    jan     4.000000
    feb     5.000000
    mar    10.333333
    apr    16.000000
    mai    20.333333
    jun    23.333333
    jul    27.000000
    dtype: float64
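%% Cell type:markdown id: tags:
For comparison, the default `axis=0` averages down each column and so gives one value per city. A minimal sketch of both directions; `city_means` and `month_means` are illustrative names.
%% Cell type:code id: tags:
``` python
import pandas as pd

temp = pd.DataFrame({
    "Luzern": [1, 5, 9, 15, 20, 25, 25],
    "Basel": [3, 4, 12, 16, 18, 23, 32],
    "Zuerich": [8, 6, 10, 17, 23, 22, 24]},
    index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"]
)

city_means = temp.mean()         # same as temp.mean(axis=0): one mean per column
month_means = temp.mean(axis=1)  # one mean per row

print(city_means)
print(month_means)
```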
%% Cell type:markdown id: tags:
We can also find the minimum temperature in each month:
%% Cell type:code id: tags:
``` python
import pandas as pd
from pandas import Series, DataFrame
temp = DataFrame({
    "Luzern": ([1,5,9,15,20,25,25]),
    "Basel": ([3,4,12,16,18,23,32]),
    "Zuerich": ([8,6,10,17,23,22,24])},
    index=["jan","feb","mar","apr","mai","jun","jul"]
)
temp.min(axis=1)
print(temp.min(axis=1))
```
%% Output
    jan     1
    feb     4
    mar     9
    apr    15
    mai    18
    jun    22
    jul    24
    dtype: int64
%% Cell type:markdown id: tags:
Similarly, we could find the maximum temperature in each month using `temp.max(axis=1)`.
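%% Cell type:markdown id: tags:
The maximum per month can be sketched as follows. As an addition not covered in the text above, the pandas method `.idxmax(axis=1)` reports which city holds that maximum; `warmest` and `warmest_city` are illustrative names.
%% Cell type:code id: tags:
``` python
import pandas as pd

temp = pd.DataFrame({
    "Luzern": [1, 5, 9, 15, 20, 25, 25],
    "Basel": [3, 4, 12, 16, 18, 23, 32],
    "Zuerich": [8, 6, 10, 17, 23, 22, 24]},
    index=["jan", "feb", "mar", "apr", "mai", "jun", "jul"]
)

warmest = temp.max(axis=1)          # highest temperature in each month
warmest_city = temp.idxmax(axis=1)  # column label where that maximum occurs

print(warmest)
print(warmest_city)
```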