This week's recommended readings are Goodfellow's *Deep Learning* textbook Chapter 6 (pp. 164-172) and Kriegeskorte and Golan's introduction to neural networks (pp. 1-5). The latter can be downloaded from the main course page. For those interested, the reading materials also contain the original paper on the perceptron.

In the late 1950s, Frank Rosenblatt proposed the perceptron, an algorithm for pattern classification loosely inspired by biological neurons. The perceptron consists of a set of input units connected to a single output unit. The output unit takes a weighted sum of the inputs and passes this sum through a step activation function, outputting 1 if the sum exceeds a threshold and 0 otherwise. Importantly, the perceptron's weights are trainable. This allows the perceptron to learn input-output mappings from examples, a major advance over the earlier McCulloch-Pitts neuron model, whose weights were fixed.

The perceptron function is defined by the equation:

\begin{equation} f(x) = \begin{cases} 1 & \text{if } w \cdot x + b > 0,\\ 0 & \text{otherwise} \end{cases} \end{equation}

where \(x\) is the input vector, \(w\) is the vector of weights connecting the input units to the output unit, and \(b\) is the bias term. \(w \cdot x\) is the dot product \(\sum_{i=1}^{n} w_{i} x_{i}\), where \(n\) is the number of inputs to the perceptron.
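To make the definition concrete, here is a small worked example. The weights, bias and input below are hypothetical numbers chosen purely for illustration; they are not the pretrained weights used later in the lab.

```
# worked example of the perceptron function
# (hypothetical weights, bias and input, for illustration only)
import numpy as np

w = np.array([0.5, 0.5])   # weights
b = -0.4                   # bias
x = np.array([0.7, 0.6])   # one apple: fairly red (x1), fairly large (x2)

summation = np.dot(w, x) + b   # 0.5*0.7 + 0.5*0.6 - 0.4 = 0.25
f = 1 if summation > 0 else 0  # step activation
print(f)                       # 0.25 > 0, so the output is 1
```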

We will use the perceptron to classify inputs into two categories. Let our categories be ripe and unripe apples. Each apple is described by two values: colour (\(x_{1}\)) and size (\(x_{2}\)). Values on the colour dimension range from green (low) to red (high). Values on the size dimension range from small (low) to large (high). Unripe apples have low values on both dimensions; ripe apples have high values on both dimensions. Let's examine how well a pretrained perceptron can classify apples as ripe or unripe.

```
# import libraries
import numpy as np
from numpy import random
import matplotlib.pyplot as plt
import numpy.matlib

# function definitions
# generate data for 2 classes (input x and output f(x))
def generate_data(means, sigma, ndatapoints):
    nclasses = 2
    data = np.zeros((nclasses * ndatapoints, 3))
    for c in range(0, nclasses):
        starti = c * ndatapoints
        endi = (c + 1) * ndatapoints
        data[starti:endi, 0:2] = means[c] + sigma * random.standard_normal((ndatapoints, 2))
        data[starti:endi, 2] = c
    # shuffle the datapoints
    randvec = np.random.permutation(nclasses * ndatapoints)
    data = data[randvec, :]
    return data, randvec

# plot the decision boundary
def plot_boundary(weights, figi):
    b = weights[0]; w1 = weights[1]; w2 = weights[2]
    slope = -w1 / w2           # slope of the line w1*x1 + w2*x2 + b = 0
    y_intercept = -b / w2
    x = np.linspace(0, 1, 100)
    y = (slope * x) + y_intercept
    plt.figure(figi)
    plt.plot(x, y)
    plt.pause(0.4)

# predict output
def predict(inputs, weights):
    summation = np.dot(inputs, weights[1:]) + weights[0]
    if summation > 0:
        prediction = 1
    else:
        prediction = 0
    return prediction

# test perceptron
def test(data, weights):
    inputs_test = data[:, 0:2]
    labels = data[:, 2]
    npredictions = data.shape[0]
    predictions = np.zeros(npredictions)
    for i in range(0, npredictions):
        predictions[i] = predict(inputs_test[i, :], weights)
    return predictions

# generate test data
means = (0.3, 0.7)
sigma = 0.08
ndatapoints = 50
data_test, randvec_test = generate_data(means, sigma, ndatapoints)

# pretrained weights (b, w1, w2)
weights = np.array([-0.01261552, 0.00952113, 0.01201932])

# show generated data and decision boundary
colors_test = np.concatenate((np.matlib.repmat(np.array([1, 0.5, 1]), ndatapoints, 1),
                              np.matlib.repmat(np.array([0.5, 0.5, 1]), ndatapoints, 1)))
colors_test = colors_test[randvec_test, :]
figi_test = 2; plt.figure(figi_test)
plt.scatter(data_test[:, 0], data_test[:, 1], c=colors_test, alpha=0.5)
plt.axis('square')
plt.xlabel('x1 (0 = green, 1 = red)')
plt.ylabel('x2 (0 = small, 1 = large)')
plt.title('classes of apples (test data)')
plot_boundary(weights, figi_test)

# inspect predictions
predictions = test(data_test, weights)
labels_test = data_test[:, 2]
errors = labels_test - predictions  # nonzero entries indicate errors
nerrors = np.sum(errors**2)
```

Play around with different parameters for generating the test data, i.e. change the means and sigma, and examine the errors (the current parameters are identical to the parameters used for training).
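As a starting point, the following standalone sketch counts the test errors for a few values of sigma. It compactly re-implements the lab's data generation and prediction steps (without shuffling or plotting, which don't affect the error count) so that it runs on its own; the class means and pretrained weights are those used above, and the chosen sigma values are just examples.

```
# standalone sketch: test errors as a function of the noise level sigma
import numpy as np
from numpy import random

random.seed(0)  # fixed seed so the run is reproducible

weights = np.array([-0.01261552, 0.00952113, 0.01201932])  # pretrained (b, w1, w2)
means = (0.3, 0.7)
ndatapoints = 50

results = {}
for sigma in (0.05, 0.1, 0.2, 0.4):
    # generate ndatapoints apples per class around each class mean
    data = np.zeros((2 * ndatapoints, 3))
    for c in range(2):
        data[c*ndatapoints:(c+1)*ndatapoints, 0:2] = \
            means[c] + sigma * random.standard_normal((ndatapoints, 2))
        data[c*ndatapoints:(c+1)*ndatapoints, 2] = c
    # vectorised version of the lab's predict/test loop
    predictions = (data[:, 0:2] @ weights[1:] + weights[0] > 0).astype(float)
    results[sigma] = int(np.sum((data[:, 2] - predictions) ** 2))
    print(f"sigma = {sigma:.2f}: {results[sigma]} errors out of {2 * ndatapoints}")
```

As sigma grows, the two clouds of apples overlap more and more, so a fixed linear boundary necessarily misclassifies more points.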

The perceptron's learning rule is defined as follows:

\begin{equation} w_{i}(t + 1) = w_{i}(t) + \alpha(y - \hat{y}(t))x_{i} \end{equation}

where \(w_{i}\) is the weight associated with input unit \(x_{i}\), \(t\) is time, \(\alpha\) is the learning rate, \(y\) is the correct output, and \(\hat{y}\) is the output of the perceptron for input vector \(x\). The learning rule is applied to each weight after each input. The bias term is also adjusted after each input, according to the learning rule below. In our Python code, the bias term will be indicated with \(w_{0}\).
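One application of this rule can be traced by hand. In the example below (hypothetical weights, bias, input and learning rate, chosen for illustration), the perceptron predicts 1 while the correct output is 0, so the error term \(y - \hat{y}\) is \(-1\) and the weights and bias are pushed downwards.

```
# one hand-traceable update of the perceptron learning rule
# (hypothetical numbers, for illustration only)
import numpy as np

alpha = 0.1                 # learning rate
w = np.array([0.2, -0.3])   # current weights
b = 0.05                    # current bias
x = np.array([1.0, 0.5])    # input vector
y = 0                       # correct output

y_hat = 1 if np.dot(w, x) + b > 0 else 0  # 0.2 - 0.15 + 0.05 = 0.1 > 0, so y_hat = 1
w = w + alpha * (y - y_hat) * x           # [0.2, -0.3] - 0.1*[1.0, 0.5] = [0.1, -0.35]
b = b + alpha * (y - y_hat)               # 0.05 - 0.1 = -0.05
```

Had the prediction been correct, \(y - \hat{y}\) would be 0 and neither the weights nor the bias would change.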

\begin{equation} b(t + 1) = b(t) + \alpha(y - \hat{y}(t)) \end{equation}

```
# function definitions
def train(data, learning_rate, niterations, figi):
    training_inputs = data[:, 0:2]
    labels = data[:, 2]
    # small random initial weights (b, w1, w2)
    weights = 0.001 * random.standard_normal(data.shape[1])
    errors = np.zeros((data.shape[0], niterations))
    j = 0
    for _ in range(niterations):
        i = 0
        for inputs, label in zip(training_inputs, labels):
            prediction = predict(inputs, weights)
            weights[1:] += learning_rate * (label - prediction) * inputs
            weights[0] += learning_rate * (label - prediction)
            errors[i, j] = label - prediction
            plot_boundary(weights, figi)
            i += 1
        j += 1
    return weights, errors

# generate training data
means = (0.3, 0.7)
sigma = 0.08
ndatapoints = 20
data_train, randvec_train = generate_data(means, sigma, ndatapoints)

# show generated data
colors_train = np.concatenate((np.matlib.repmat(np.array([1, 0.5, 1]), ndatapoints, 1),
                               np.matlib.repmat(np.array([0.5, 0.5, 1]), ndatapoints, 1)))
colors_train = colors_train[randvec_train, :]
figi_train = 1; plt.figure(figi_train)
plt.scatter(data_train[:, 0], data_train[:, 1], c=colors_train, alpha=0.5)
plt.axis('square')
plt.xlabel('x1 (0 = green, 1 = red)')
plt.ylabel('x2 (0 = small, 1 = large)')
plt.title('classes of apples (training data)')

# train perceptron
learning_rate = 0.01
niterations = 2
plt.figure(figi_train)
plt.xlim(-10, 10); plt.ylim(-10, 10)
weights, errors = train(data_train, learning_rate, niterations, figi_train)
sse = np.sum(errors**2, 0)  # sum of squared errors per iteration
plt.figure(figi_train); plt.xlim(0, 1); plt.ylim(0, 1)  # zoom in to final solution
```

Train the perceptron again on the same training data. Do you get the same solution?

Play around with different parameters for generating the training data, i.e. change the means and sigma and examine whether the perceptron algorithm converges.
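One case worth seeing explicitly: when the class means are close together relative to sigma, the two classes overlap, no separating line exists, and the perceptron keeps making errors no matter how long it trains. The standalone sketch below demonstrates this; it re-implements the lab's training loop without the plotting so it runs on its own, and the parameter values (means 0.4 and 0.6, sigma 0.3) are illustrative choices, not the lab's defaults.

```
# standalone sketch: training on overlapping (non-separable) classes
import numpy as np
from numpy import random

random.seed(0)  # fixed seed for reproducibility

# generate heavily overlapping training data
n = 20
means, sigma = (0.4, 0.6), 0.3   # class means close together, large noise
data = np.zeros((2 * n, 3))
for c in range(2):
    data[c*n:(c+1)*n, 0:2] = means[c] + sigma * random.standard_normal((n, 2))
    data[c*n:(c+1)*n, 2] = c

# perceptron training loop (no plotting)
weights = 0.001 * random.standard_normal(3)  # (b, w1, w2)
alpha = 0.01
for epoch in range(50):
    nerrors = 0
    for row in data:
        x, y = row[0:2], row[2]
        y_hat = 1 if np.dot(x, weights[1:]) + weights[0] > 0 else 0
        weights[1:] += alpha * (y - y_hat) * x
        weights[0] += alpha * (y - y_hat)
        nerrors += int(y != y_hat)
print(nerrors)  # errors remaining in the final epoch
```

Even after 50 passes through the data, the final epoch still contains misclassifications, because no setting of the weights can separate overlapping classes with a straight line.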

Now test the perceptron that you just trained. The implementation of this step is identical to *Predicting outputs using a pretrained perceptron*, except that you are now using your 'own' weights.

```
# generate test data
means = (0.3, 0.7)
sigma = 0.08
ndatapoints = 50
data_test, randvec_test = generate_data(means, sigma, ndatapoints)

# show generated data and decision boundary
colors_test = np.concatenate((np.matlib.repmat(np.array([1, 0.5, 1]), ndatapoints, 1),
                              np.matlib.repmat(np.array([0.5, 0.5, 1]), ndatapoints, 1)))
colors_test = colors_test[randvec_test, :]
figi_test = 3; plt.figure(figi_test)
plt.scatter(data_test[:, 0], data_test[:, 1], c=colors_test, alpha=0.5)
plt.axis('square')
plt.xlabel('x1 (0 = green, 1 = red)')
plt.ylabel('x2 (0 = small, 1 = large)')
plt.title('classes of apples (test data)')
plot_boundary(weights, figi_test)

# test
predictions = test(data_test, weights)
labels_test = data_test[:, 2]
errors = labels_test - predictions  # nonzero entries indicate errors
nerrors = np.sum(errors**2)
```

Sources for this programming lab:

- perceptron training
- decision boundaries