Introduction to Neural Networks

Programming Lab 9: Artificial neural networks


Readings

This week's recommended readings are Goodfellow's Deep Learning textbook Chapter 6 (pp. 164-172) and Kriegeskorte and Golan's introduction to neural networks (pp. 1-5). The latter can be downloaded from the main course page. For those interested, the reading materials also contain the original paper on the perceptron.


The perceptron

In the late 1950s, Frank Rosenblatt proposed the perceptron, an algorithm for pattern classification loosely inspired by biological neurons. The perceptron consists of a set of input units that are connected to an output unit. The output unit takes a linear weighted sum of the inputs and passes the sum through a step (hard threshold) activation function. Importantly, the perceptron's weights are trainable. This feature gives the perceptron the ability to learn input-output mappings, a major advance over the earlier McCulloch-Pitts neuron model.

Fig 1 The perceptron learns a function that maps input \(x\) to an output value \(f(x)\) by adjusting the weights \(w\) and bias \(b\). \(x\) is a real-valued vector, \(f(x)\) is a single binary value.

The perceptron function is defined by the equation:

\begin{equation} f(x) = \begin{cases} 1 & \text{if $w \cdot x + b > 0$,}\\ 0 & \text{otherwise} \end{cases} \end{equation}

where \(x\) is the input vector, \(w\) are the weights connecting the input units to the output unit, and \(b\) is the bias term. \(w \cdot x\) is the dot product \(\sum_{i=1}^{n} w_{i}x_{i}\), where \(n\) is the number of inputs into the perceptron.
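As a minimal sketch of this forward pass (the input, weights, and bias below are made-up values, not the ones used later in the lab), the equation above can be written directly in NumPy:

# illustrative forward pass of a perceptron (made-up numbers)
import numpy as np
x = np.array([0.8, 0.9])                 # input vector
w = np.array([0.5, 0.5])                 # weights
b = -0.6                                 # bias term
f_x = 1 if np.dot(w, x) + b > 0 else 0   # w . x + b = 0.25 > 0, so f_x = 1
print(f_x)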


Predict outputs using a pretrained perceptron

We will use the perceptron to classify inputs into two categories. Let our categories be ripe and unripe apples. Each apple is described by two values: colour (\(x_{1}\)) and size (\(x_{2}\)). Values on the colour dimension range from green (low) to red (high). Values on the size dimension range from small (low) to large (high). Unripe apples have low values on both dimensions; ripe apples have high values on both dimensions. Let's examine how well a pretrained perceptron can classify apples as ripe or unripe.

Fig 2 We will use the perceptron to classify apples as ripe or unripe. In this example, the input (\(x\)) consists of two values, which reflect the colour (\(x_{1}\)) and size (\(x_{2}\)) of the apple. The correct output value \(f(x)\) is 0 for unripe apples and 1 for ripe apples.
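To make the encoding concrete, two illustrative (made-up) apples might look like this, using NumPy arrays as in the code below:

# two made-up apples encoded as (colour, size) feature vectors
unripe_apple = np.array([0.25, 0.30])   # greenish and small -> correct output 0
ripe_apple   = np.array([0.75, 0.70])   # reddish and large  -> correct output 1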

# import libraries
import numpy as np
from numpy import random
import matplotlib.pyplot as plt

# function definitions 
# generate data for 2 classes (input x and output f(x)) 
def generate_data(means, sigma, ndatapoints):
    nclasses = 2
    data = np.zeros((nclasses * ndatapoints, 3))
    for c in range(0, nclasses):
        starti = c * ndatapoints
        endi = (c + 1) * ndatapoints
        data[starti:endi, 0:2] = means[c] + sigma * random.standard_normal((ndatapoints, 2))
        data[starti:endi, 2] = c
    randvec = np.random.permutation(nclasses * ndatapoints)    
    data = data[randvec,:]
    return data, randvec

# plot the decision boundary
def plot_boundary(weights, figi):
    # decision boundary: w1*x1 + w2*x2 + b = 0, i.e. x2 = slope * x1 + y_intercept
    b = weights[0]; w1 = weights[1]; w2 = weights[2]
    slope = -w1 / w2
    y_intercept = -b / w2
    x = np.linspace(0, 1, 100)
    y = (slope * x) + y_intercept
    plt.figure(figi)
    plt.plot(x, y)
    plt.pause(0.4)

# predict output
def predict(inputs, weights):
    summation = np.dot(inputs, weights[1:]) + weights[0]
    if summation > 0:
        prediction = 1
    else:
        prediction = 0
    return prediction

# test perceptron	
def test(data, weights):
    inputs_test = data[:,0:2]
    labels = data[:,2]
    npredictions = data.shape[0]
    predictions = np.zeros(npredictions)
    for i in range(0, npredictions):
        predictions[i] = predict(inputs_test[i,:], weights)
    return predictions
	
# generate test data   
means = (0.3,0.7)
sigma = 0.08
ndatapoints = 50
data_output_test = generate_data(means, sigma, ndatapoints)  
data_test = data_output_test[0]
randvec_test = data_output_test[1]

# pretrained weights (b, w1, w2)
weights = np.array([-0.01261552,  0.00952113,  0.01201932])

# show generated data and decision boundary
colors_test = np.concatenate((np.tile(np.array([1, 0.5, 1]), (ndatapoints, 1)),
                              np.tile(np.array([0.5, 0.5, 1]), (ndatapoints, 1))))
colors_test = colors_test[randvec_test,:]
figi_test = 2; plt.figure(figi_test)
plt.scatter(data_test[:,0], data_test[:,1], c=colors_test, alpha=0.5)
plt.axis('square')  
plt.xlabel('x1 (0 = green, 1 = red)')
plt.ylabel('x2 (0 = small, 1 = large)')
plt.title('classes of apples (test data)')
plot_boundary(weights, figi_test)

# inspect predictions
predictions = test(data_test, weights)
labels_test = data_test[:,2]
errors = labels_test - predictions # nonzero entries indicate errors
nerrors = np.sum(errors**2)

Play around with different parameters for the test data, i.e. change the means and sigma, and examine the errors (the current parameters are identical to the parameters used for training).
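As a sketch of one such experiment (the shifted means and larger sigma below are arbitrary illustrative choices), you could generate harder, more overlapping test data and count the errors again:

# example: more overlapping classes (arbitrary illustrative parameters)
data_test_hard, randvec_hard = generate_data((0.45, 0.55), 0.15, ndatapoints)
predictions_hard = test(data_test_hard, weights)
errors_hard = data_test_hard[:, 2] - predictions_hard
print('number of errors:', np.sum(errors_hard**2))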


Train the perceptron

The perceptron's learning rule is defined as follows:

\begin{equation} w_{i}(t + 1) = w_{i}(t) + \alpha(y - \hat{y}(t))x_{i} \end{equation}

where \(w_{i}\) is the weight associated with input unit \(x_{i}\), \(t\) is the time step, \(\alpha\) is the learning rate, \(y\) is the correct output, and \(\hat{y}(t)\) is the perceptron's output for input vector \(x\). The learning rule is applied to each weight after each input. The bias term is also adjusted after each input, according to the rule below. In our Python code, the bias term is stored as the first element of the weight vector, \(w_{0}\).

\begin{equation} b(t + 1) = b(t) + \alpha(y - \hat{y}(t)) \end{equation}
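To make the update concrete with made-up numbers: suppose \(\alpha = 0.01\), the current input is \(x = (0.8, 0.9)\) with correct output \(y = 1\), and the perceptron currently predicts \(\hat{y} = 0\). Then \(y - \hat{y} = 1\), so \(w_{1}\) increases by \(0.01 \times 0.8 = 0.008\), \(w_{2}\) increases by \(0.01 \times 0.9 = 0.009\), and the bias increases by \(0.01\). If the prediction had been correct (\(y = \hat{y}\)), all updates would be zero and the weights would stay unchanged.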
# function definitions
def train(data, learning_rate, niterations, figi):
    training_inputs = data[:,0:2]
    labels = data[:,2]    
    weights = 0.001 * random.standard_normal(data.shape[1])   
    errors = np.zeros((data.shape[0], niterations))
    j = 0
    for _ in range(niterations):
        i = 0
        for inputs, label in zip(training_inputs, labels):
            prediction = predict(inputs, weights)
            weights[1:] += learning_rate * (label - prediction) * inputs
            weights[0] += learning_rate * (label - prediction)
            errors[i,j] = label - prediction
            plot_boundary(weights, figi)
            i += 1   
        j += 1        
    return weights, errors

# generate training data   
means = (0.3,0.7)
sigma = 0.08
ndatapoints = 20
data_output_train = generate_data(means, sigma, ndatapoints)  
data_train = data_output_train[0]
randvec_train = data_output_train[1]

# show generated data
colors_train = np.concatenate((np.tile(np.array([1, 0.5, 1]), (ndatapoints, 1)),
                               np.tile(np.array([0.5, 0.5, 1]), (ndatapoints, 1))))
colors_train = colors_train[randvec_train,:]
figi_train = 1; plt.figure(figi_train)
plt.scatter(data_train[:,0], data_train[:,1], c=colors_train, alpha=0.5)
plt.axis('square')  
plt.xlabel('x1 (0 = green, 1 = red)')
plt.ylabel('x2 (0 = small, 1 = large)')
plt.title('classes of apples (training data)')

# train perceptron
learning_rate = 0.01
niterations = 2
plt.figure(1)
plt.xlim(-10,10); plt.ylim(-10,10)
training_output = train(data_train, learning_rate, niterations, figi_train)
weights = training_output[0]
errors = training_output[1]
sse = np.sum(errors**2, axis=0) # sum of squared errors per training iteration
plt.figure(1); plt.xlim(0,1); plt.ylim(0,1) # zoom in to final solution

Train the perceptron again on the same training data. Do you get the same solution?
Play around with different parameters for generating the training data, i.e. change the means and sigma, and examine whether the perceptron algorithm converges.
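One way to set up this experiment (the parameter values below are arbitrary illustrative choices) is to generate more overlapping training data and inspect the per-iteration sum of squared errors; if the classes are not linearly separable, the error count will not drop to zero:

# example: train on harder, more overlapping data (arbitrary illustrative parameters)
data_train_hard = generate_data((0.4, 0.6), 0.15, ndatapoints)[0]
figi_hard = 4; plt.figure(figi_hard)
plt.xlim(-10, 10); plt.ylim(-10, 10)
weights_hard, errors_hard = train(data_train_hard, learning_rate, 5, figi_hard)
print('sum of squared errors per iteration:', np.sum(errors_hard**2, axis=0))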


Test the perceptron

Now test the perceptron that you just trained. The implementation of this step is identical to the 'Predict outputs using a pretrained perceptron' step above, except that you are now using the weights you trained yourself.

# generate test data   
means = (0.3,0.7)
sigma = 0.08
ndatapoints = 50
data_output_test = generate_data(means, sigma, ndatapoints)  
data_test = data_output_test[0]
randvec_test = data_output_test[1]

# show generated data and decision boundary
colors_test = np.concatenate((np.tile(np.array([1, 0.5, 1]), (ndatapoints, 1)),
                              np.tile(np.array([0.5, 0.5, 1]), (ndatapoints, 1))))
colors_test = colors_test[randvec_test,:]
figi_test = 3; plt.figure(figi_test)
plt.scatter(data_test[:,0], data_test[:,1], c=colors_test, alpha=0.5)
plt.axis('square')  
plt.xlabel('x1 (0 = green, 1 = red)')
plt.ylabel('x2 (0 = small, 1 = large)')
plt.title('classes of apples (test data)')
plot_boundary(weights, figi_test)

# test
predictions = test(data_test, weights)
labels_test = data_test[:,2]
errors = labels_test - predictions # nonzero entries indicate errors
nerrors = np.sum(errors**2)
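As a small follow-up (using the variables computed just above), you can also express the result as a classification accuracy:

# classification accuracy of your trained perceptron on the test set
accuracy = 1 - nerrors / labels_test.shape[0]
print('test accuracy:', accuracy)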

Sources for this programming lab:
perceptron training
decision boundaries