An artificial neural network (ANN) is a computing system modelled on the brain and designed to perform specific tasks [1]. ANNs are simulations run on computers to execute particular tasks such as classification, pattern recognition, and clustering [1].

ANNs are composed of nodes that accept data as inputs and perform simple operations on that data. The nodes imitate the biological neurons of the human brain. The result of each node's operation is then passed on to other neurons. The output of each node is referred to as its activation, or node value [2].

[Diagram: a simple ANN with an input layer, a hidden layer, and an output layer]

Each link is associated with a weight, which is multiplied by its respective input. An ANN is capable of learning by varying the values of these weights [2].
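As an illustration of how a single node combines weighted inputs, the following minimal sketch uses made-up weights and inputs; none of these values come from the assignment itself:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Three inputs arriving at one node, each link carrying a weight
# (illustrative values only).
inputs = np.array([1.0, 0.0, 1.0])
weights = np.array([0.5, -0.3, 0.2])

# The node sums the weighted inputs and passes the sum through
# its activation function; the result is the node's activation.
activation = sigmoid(np.dot(inputs, weights))
print(activation)
```

Changing any weight changes the activation, which is exactly the mechanism the network exploits when it learns.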

The diagram above exemplifies a simple ANN.

The input layer contains the neurons that receive the data on which the network will learn. The output layer is the final layer; it contains the units whose responses reflect what the network has learned. The middle layer, also referred to as the hidden layer, transforms the input into something the output units can use [1].

A popular training algorithm used in neural networks is backpropagation [2], which is an extension of the gradient-based delta learning rule [1]. When the program finds an error, the error is propagated backward from the output layer to the input layer through the hidden layer [1]. The algorithm then alters the network weights so as to deliver the desired output for a specific input [2]. The algorithm terminates when the error is acceptably low.
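The idea can be sketched for a single sigmoid unit with one weight; the values below are illustrative and not taken from the assignment:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x, target = 1.0, 0.0   # one input and its desired output
w = 0.8                # illustrative starting weight

out = sigmoid(w * x)             # forward pass
error = target - out             # how far off the prediction is
delta = error * out * (1 - out)  # error scaled by the sigmoid slope
w += delta * x                   # weight nudged toward the target

# After the update the unit's output moves closer to the target.
print(sigmoid(w * x))
```

Repeating this step many times drives the error down, which is what the training loop later in the report does for the whole network.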

The assignment's aim is to learn how ANNs operate and how they can be implemented using a high-level language. An ANN that learns a Boolean function is implemented. The neural network has five input neurons, four hidden neurons, and three output neurons. The sigmoid function is used as the transfer function. An error backpropagation algorithm is implemented in order to train the weights. The last step is plotting a graph using the library ‘matplotlib’. The assignment is implemented in the high-level language Python.

Programming Documentation

In this section one can find the list of variables used, a description of each part of the code, and proof of the running program.

Variable    Description

I           The input dataset matrix. Each row in the matrix is a training example.

O           The output dataset matrix. Each row in the matrix is a training example.

lay0        The first layer of the neural network, populated by the input data.

lay1        The second layer of the neural network, also referred to as the hidden layer.

lay2        The third layer of the neural network; it holds the output data.

W0          The first layer of synapses, connecting lay0 to lay1.

W1          The second layer of synapses, connecting lay1 to lay2.

Importing Packages

import numpy as np

This line imports the linear algebra library ‘numpy’.

import matplotlib.pyplot as plt

We also import matplotlib's pyplot module as plt in order to plot the graph.

The Sigmoid Function

def nonlin(x, deriv=False):
    if deriv:
        return x*(1-x)
    return 1/(1+np.exp(-x))

The above code represents the sigmoid function. If True is passed for deriv, the derivative of the sigmoid is calculated. One of the useful properties of the sigmoid is that its output can be used to generate its own derivative: if the sigmoid's output is equal to out, then the derivative is out * (1 - out). If False is passed, the derivative is not calculated. The derivative is needed when the error is calculated during backpropagation. The sigmoid function is run in every neuron.

The sigmoid function maps any value to another value between 0 and 1. It is defined by the following formula:

sigmoid(x) = 1 / (1 + e^(-x))

It is a mathematical function with an ‘S’-shaped curve, as shown in the figure below:

[Figure: the S-shaped sigmoid curve]
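To make the shape concrete, this small sketch evaluates the sigmoid at a few points (the helper name `sigmoid` is ours; the assignment's code calls the same function `nonlin`):

```python
import numpy as np

def sigmoid(x):
    # maps any real number into the interval (0, 1)
    return 1 / (1 + np.exp(-x))

print(sigmoid(0))    # midpoint of the S-curve: 0.5
print(sigmoid(10))   # large positive inputs approach 1
print(sigmoid(-10))  # large negative inputs approach 0
```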

The Input Data

I = np.array([[0,0,0,0,0],
              [0,1,1,0,0],
              [1,0,0,1,1],
              [1,1,1,1,1]])

Here we are initialising the input dataset as a numpy matrix. Each column corresponds to one of the input nodes, so the network has five input nodes, and the matrix holds four training examples, one per row.

The Output Data

O = np.array([[0,0,0],
              [1,0,0],
              [1,0,1],
              [0,1,0]])

This sets the output dataset as a matrix with four rows and three columns. Each row represents a training example and each column represents an output node, so the network has three output nodes.

Seeding

np.random.seed(1)

Seeding is done in order to start at the same point each time the neural network runs. This makes it simpler to observe how modifications influence the neural network.
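A quick sketch of what seeding buys us, re-seeding and drawing the same random matrix twice:

```python
import numpy as np

np.random.seed(1)
a = np.random.random((2, 2))

np.random.seed(1)             # re-seeding restarts the generator...
b = np.random.random((2, 2))  # ...so the second draw repeats the first

print(np.array_equal(a, b))   # True
```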

Creating Weights

W0 = 2*np.random.random((5,4)) - 1

W1 = 2*np.random.random((4,3)) - 1

The variable ‘W0’ represents the first weight matrix, a five-by-four matrix connecting the five input nodes to the four hidden nodes. The second weight matrix, ‘W1’, is a four-by-three matrix connecting the hidden layer to the three output nodes. Multiplying the random values by 2 and subtracting 1 initialises each weight uniformly in the interval [-1, 1], with mean zero.

Training

for j in range(epochs):

A for loop is used to iterate many times in order to fit the network to the dataset. Here we are continuously feeding in our data and updating the weights over time through backpropagation.

Layers

lay0 = I
lay1 = nonlin(np.dot(lay0, W0))
lay2 = nonlin(np.dot(lay1, W1))

lay0 is our input layer, the first layer of the neural network. For each subsequent layer we perform a matrix multiplication, multiplying the layer by its synapse (also known as the weight matrix), and the result is passed through the sigmoid function.

Backpropagation

lay2_error = O - lay2

if (j % 10000) == 0:
    print('Error: ' + str(np.mean(np.abs(lay2_error))))

In the backpropagation, the algorithm tries to reduce the error each time the loop is run; the error measures where the prediction is inaccurate. The guess lay2 is subtracted from the true answer O and the result is stored in lay2_error, which shows how well the network did. Printing is done every ten thousand steps to see how well the network is doing.

The following screenshot shows how the output of the errors looks when the program is run. Since the “epochs” variable is set to 60000 and the error is printed every 10000 iterations, a total of six error values should be shown. The errors should get closer to zero as the iterations approach 60000.

Delta Calculations

lay2_delta = lay2_error * nonlin(lay2, deriv=True)
lay1_error = lay2_delta.dot(W1.T)
lay1_delta = lay1_error * nonlin(lay1, deriv=True)

A delta is the change in a quantity each time the loop is run. Here we calculate the deltas as the error moves back through the layers: each layer's error is multiplied elementwise by the sigmoid derivative evaluated at that layer's output.
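The slope factor can be sketched on its own. Written in terms of a layer output out, the sigmoid derivative is out * (1 - out), which is largest for uncertain outputs near 0.5 and small for confident outputs near 0 or 1:

```python
def sigmoid_slope(out):
    # derivative of the sigmoid expressed through its own output
    return out * (1 - out)

# Confident outputs (near 0 or 1) get small slopes, so their weights
# receive only small updates; uncertain outputs get the largest slope.
for out in (0.01, 0.5, 0.99):
    print(out, sigmoid_slope(out))
```

This is why multiplying the error by the slope reduces the updates made for high-confidence predictions.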

Updating of Weights/Synapses

W1 += lay1.T.dot(lay2_delta)
W0 += lay0.T.dot(lay1_delta)

These lines calculate and apply the synapse updates for each synapse across all training examples.

Printing of the Output

print()
print('The final output after training: ')
print(lay2)

Together with the error printed every ten thousand steps, we also print the final output, which is stored in lay2. The result is shown in the following figure.

Plotting the Graph

plt.plot(lay2_delta, nonlin(lay2, deriv=True), 'ro')
plt.show()

The final step is to plot the graph. Here we plot the deltas of the final layer against the sigmoid derivative applied to that layer. The yielded graph is shown in the following screenshot, with “lay2_delta” on the x-axis and “nonlin(lay2, deriv=True)” on the y-axis:

Source Code

In this section one can find the source code of the program built for the neural network, as explained in the previous sections.

import numpy as np
import matplotlib.pyplot as plt

# Number of iterations
epochs = 60000

# Setting the sizes for the input layer, hidden layer and output layer respectively
inputLayerSize, hiddenLayerSize, outputLayerSize = 5, 4, 3

# ---Part 1---#
# The sigmoid function, run in every neuron.
# If deriv=True is passed, the derivative of the sigmoid is returned instead;
# the derivative is needed when the error is calculated in backpropagation.
def nonlin(x, deriv=False):
    if deriv:
        return x*(1-x)
    return 1/(1+np.exp(-x))

# Input dataset and output dataset.
# Each row of I is a training example with five inputs;
# the matching row of O is the three-node target output.
I = np.array([[0,0,0,0,0],
              [0,1,1,0,0],
              [1,0,0,1,1],
              [1,1,1,1,1]])

O = np.array([[0,0,0],
              [1,0,0],
              [1,0,1],
              [0,1,0]])

# Seed the random number generator so every run starts from the
# same point (good for debugging).
np.random.seed(1)

# ---Part 2---#
# Creating weights/synapses.
# 2*random - 1 gives weights uniformly distributed in [-1, 1].
# A five-by-four matrix connecting the input layer to the hidden layer:
W0 = 2*np.random.random((5,4)) - 1
# A four-by-three matrix connecting the hidden layer to the output layer:
W1 = 2*np.random.random((4,3)) - 1

# Training: continuously feed the data forward and update the weights
# over time through backpropagation.
for j in range(epochs):
    # Feed forward through layers 0, 1 and 2: each layer is the matrix
    # product of the previous layer and its synapse, passed through the sigmoid.
    lay0 = I
    lay1 = nonlin(np.dot(lay0, W0))
    lay2 = nonlin(np.dot(lay1, W1))

    # Backpropagation: the error to reduce is the difference between
    # the target output and the prediction.
    lay2_error = O - lay2

    # Print every 10000 steps to see how well the network is doing:
    # the mean of the absolute errors, printed as a string.
    if (j % 10000) == 0:
        print("Error: " + str(np.mean(np.abs(lay2_error))))

    # Delta calculations: as the error moves back through the layers,
    # each layer's error is multiplied elementwise by the sigmoid
    # derivative at that layer. Multiplying the slopes by the error
    # reduces the updates made for high-confidence predictions.
    lay2_delta = lay2_error * nonlin(lay2, deriv=True)
    lay1_error = lay2_delta.dot(W1.T)
    lay1_delta = lay1_error * nonlin(lay1, deriv=True)

    # Updating the weights/synapses.
    W1 += lay1.T.dot(lay2_delta)
    W0 += lay0.T.dot(lay1_delta)

print()
print("The final output after training:")
print(lay2)

# ---Part 3---#
# Plotting the graph.
plt.plot(lay2_delta, nonlin(lay2, deriv=True), 'ro')
plt.show()