How to Make a Neural Network – Intro to Deep Learning #2


How do we learn? Although times may change,
some concepts stay the same. Unchanging, information
outlasts the body. It’s stored in our brain, but can be passed down from
generation to generation. Our brain is capable of
synthesizing the diverse set of inputs we call our five senses, and from
them, creating a hierarchy of concepts. If we’re lucky, we can learn a task while being
supervised by a teacher directly. While interacting with our environment,
we can feel our surroundings, see our obstacles and
try to predict the next steps. If we try at first and
fail, that’s okay. Through the process of trial and
error, we can learn anything. But what is it that gives our
brain this special capability unlike anything else in nature? Everything we’ve ever experienced or
felt, all our thoughts and memories, our very sense of self,
is produced by the brain. At the cellular level, our brain consists of an estimated 100 billion nerve cells called neurons. Each neuron has three jobs: receive a set of signals through what are called its dendrites; integrate those signals in the cell body, or soma, to determine whether the information should be passed on; and, if the sum of the signals passes a certain threshold, send the resulting signal, called an action potential, onward via its axon to the next set of neurons. Hello world. It’s Siraj, and we’re going to build
our own neural network in Python. The rules that govern the brain
give rise to intelligence. It’s the same algorithm that invented
modern language, space flight, Shia LaBeouf. It’s what makes us, us. It’s what allowed us to survive and
thrive on planet Earth. But as far as we’ve come as a species, we still face a host of existential threats. There’s the impending
threat of climate change, the possibility of biochemical warfare,
an asteroid impact. These are nontrivial problems that could
take our biological neural networks many generations to solve. But what if we could harness this power,
what if we could create an artificial neural network, and have it run on
a non-biological substrate like silicon? We could give it more computing power
and data than any one human would be capable of handling, and
have it solve problems a thousand, or even a million times faster
than we could alone. In 1943, a neurophysiologist named Warren McCulloch and a logician named Walter Pitts invented the first
computational model of a neuron. Their model demonstrated a neuron that
received binary inputs, summed them, and if the sum exceeded a certain
threshold value, it output a 1; if not, it output a 0. It was a simple model, but in the early days of AI this was a big deal, and it got computer scientists talking about the possibilities.
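As a rough sketch of that idea in Python (the threshold and example inputs here are just placeholders, not anything from their paper):

import numpy as np

def mcculloch_pitts_neuron(inputs, threshold):
    # Sum the binary inputs and fire (output 1) only if the sum reaches the threshold
    return 1 if np.sum(inputs) >= threshold else 0

# A unit that fires only when at least two of its three inputs are active
print(mcculloch_pitts_neuron([1, 0, 1], threshold=2))  # 1
print(mcculloch_pitts_neuron([1, 0, 0], threshold=2))  # 0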
A few years later, a psychologist named Frank Rosenblatt was frustrated that the McCulloch-Pitts model still lacked a mechanism for learning. So he conceived a neural model
that built on their idea which he called the Perceptron,
which is another word for a single layer feedforward
neural network. We call it feedforward because the data
just flows in one direction, forward. The Perceptron incorporated
the idea of weights on the inputs. So, given some training set
of input output examples, it should learn a function from it by
increasing or decreasing the weights continuously for each example,
depending on what its output was. These weight values are mathematically applied to the input, such that after each iteration, the output prediction gets more accurate.
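The video doesn’t spell out the exact update rule, but the classic perceptron rule it is describing looks roughly like this (the learning rate and epoch count are arbitrary placeholders):

import numpy as np

def train_perceptron(X, y, learning_rate=0.1, epochs=20):
    # One weight per input feature, starting at zero
    weights = np.zeros(X.shape[1])
    for _ in range(epochs):
        for inputs, target in zip(X, y):
            prediction = 1 if np.dot(inputs, weights) > 0 else 0
            # Nudge the weights up or down depending on how the output compared to the target
            weights += learning_rate * (target - prediction) * inputs
    return weights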
To best understand this process we call training, let’s build our own single-layer neural network in Python using only NumPy as our dependency. In our main function, we’ll first
initialize our neural network, which we’ll later define
as its own class. Then print out its starting weights for
a reference when we demo it. We can now define our data set. We’ve got four examples. Each example has three input values and
one output value. They’re all ones and zeros. The T function transposes the matrix
from horizontal to vertical. So the computer is storing
the numbers like this. We’ll train our neural
network on these values so that given a new list of ones and zeros,
it’ll be able to predict whether or not the output should be a one or zero. Since we are identifying
which category it belongs to, this is considered a classification
task in machine learning. We’ll train our network on this data
by passing them as arguments to our train function, along with a number, 10,000, which is the number of times we’d like to iterate during training. After it’s done training, we’ll print out the updated weights so we can compare them, and finally, we’ll predict the output given a new input.
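Here’s a sketch of what that main block could look like, assuming the NeuralNetwork class we define below. The class, method, and variable names, and the specific 0/1 values, are my reconstruction from the narration rather than the video’s verbatim code:

import numpy as np

if __name__ == "__main__":
    # In a full script this block sits after the NeuralNetwork class defined below
    neural_network = NeuralNetwork()
    print("Random starting weights:")
    print(neural_network.weights)

    # Four examples, each with three input values and one output value, all ones and zeros.
    # .T transposes the output row into a column so it lines up with the inputs.
    training_inputs = np.array([[0, 0, 1],
                                [1, 1, 1],
                                [1, 0, 1],
                                [0, 1, 1]])
    training_outputs = np.array([[0, 1, 1, 0]]).T

    # Train for 10,000 iterations, then inspect the updated weights
    neural_network.train(training_inputs, training_outputs, 10000)
    print("Weights after training:")
    print(neural_network.weights)

    # Predict the output for a new, unseen list of ones and zeros
    print("Prediction for [1, 0, 0]:")
    print(neural_network.predict(np.array([1, 0, 0])))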
We’ve got our main function ready, so let’s now define our NeuralNetwork class. When we initialize the class, the first
thing we want to do is seed the random number generator. We’ll initialize our weight values randomly in a second, and seeding makes sure it generates the same
numbers every time the program runs. This is useful for debugging later on. We’ll assign random weights to a three-by-one matrix with values in the range of -1 to 1 and a mean of 0, since our single neuron has three input connections and one output connection.
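A sketch of that constructor, keeping the names from the main-block sketch above:

import numpy as np

class NeuralNetwork:
    def __init__(self):
        # Seed the random number generator so every run produces the same starting weights
        np.random.seed(1)
        # One neuron with three input connections and one output:
        # a 3x1 weight matrix with values in the range -1 to 1 and a mean of 0
        self.weights = 2 * np.random.random((3, 1)) - 1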
Next we’ll write out our activation function, which, in our case, will be a sigmoid. It describes an S-shaped curve. We pass the weighted sum of the inputs through it, and it will convert that sum to
a probability between 0 and 1. This probability will help make our prediction.
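Continuing the class sketch, the sigmoid can be a small helper method (the name _sigmoid is my choice):

    def _sigmoid(self, x):
        # Squash any weighted sum onto an S-shaped curve between 0 and 1
        return 1 / (1 + np.exp(-x))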
We’ll use our sigmoid function directly in our predict function, which takes inputs as parameters and
passes them through our neuron. To get the weighted sum of our inputs, we’ll compute the dot product
of our inputs and our weights. This is how our weights govern how much influence each input has as data flows through our neural net, and this function will return our prediction.
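Continuing the sketch, predict computes the dot product of the inputs and the weights, then squashes it with the sigmoid:

    def predict(self, inputs):
        # Weighted sum of the inputs, converted to a probability between 0 and 1
        return self._sigmoid(np.dot(inputs, self.weights))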
Now we can write out our train function, which is the real meat of our code. We’ll write a for loop to iterate
10,000 times, as we specified, then use our predict function to pass
the training set through the network and get the output value,
which is our prediction. We’ll next calculate the error, which is the difference between the
desired output and our predicted output. We want to minimize
this error as we train, and we’ll do this by iteratively
updating our weights. We’ll calculate the necessary adjustment
by computing the dot product of our input’s transpose and the error, multiplied by
the gradient of the sigmoid curve. So less confident weights
are adjusted more, and inputs that are zero don’t
cause changes to the weights. This process is called gradient descent. >>Yeah, I’m descending that gradient!>> We’ll also write out the function
that calculates the derivative of our sigmoid,
which gives us its gradient, or slope. This measures how confident we are in the existing weight value, and helps us update our prediction in the right direction.
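Continuing the sketch, that helper is just the derivative of the sigmoid, written in terms of the sigmoid’s own output:

    def _sigmoid_derivative(self, x):
        # Gradient (slope) of the sigmoid at an output value x:
        # near 0 for confident outputs close to 0 or 1, largest near 0.5
        return x * (1 - x)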
Finally, once we have our adjustment, we’ll update our weights with that value. This process of propagating our error value back into our network to adjust our weights is called backpropagation.
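Putting those pieces together, the train method might look roughly like this (again a reconstruction from the narration, not the video’s verbatim code):

    def train(self, training_inputs, training_outputs, iterations):
        for _ in range(iterations):
            # Forward pass: the current prediction for the whole training set
            output = self.predict(training_inputs)
            # Error: difference between the desired output and our prediction
            error = training_outputs - output
            # Scale the error by the sigmoid gradient, then project it back onto the
            # inputs: zero inputs cause no change, confident outputs change little
            adjustment = np.dot(training_inputs.T, error * self._sigmoid_derivative(output))
            # Update the weights with the adjustment
            self.weights += adjustment

With these pieces assembled into one file, the main block sketched earlier trains the network and prints the weights before and after.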
Let’s demo this baby in Terminal. Because the training set is so small, it took milliseconds to train it. We can see that our weight
values updated themselves after all those iterations. And when we fed it a novel input, it predicted that the output
was very likely a one. We just made our first neural network,
from scratch! Anyways, about backpropagation, I- [MUSIC] So as dope as Rosenblatt’s idea was,
in the decades following it, neural networks didn’t really give
us any kind of noteworthy results. They could only
accomplish simple things. But as the World Wide Web
grew from a CERN project to the massive nervous system for
humanity that it is today, we’ve seen an explosion in data and
computing power. And a small group of researchers
funded by the Canadian government held fast to their belief in
the power of neural networks to help us find
solutions from this data. When they took a neural net and made it not one or two but many layers deep, gave it a huge data set and lots of computing power, they discovered it could outperform humans in tasks
that we thought only we could do. This is profound. Our biological neural network is
carbon-based, sending neurotransmitters like acetylcholine, glutamate, and serotonin as signals. An artificial neural network doesn’t
even exist in physical space. It’s an abstract concept we
programmatically created, and it’s represented on silicon transistors. Yet despite the complete difference
in mediums, they both developed a very similar mechanism for processing
information, and the results show that. Perhaps there’s a law of intelligence
encoded into our universe, and we’re coming ever closer to finding it. So to break it down, a neural network is
a biologically inspired algorithm that learns to identify patterns in data. Backpropagation is
a popular technique to train a neural network by continually
updating weights via gradient descent. And when we train a many layer deep
neural network on lots of data, using lots of computing power,
we call this process deep learning. The coding challenge winner for
last week is Ludo Bouan. Ludo made a really slick iPython
notebook to demo not just 2D regression, but 3D regression as well on
a climate change data set. Wizard of the week. And the runner up is Amanullah Tariq. He completed the bonus
with great results. The challenge for this video is
to create a not one, not two, but three-layer feedforward neural network using just NumPy. Post your GitHub link
in the comments and I’ll announce the winner in one week. Please subscribe, and for now, I’ve got
to update my weights, so thanks for watching.
