Transaction

bd6d577bf6e39a97af4e901fd813014796eb20f1cd73e87aade0cffb5bf64d50
2021-09-06 05:58:11
Fee: 0.00001663 BSV (0.00012156 BSV - 0.00010493 BSV)
Fee rate: 250 sat/KB
1
207,639
6,651 B

3 Outputs

Total Output: 0.00010493 BSV
  • Data output (unreadable binary payload; legible fragments include the addresses 166fBLEoU1L5QhnDWGf7xiEnFHJeaioLb4A and 1883nWhhJtFPuX6ysaqqrsMxtPXWxZTTPm and the timestamp 1630904419)
    https://whatsonchain.com/tx/bd6d577bf6e39a97af4e901fd813014796eb20f1cd73e87aade0cffb5bf64d50
  • Data output (text/markdown) containing the bitpost article "Deep Neural Networks":

If you have read my previous posts [The Myth of the 'Black Box' AI](https://www.bitpost.app/u/zachrobertson/the-myth-of-the-black-box-ai-WJaCmu) or [The Blockchain and AI](https://www.bitpost.app/u/zachrobertson/the-blockchain-and-ai-A1yCwz5), you can probably tell I'm a little obsessed with AI, and in this post I hope to explain one of the most popular forms of AI. It is important to me that people understand these things on a fundamental level, so that when they are faced with fear mongering or misinformation about AI they can look past the manipulation and use their mathematical understanding of AI to think about what is truly possible with the technology and what is snake oil.

Okay, so to start I guess I should explain what exactly a neural network is. The basic concept is to copy the function of a biological neuron, which takes in some input (an electrical charge) and outputs some other electrical charge based on the internal logic of the neuron. To do this we create computational "neurons" that take in some input and apply a weight function to it, where the weight of each neuron will be adjusted by the training process. To help visualize this, here is an example of the simplest neural network you could possibly create, called a [Perceptron](https://en.wikipedia.org/wiki/Perceptron).

![Linear Regression Neural Network](https://github.com/zachrobertson/obsidian_images/blob/master/Images/Perceptron.drawio.png?raw=true)

A Perceptron is a type of neural network meant for [linear regression](https://en.wikipedia.org/wiki/Linear_regression), which means guessing the outcome of an approximately linear system. The example above shows a Perceptron with only one neuron; this can be used for linear regression because the output function is linear, namely `Y = W*X + b`, where `X` is the input value, `W` is the weight of the neuron, `b` is the bias of the neuron, and `Y` is the output from the Perceptron. Of course, for this to be an accurate representation of a linear system we need to either know `W` and `b` in advance, which defeats the whole purpose of this type of system, or come up with a way to teach the neuron what the values of `W` and `b` are. The latter is done through a process called [Gradient Descent](https://en.wikipedia.org/wiki/Gradient_descent), which uses the idea of loss to find the optimal weight `W` and bias `b` for the Perceptron system by attempting to minimize this loss value over the [feature](https://en.wikipedia.org/wiki/Feature_(machine_learning)) space.

The mathematical formulation for this loss function can be many different things and changes wildly based on the type of neural network we are building; for a linear regression model the standard loss function is [mean squared error](https://en.wikipedia.org/wiki/Mean_squared_error), which in mathematical terms looks like:

![Mean Squared Error](https://github.com/zachrobertson/obsidian_images/blob/master/Images/mean_squared_error.jpeg?raw=true)

where ȳᵢ is the predicted output and yᵢ is the actual value. We can then substitute ȳᵢ with the equation above for the output of the Perceptron. This will give us an equation for the mean squared error in terms of the weight `W` and bias `b` of the Perceptron that we want to optimize. The next step is to use this loss function and gradient descent to find the optimum values for `W` and `b`.
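Before getting to gradient descent, here is a minimal sketch (not part of the original post) of the Perceptron output `Y = W*X + b` and its mean squared error against a few toy targets; the particular numbers are illustrative assumptions.

```python
# A minimal sketch (not from the original post) of the single-neuron Perceptron
# and the mean squared error loss described above.
# The weight, bias, and data values below are illustrative assumptions.
import numpy as np

X = np.array([0.0, 1.0, 2.0, 3.0])        # input values
Y_true = np.array([1.0, 3.0, 5.0, 7.0])   # targets from an assumed linear system y = 2x + 1

W, b = 1.5, 0.5                            # an arbitrary, not-yet-optimized weight and bias

Y_pred = W * X + b                         # Perceptron output: Y = W*X + b
mse = np.mean((Y_pred - Y_true) ** 2)      # mean squared error between prediction and targets

print(Y_pred)   # [0.5 2.  3.5 5. ]
print(mse)      # 1.875
```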
The basic idea of gradient descent is that we take the derivative of the loss function with respect to `W` and `b`, then take a small step in the negative direction of the gradient (a small code sketch of this loop follows at the end of this post). We continue to do this until the loss function is producing outputs close to 0 (ideally equal to 0). To visualize this, you can think of a skateboarder on a half-pipe who, for some reason, is trying to find the lowest point on the ramp (analogous to the smallest loss): you want to find the point at which, if you moved to the left or right (if we are constrained to only two dimensions), you would be accelerated back to the place you came from.

# But wait, what about "Deep"?

So far we have only talked about the most basic form of a neural network, but it is important to understand this very simple example, as all other neural network architectures build on the concepts of the Perceptron. Some neural networks add additional neurons in parallel to the Perceptron, others have more neurons in series (called layers), and some do both at the same time (there are many other variations as well, but these are the basics). The only difference between a neural network and a deep neural network is the way in which the neurons are interconnected. In a standard neural network you have only one input and one output for each neuron in a layer, whereas in a deep neural network the outputs of each neuron go to the inputs of every neuron in the next layer, which looks something like this:

![Deep Neural Network](https://github.com/zachrobertson/obsidian_images/blob/master/Images/deep-neural-network-AI.jpg?raw=true)

This represents a deep neural network with 5 input neurons in red, 4 hidden layers (the name for layers that are neither the input nor the output layer) in yellow, each with 7 neurons, and an output layer with 4 neurons. This may seem like something far too complex to be expressed with math, but it is actually quite simple: this is just a linear combination of Perceptrons (so just a whole bunch of Perceptrons added together). However, in our calculations for the Perceptron we only had to optimize two values (`W` and `b`), whereas for a deep neural network there will be hundreds, if not thousands or even millions, of variables that need to be optimized. This leads to two issues: first, vast computational resources are needed to train a deep neural network with a large number of variables to optimize; second, the loss function becomes increasingly complex, and the likelihood of getting stuck in a local minimum instead of finding the absolute minimum grows much higher (more on this in the days to come).

Thanks for reading,

Zach
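As the sketch promised above, here is a minimal illustration of the gradient-descent loop described in the post, training the single-neuron Perceptron on toy data drawn from an assumed linear system `y = 2x + 1`; the data, learning rate, and iteration count are illustrative assumptions, not details from the original post.

```python
# A minimal sketch (not from the original post) of gradient descent on the
# mean squared error loss of a single-neuron Perceptron.
# The toy data, learning rate, and iteration count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy data sampled from an assumed linear system y = 2x + 1 (plus a little noise)
X = rng.uniform(-1.0, 1.0, size=100)
Y = 2.0 * X + 1.0 + rng.normal(scale=0.05, size=100)

W, b = 0.0, 0.0          # weight and bias to be learned
learning_rate = 0.1

for step in range(500):
    Y_pred = W * X + b                 # Perceptron output: Y = W*X + b
    error = Y_pred - Y
    loss = np.mean(error ** 2)         # mean squared error

    # Derivatives of the loss with respect to W and b
    dW = 2.0 * np.mean(error * X)
    db = 2.0 * np.mean(error)

    # Take a small step in the negative direction of the gradient
    W -= learning_rate * dW
    b -= learning_rate * db

print(f"learned W = {W:.3f}, b = {b:.3f}, final loss = {loss:.5f}")
# The learned values should land close to the assumed W = 2 and b = 1.
```

Each pass computes the derivatives of the MSE loss with respect to `W` and `b` directly and steps against them; stopping once the loss stops shrinking would be the more careful choice, but a fixed step count keeps the sketch short.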
    https://whatsonchain.com/tx/bd6d577bf6e39a97af4e901fd813014796eb20f1cd73e87aade0cffb5bf64d50