In a neural network, each neuron (or node) receives inputs from other neurons or directly from the input data. These inputs are multiplied by corresponding weights, and the weighted sum is passed through an activation function to produce the output of the neuron.
- Inputs: Suppose we have an input vector $\mathbf{x} = [x_1, x_2, \ldots, x_n]$ representing the features or attributes of our data.
- Weights: Similarly, we have a weight vector $\mathbf{w} = [w_1, w_2, \ldots, w_n]$, with one weight associated with each input. These weights are learned during training and determine the importance of each feature.
- Linear Combination: The weighted sum (linear combination) of the inputs and weights is calculated as:
  $z = w_1 x_1 + w_2 x_2 + \ldots + w_n x_n$
- Activation Function: The result of the linear combination is then passed through an activation function (e.g., sigmoid or ReLU), which introduces non-linearity into the model. For instance, the sigmoid activation function is given by:
  $\sigma(z) = \frac{1}{1 + e^{-z}}$
  The output of the neuron is $\sigma(z)$.
- Bias Term: Additionally, we often include a bias term (similar to the fixed cost in the shopping bill analogy). The bias allows the neuron to shift its decision boundary. It is represented as $b$, and the modified linear combination becomes:
  $z = w_1 x_1 + w_2 x_2 + \ldots + w_n x_n + b$
- Output: Finally, the output of the neuron (after activation) is used as input to subsequent layers or as the final prediction, as illustrated in the code sketch below.
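To make these steps concrete, here is a minimal sketch of a single neuron's forward pass in Python with NumPy. The function names and the specific input, weight, and bias values are illustrative assumptions, not taken from any particular library or dataset.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: maps any real number into (0, 1); sigmoid(0) = 0.5."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b):
    """Single-neuron forward pass: weighted sum plus bias, then activation."""
    z = np.dot(w, x) + b           # z = w_1*x_1 + ... + w_n*x_n + b
    return sigmoid(z)

# Illustrative values only:
x = np.array([0.5, -1.2, 3.0])     # input features
w = np.array([0.8, 0.1, -0.4])     # weights (learned during training)
b = 0.2                            # bias term
print(neuron_output(x, w, b))      # a single value between 0 and 1
```

Training would adjust `w` and `b` so that this output moves closer to the desired target for each example.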
Neural networks consist of multiple layers of interconnected neurons, forming a complex architecture capable of learning intricate patterns from data. The process of adjusting the weights and biases during training (using techniques such as backpropagation) allows the network to learn and generalize from examples.
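As a rough sketch of how such neurons stack into layers, the snippet below chains two fully connected layers, assuming sigmoid activations throughout. The layer sizes and random weights are placeholders; in practice the weights and biases would be learned via backpropagation rather than drawn at random.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Propagate an input through a stack of (weights, bias) layers."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)     # each layer: linear combination + bias + activation
    return a

rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4)),   # hidden layer: 3 inputs -> 4 neurons
    (rng.normal(size=(1, 4)), np.zeros(1)),   # output layer: 4 neurons -> 1 output
]
print(forward(np.array([0.5, -1.2, 3.0]), layers))   # final prediction in (0, 1)
```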