NorthGradient

Neural Networks · 5 min read

From a neuron to a layer

A single neuron turns a list of inputs into one number. That is rarely enough. A layer is simply many neurons placed side by side, all reading the same inputs at the same time, each producing its own output. The useful trick is that the whole layer can be written as a single equation.

A layer is many neurons reading the same inputs at once. Stacking their weights into a matrix lets one equation describe the entire layer.

Many neurons, same inputs

Give the same input vector $\mathbf{x}$ to $m$ different neurons. Neuron $j$ has its own weights $\mathbf{w}_j$ and its own bias $b_j$, so it computes its own sum exactly as before:

$$z_j = \mathbf{w}_j \cdot \mathbf{x} + b_j$$

Here $z_j$ is neuron $j$'s raw sum, $\mathbf{w}_j$ is that neuron's weight vector, and $b_j$ is its bias. Nothing here is new: it is lesson 1 repeated once per neuron.
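As a minimal sketch of "once per neuron", the snippet below computes each raw sum separately. The helper name `raw_sum` and all the numbers are illustrative assumptions, not values from the lesson.

```python
# Hypothetical helper: one neuron's raw sum z_j = w_j . x + b_j
def raw_sum(w, x, b):
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

# illustrative values: three shared inputs, two neurons
x = [1.0, 2.0, 3.0]
weights = [[0.5, 0.0, -1.0],   # w_1, the weights of neuron 1
           [1.0, 1.0,  1.0]]   # w_2, the weights of neuron 2
biases = [0.0, -2.0]           # b_1 and b_2

# every neuron reads the same x but uses its own weights and bias
z = [raw_sum(w_j, x, b_j) for w_j, b_j in zip(weights, biases)]

print(z)  # [-2.5, 4.0]
```

Each call is independent of the others, which is why the sums can be gathered into one vector.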

One equation for the whole layer

Writing $m$ separate sums is tedious. Instead, stack every neuron's weights as the rows of a single weight matrix $W$, and stack the biases into one vector. The whole layer becomes:

$$\mathbf{z} = W\mathbf{x} + \mathbf{b}, \qquad \mathbf{a} = \sigma(\mathbf{z})$$

Symbol by symbol:

  • $\mathbf{x} \in \mathbb{R}^{n}$ is the input vector, with $n$ inputs shared by every neuron.
  • $W \in \mathbb{R}^{m \times n}$ is the weight matrix. It has $m$ rows (one per neuron) and $n$ columns (one per input). The entry $W_{ji}$ is the weight from input $i$ into neuron $j$.
  • $\mathbf{b} \in \mathbb{R}^{m}$ is the bias vector, one bias per neuron.
  • $\mathbf{z} \in \mathbb{R}^{m}$ is the vector of raw sums, where row $j$ is exactly $z_j = \sum_{i=1}^{n} W_{ji} x_i + b_j$.
  • $\sigma$ is applied to each entry on its own, so $a_j = \sigma(z_j)$.
  • $\mathbf{a} \in \mathbb{R}^{m}$ is the layer's output: $m$ numbers, one per neuron.
A layer: every input connects to every neuron, and each connection is one entry of the weight matrix W.
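The shapes listed above can be checked mechanically. The sketch below is one possible plain-Python reading of the layer equation. The function name `layer_forward`, its error messages, and the sample numbers are assumptions made for illustration, and the activation is assumed to be the sigmoid.

```python
import math

def sigmoid(t):
    return 1 / (1 + math.exp(-t))

def layer_forward(W, x, b):
    """Return (z, a) with z = W x + b and a = sigmoid(z), using plain lists."""
    m, n = len(W), len(x)
    # W must be m x n: every row holds one weight per input
    if any(len(row) != n for row in W):
        raise ValueError("each row of W needs one weight per input")
    # b must hold one bias per neuron
    if len(b) != m:
        raise ValueError("b needs one bias per neuron")
    # row j of the result is sum_i W[j][i] * x[i] + b[j]
    z = [sum(W[j][i] * x[i] for i in range(n)) + b[j] for j in range(m)]
    # the activation is applied to each entry on its own
    a = [sigmoid(z_j) for z_j in z]
    return z, a

# illustrative values: n = 3 inputs, m = 2 neurons
W = [[1.0, 0.0, -1.0],
     [0.5, 0.5,  0.5]]
x = [2.0, 4.0, 6.0]
b = [0.0, 1.0]

z, a = layer_forward(W, x, b)
print(z)       # [-4.0, 7.0]
print(len(a))  # 2
```

Whatever the number of inputs, the output always has one entry per row of `W`, that is, one per neuron.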

A worked example

Take two inputs $\mathbf{x} = [2, 3]$ and a layer of two neurons. Neuron 1 reuses lesson 1's weights, and neuron 2 gets its own:

$$W = \begin{bmatrix} 0.5 & -1 \\ 1 & 0.5 \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$$

The raw sums are $z_1 = (0.5)(2) + (-1)(3) + 1 = -1$ and $z_2 = (1)(2) + (0.5)(3) - 2 = 1.5$. Applying the sigmoid to each gives the layer output. The same calculation in code:

import math

# two inputs, shared by every neuron in the layer
x = [2.0, 3.0]

# each row is one neuron's weights; this layer has two neurons
W = [[0.5, -1.0],
     [1.0,  0.5]]
# one bias per neuron
b = [1.0, -2.0]

# sigmoid activation, applied to one number
def sigmoid(t):
    return 1 / (1 + math.exp(-t))

# for each neuron: weighted sum of inputs, add its bias, then activate
a = [sigmoid(sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_j)
     for row, b_j in zip(W, b)]

print(a)  # [0.2689414213699951, 0.8175744761936437]

The first output reuses the sigmoid value from lesson 2, confirming that a layer really is just the single neuron repeated and gathered into one equation.
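For comparison, here is the same arithmetic written as the single matrix equation, using only the numbers from this example and rounding the outputs to three decimals:

$$
\mathbf{z} = W\mathbf{x} + \mathbf{b}
= \begin{bmatrix} 0.5 & -1 \\ 1 & 0.5 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} + \begin{bmatrix} 1 \\ -2 \end{bmatrix}
= \begin{bmatrix} -2 \\ 3.5 \end{bmatrix} + \begin{bmatrix} 1 \\ -2 \end{bmatrix}
= \begin{bmatrix} -1 \\ 1.5 \end{bmatrix},
\qquad
\mathbf{a} = \sigma(\mathbf{z}) \approx \begin{bmatrix} 0.269 \\ 0.818 \end{bmatrix}
$$

Each row of the product is one neuron's weighted sum, so the matrix form and the neuron-by-neuron form agree entry by entry.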

In the next lesson, we will feed one layer’s output into another layer, which is what makes a network deep.