This document describes the mathematical operations of a convolutional layer in a neural network. It shows that a convolutional layer can be computed as a matrix multiplication between the input feature maps and the convolutional kernels. To perform this multiplication, the input is first unfolded with im2col into a 2D matrix in which each column is one flattened patch of the input. The kernel matrix, with one flattened kernel per row, is then multiplied by this matrix to produce the output feature maps.
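The im2col formulation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `im2col` and `conv2d_matmul`, the `(C, H, W)` input layout, the `(K, C, kh, kw)` kernel layout, and the absence of padding are all assumptions made for the example. As in most deep learning libraries, the kernel is not flipped, so the operation is strictly a cross-correlation.

```python
import numpy as np


def im2col(x, kh, kw, stride=1):
    """Unfold x of shape (C, H, W) into a (C*kh*kw, out_h*out_w) matrix.

    Each column holds one flattened input patch (no padding is applied).
    """
    c, h, w = x.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = x[:, top:top + kh, left:left + kw]
            # Flatten in (channel, row, col) order, matching the kernel reshape below.
            cols[:, i * out_w + j] = patch.ravel()
    return cols, out_h, out_w


def conv2d_matmul(x, kernels, stride=1):
    """Convolve x (C, H, W) with kernels (K, C, kh, kw) via one matrix product."""
    k, c, kh, kw = kernels.shape
    cols, out_h, out_w = im2col(x, kh, kw, stride)
    # One flattened kernel per row: shape (K, C*kh*kw).
    kernel_matrix = kernels.reshape(k, c * kh * kw)
    out = kernel_matrix @ cols  # shape (K, out_h*out_w)
    return out.reshape(k, out_h, out_w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 5, 5))
    kernels = rng.standard_normal((3, 2, 3, 3))
    print(conv2d_matmul(x, kernels).shape)
```

The explicit Python loops keep the patch extraction easy to follow; a production implementation would typically vectorize this step, at the cost of readability.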
$$\arg\min \; \mathcal{L}_{total} = \alpha \, \mathcal{L}_{content} + \beta \, \mathcal{L}_{style}$$
[Figure: network layers labeled input, conv1_1, conv2_1, conv3_1, conv4_1, conv5_1, and conv4_2.]