
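Perceptron Implementation
Neural Network Basics for Deep Learning
nonezerok@gmail.com
This time we look at the perceptron through program code.
Code makes the perceptron's operating principle clearer.
The code is provided in C.
It is also provided in Tensorflow, one of the open-source deep learning frameworks.
Tensorflow is developed by Google and is among the most widely used deep learning frameworks.
For Tensorflow installation instructions, refer to online resources.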

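Digit recognition
Use a perceptron that takes 9 pixels as input to decide whether the input "is 1" or "is not".
In this way, a total of 10 perceptrons are trained, one per digit.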

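[Figure: perceptrons for the digits 0 through 9; the example outputs shown are -0.2, +0.8, and +0.1.]
Each perceptron is trained on one digit, for example the perceptron trained to decide whether the input is 1 or not.
Using 10 such perceptrons, the input can be classified, here as the digit 1.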
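The decision step on this slide can be sketched as picking the digit whose perceptron gives the largest output, the same rule the later two-output slide states. The ten output values below are hypothetical: the slide shows only -0.2, +0.8, and +0.1, and assigning them to digits 0, 1, and 9 is an assumption. `classify_digit` is an illustrative name.

```python
def classify_digit(outputs):
    """Return the index of the perceptron with the largest output."""
    return max(range(len(outputs)), key=lambda k: outputs[k])


# Hypothetical outputs of the ten "digit k or not" perceptrons for one input.
# Only -0.2 (digit 0), +0.8 (digit 1), and +0.1 (digit 9) echo the slide,
# and even that assignment is assumed.
outputs = [-0.2, 0.8, 0.0, -0.1, 0.1, -0.3, 0.2, 0.0, -0.1, 0.1]

print(classify_digit(outputs))  # prints 1
```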

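[Figure: a network with a bias input of 1 and one output per class.]
Target vectors:
class 1: $\mathbf{t} = (1, 0, \cdots, 0)$
class 2: $\mathbf{t} = (0, 1, \cdots, 0)$
$\vdots$
last class: $\mathbf{t} = (0, 0, \cdots, 1)$
one-hot encoding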
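A minimal sketch of building such target vectors. The helper name `one_hot` and the 0-based class index are assumptions made here; the slide numbers classes from 1.

```python
def one_hot(class_index, num_classes):
    """Return a list with 1 at class_index (0-based) and 0 elsewhere."""
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index out of range")
    return [1 if k == class_index else 0 for k in range(num_classes)]


print(one_hot(0, 3))  # [1, 0, 0]
print(one_hot(2, 3))  # [0, 0, 1]
```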

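Weight update modes
• Batch: $\Delta w_i = \eta \sum_{d \in D} (t_d - o_d)\, x_{i,d}$ (Gradient Descent)
• Single: $\Delta w_i = \eta\, (t_d - o_d)\, x_{i,d}$ (Incremental Gradient Descent, also called Stochastic Gradient Descent; ~ gradient descent, if η is small enough)
• Mini-batch: $\Delta w_i = \eta\, \frac{1}{|B|} \sum_{d \in B} (t_d - o_d)\, x_{i,d}$, where $B \subset D$ is a batch
Here $D$ is the training set, $t_d$ and $o_d$ are the target and the output for sample $d$, and $\eta$ is the learning rate.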
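The three modes differ only in which samples contribute to one update. The sketch below computes a single $\Delta w$ for each mode on the OR data used later in the deck. All function names are illustrative, and dividing the mini-batch sum by the batch size is an assumption about the garbled slide formula.

```python
import random

# OR pattern as (inputs, target) pairs; this plays the role of D.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]


def output(w, x):
    """Linear unit: o = w0 * 1 + w1 * x1 + w2 * x2."""
    return w[0] + w[1] * x[0] + w[2] * x[1]


def summed_update(w, samples, eta):
    """eta * sum over samples of (t - o) * x_i, with bias input x_0 = 1."""
    delta = [0.0, 0.0, 0.0]
    for x, t in samples:
        err = t - output(w, x)
        delta[0] += eta * err * 1.0
        delta[1] += eta * err * x[0]
        delta[2] += eta * err * x[1]
    return delta


def batch_delta(w, eta):
    """Batch mode: every sample in D contributes to one update."""
    return summed_update(w, DATA, eta)


def single_delta(w, sample, eta):
    """Single mode: one sample gives one update."""
    return summed_update(w, [sample], eta)


def minibatch_delta(w, eta, batch_size, rng):
    """Mini-batch mode: average over a random subset B of D."""
    batch = rng.sample(DATA, batch_size)
    return [d / batch_size for d in summed_update(w, batch, eta)]


w = [0.0, 0.0, 0.0]
print(batch_delta(w, 0.1))
print(single_delta(w, DATA[3], 0.1))
print(minibatch_delta(w, 0.1, 2, random.Random(0)))
```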

C code example
4 training samples with $x \in \{-1, +1\}$; the target is set to +1 or -1.
$x = (1, 1)$, $t = +1$
$x = (-1, 1)$, $t = -1$
$x = (-1, -1)$, $t = -1$
$x = (1, -1)$, $t = -1$

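4 training samples with $x \in \{0, 1\}$; the target is set to 0 or +1.
$x = (1, 1)$, $t = 1$
$x = (0, 1)$, $t = 0$
$x = (0, 0)$, $t = 0$
$x = (1, 0)$, $t = 0$
The training inputs also use values in the range 0 to 1. This is the usual practice.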

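$x = (1, 1)$, $t = 1$
$x = (0, 1)$, $t = 0$
$x = (0, 0)$, $t = 0$
$x = (1, 0)$, $t = 0$
$o = w_0 + w_1 x_1 + w_2 x_2$
In this case there are 2 inputs (2-dimensional data), so what the perceptron learns is a plane.
[Figure: the four points $x = (0, 1)$, $(0, 0)$, $(1, 0)$, $(1, 1)$ plotted with targets $t = 0, 0, 0, 1$.]
Equation of the plane being learned: $o = w_0 + w_1 x_1 + w_2 x_2$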



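[Figure: the perceptron model redrawn, with inputs $x_1$, $x_2$ and a bias input fixed at 1.]
$o = w_0 + w_1 x_1 + w_2 x_2$
Three weights, $w_0$, $w_1$, and $w_2$, have to be found.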
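As a worked example of the plane equation, assume the weights $w_0 = -0.25$, $w_1 = 0.5$, $w_2 = 0.5$. These values are not from the slides; they are the least-squares plane for the four samples whose target is 1 only at $x = (1, 1)$. Evaluating the plane at each sample gives:

```latex
\begin{aligned}
o(0, 0) &= -0.25 + 0.5 \cdot 0 + 0.5 \cdot 0 = -0.25 \\
o(0, 1) &= -0.25 + 0.5 \cdot 0 + 0.5 \cdot 1 = 0.25 \\
o(1, 0) &= -0.25 + 0.5 \cdot 1 + 0.5 \cdot 0 = 0.25 \\
o(1, 1) &= -0.25 + 0.5 \cdot 1 + 0.5 \cdot 1 = 0.75
\end{aligned}
```

With a threshold of 0.5 (the value the C test code uses), only $(1, 1)$ lies above the threshold, which matches the targets.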

Perceptron for OR pattern in C

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

#define TRAIN_SAMPLES 4

int main(void)
{
    //
    // OR pattern
    //
    // input: X
    double X[TRAIN_SAMPLES][2] =
    {
        { 0, 0 }, // class 0
        { 0, 1 }, // class 1
        { 1, 0 }, // class 1
        { 1, 1 }  // class 1
    };
    // target Y
    double Y[TRAIN_SAMPLES] =
    {
        0, 1, 1, 1
    };
    //
    // weight: three values (w0, w1, w2) have to be found
    //
    double W[3];
    //-----------------------------------------------------------
    // 1) Training
    //-----------------------------------------------------------
    //
    // initialize W with small random values in [-0.25, 0.25]
    //
    for (int i = 0; i < 3; i++)
        W[i] = ((double)rand() / RAND_MAX) * 0.5 - 0.25;
    unsigned int epoch = 0;
    unsigned int MAX_EPOCH = 500; // repeat for 500 epochs
    double Etha = 0.05;           // learning rate set to 0.05
    double output;
    double x1, x2;
    double target;

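    while (epoch++ < MAX_EPOCH)
    {
        //
        // 1 epoch (for all training samples)
        //
        // compute deltaWi for each Wi
        //
        double deltaW[3];
        // reset the accumulators once per epoch, before the sample loop
        deltaW[0] = 0.0;
        deltaW[1] = 0.0;
        deltaW[2] = 0.0;
        for (int i = 0; i < TRAIN_SAMPLES; i++)
        {
            x1 = X[i][0];
            x2 = X[i][1];
            target = Y[i];
            output = 1 * W[0] + x1 * W[1] + x2 * W[2];
            deltaW[0] += (target - output) * 1;
            deltaW[1] += (target - output) * x1;
            deltaW[2] += (target - output) * x2;
        }
        //
        // update W (the summed update is averaged over the samples)
        //
        W[0] = W[0] + Etha * (deltaW[0] / TRAIN_SAMPLES);
        W[1] = W[1] + Etha * (deltaW[1] / TRAIN_SAMPLES);
        W[2] = W[2] + Etha * (deltaW[2] / TRAIN_SAMPLES);
        // (the epoch loop continues with the cost computation)
$w_i = w_i + \Delta w_i$, where $\Delta w_i = -\eta \frac{\partial E}{\partial w_i} = \eta \sum_{d} (t_d - o_d)\, x_{i,d}$
$w_0 = w_0 + \Delta w_0$, where $\Delta w_0 = -\eta \frac{\partial E}{\partial w_0} = \eta \sum_{d} (t_d - o_d)$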


[Note] Moving the update inside the sample loop
    while (epoch++ < MAX_EPOCH)
    {
        // 1 epoch (for all training samples)
        //
        // compute deltaWi for each Wi
        //
        double deltaW[3];
        for (int i = 0; i < TRAIN_SAMPLES; i++)
        {
            x1 = X[i][0];
            x2 = X[i][1];
            target = Y[i];
            output = 1 * W[0] + x1 * W[1] + x2 * W[2];
            deltaW[0] = (target - output) * 1;
            deltaW[1] = (target - output) * x1;
            deltaW[2] = (target - output) * x2;
            //
            // update W immediately, using this sample only
            //
            W[0] = W[0] + Etha * (deltaW[0]);
            W[1] = W[1] + Etha * (deltaW[1]);
            W[2] = W[2] + Etha * (deltaW[2]);
        }
        // (the epoch loop continues with the cost computation, as in the batch version)
Updating immediately after every single sample like this is called Stochastic Gradient Descent.
Learning still works as long as the learning rate is small enough.
$\Delta w_i = \eta\, (t_d - o_d)\, x_{i,d}$
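The per-sample update can be tried end to end with a short Python sketch. It mirrors the C listing's data, learning rate (0.05), and epoch count (500). The function names and the fixed seed are assumptions added here for reproducibility.

```python
import random

# OR pattern, as in the C listing.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 1.0]


def forward(w, x):
    """Linear unit: o = w0 * 1 + w1 * x1 + w2 * x2."""
    return w[0] + w[1] * x[0] + w[2] * x[1]


def train_sgd(eta=0.05, max_epoch=500, seed=0):
    """Update the weights right after each sample (stochastic mode)."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.25, 0.25) for _ in range(3)]
    for _ in range(max_epoch):
        for x, t in zip(X, Y):
            err = t - forward(w, x)
            w[0] += eta * err * 1.0
            w[1] += eta * err * x[0]
            w[2] += eta * err * x[1]
    return w


w = train_sgd()
for x, t in zip(X, Y):
    # input, target, raw output, class after thresholding at 0.5
    print(x, int(t), round(forward(w, x), 2), int(forward(w, x) > 0.5))
```

The raw outputs are not exactly 0 or 1 because a single linear unit cannot fit the OR targets exactly; thresholding at 0.5 turns them into class labels.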

        //
        // compute the Cost
        //
        double cost;
        cost = 0.0;
        for (int i = 0; i < TRAIN_SAMPLES; i++)
        {
            x1 = X[i][0];
            x2 = X[i][1];
            target = Y[i];
            output = 1 * W[0] + x1 * W[1] + x2 * W[2];
            cost += (target - output) * (target - output);
        }
        cost = 0.5 * cost / TRAIN_SAMPLES;
        printf("%05u: cost = %10.9f\n", epoch, cost);
    }
    printf("training done\n\n");
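The cost computed in this listing and the weight update used in training are connected by one differentiation step. The following is a sketch of that step, assuming the cost is half the mean squared error over the $N$ training samples (as the code computes it) and that the bias input is $x_{0,d} = 1$:

```latex
\begin{aligned}
E &= \frac{1}{2N} \sum_{d \in D} (t_d - o_d)^2,
\qquad o_d = w_0 + w_1 x_{1,d} + w_2 x_{2,d} \\
\frac{\partial E}{\partial w_i}
  &= \frac{1}{N} \sum_{d \in D} (t_d - o_d)\,
     \frac{\partial}{\partial w_i}\,(t_d - o_d)
   = -\frac{1}{N} \sum_{d \in D} (t_d - o_d)\, x_{i,d} \\
\Delta w_i &= -\eta\, \frac{\partial E}{\partial w_i}
   = \frac{\eta}{N} \sum_{d \in D} (t_d - o_d)\, x_{i,d}
\end{aligned}
```

The factor $1/N$ corresponds to dividing by `TRAIN_SAMPLES` in the batch update. Without it, the same derivation gives the summed form of $\Delta w_i$.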

    //------------------------------------------------------------
    // 2) Testing for the training set
    //------------------------------------------------------------
    for (int i = 0; i < TRAIN_SAMPLES; i++)
    {
        x1 = X[i][0];
        x2 = X[i][1];
        target = Y[i];
        output = 1 * W[0] + x1 * W[1] + x2 * W[2];
        printf("%2.1f %2.1f (%d) %2.1f\n", x1, x2, (int)target, output);
    }
    printf("training test done\n\n");
    //------------------------------------------------------------
    // 3) Testing for unknown data
    //------------------------------------------------------------
    x1 = 0.8;
    x2 = 0.7;
    double Threshold = 0.5;
    output = 1 * W[0] + x1 * W[1] + x2 * W[2];
    // the raw output is thresholded to obtain the class
    int output_class = (output > Threshold) ? 1 : 0;
    printf("%2.1f %2.1f (%d) %2.1f\n", x1, x2, output_class, output);
}
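To see what the test step does with concrete numbers, the sketch below evaluates the cost and the thresholded class for one assumed weight vector, $w = (0.25, 0.5, 0.5)$. These weights are not printed in the slides; they are the least-squares solution for the OR data, which training approaches. The names `cost` and `classify` are illustrative.

```python
# OR pattern, as in the C listing.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 1.0]


def forward(w, x):
    """Linear unit: o = w0 * 1 + w1 * x1 + w2 * x2."""
    return w[0] + w[1] * x[0] + w[2] * x[1]


def cost(w):
    """Half the mean squared error, as computed in the C listing."""
    return 0.5 * sum((t - forward(w, x)) ** 2 for x, t in zip(X, Y)) / len(X)


def classify(w, x, threshold=0.5):
    """Threshold the raw output to get class 0 or 1."""
    return 1 if forward(w, x) > threshold else 0


# Assumed weights: the least-squares plane for the OR data.
w_ls = [0.25, 0.5, 0.5]

print(cost(w_ls))                  # 0.03125
print(classify(w_ls, (0.8, 0.7)))  # 1
```

Even at these weights the cost is not zero, since each sample is off by 0.25. The threshold of 0.5 still separates the classes on the four training samples.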

Perceptron for OR pattern in Tensorflow

# Written for the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
import numpy as np
import tensorflow as tf

np.random.seed(0)

# placeholders for the inputs, targets, weights, and bias
x = tf.placeholder(tf.float32, shape=(4, 2))
y = tf.placeholder(tf.float32, shape=(4, 1))
w = tf.placeholder(tf.float32, shape=(2, 1))
b = tf.placeholder(tf.float32)

# linear output and half the mean squared error
y_pred = tf.matmul(x, w) + b
loss = 0.5 * tf.reduce_mean((y - y_pred) * (y - y_pred))
grad_w, grad_b = tf.gradients(loss, [w, b])

# OR pattern with inputs in {-1, 1}; random initial weights and bias
values = {
    x: [[-1, -1], [-1, 1], [1, -1], [1, 1]],
    y: [[0], [1], [1], [1]],
    w: np.random.randn(2, 1),
    b: np.random.randn(1)
}

with tf.Session() as sess:
    learning_rate = 0.1
    for i in range(100):
        loss_v, grad_w_v, grad_b_v = sess.run([loss, grad_w, grad_b], feed_dict=values)
        # gradient descent step on the values fed back in the next iteration
        values[w] -= learning_rate * grad_w_v
        values[b] -= learning_rate * grad_b_v
        print('%10d loss: %10.8f w1:%10.5f w2: %10.5f b: %10.5f'
              % (i, loss_v, values[w][0][0], values[w][1][0], values[b][0]))
    # outputs for the four training inputs after training
    a = sess.run([y_pred], feed_dict=values)
    for i in range(4):
        print('%2d %2d %2d : %3.2f'
              % (values[x][i][0], values[x][i][1], values[y][i][0], a[0][i][0]))
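What `tf.gradients` returns for this loss can be cross-checked without TensorFlow. The sketch below writes the gradients of the same loss by hand and compares them with central finite differences. It uses only the standard library; the function names and the test point `w = (0.3, -0.2)`, `b = 0.1` are assumptions chosen for illustration.

```python
# Same data as the TensorFlow listing: OR targets with inputs in {-1, 1}.
X = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 1.0]


def predict(w, b, x):
    return w[0] * x[0] + w[1] * x[1] + b


def loss(w, b):
    """Half the mean squared error, matching the TensorFlow expression."""
    return 0.5 * sum((y - predict(w, b, x)) ** 2 for x, y in zip(X, Y)) / len(X)


def analytic_grads(w, b):
    """Hand-derived gradients: -(1/N) * sum(err * x_i) and -(1/N) * sum(err)."""
    n = len(X)
    errs = [y - predict(w, b, x) for x, y in zip(X, Y)]
    grad_w = [-sum(e * x[i] for e, x in zip(errs, X)) / n for i in range(2)]
    grad_b = -sum(errs) / n
    return grad_w, grad_b


def numeric_grads(w, b, h=1e-6):
    """Central finite differences of the loss, one parameter at a time."""
    grad_w = []
    for i in range(2):
        wp = list(w)
        wm = list(w)
        wp[i] += h
        wm[i] -= h
        grad_w.append((loss(wp, b) - loss(wm, b)) / (2 * h))
    grad_b = (loss(w, b + h) - loss(w, b - h)) / (2 * h)
    return grad_w, grad_b


print(analytic_grads([0.3, -0.2], 0.1))
print(numeric_grads([0.3, -0.2], 0.1))
```

The two printed results should agree to several decimal places, which is the quantity the training loop subtracts (scaled by `learning_rate`) at every iteration.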

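[Figure: a network with a bias input of 1 and two outputs.]
class 1: $\mathbf{t} = (1, 0)$
class 2: $\mathbf{t} = (0, 1)$
Use two outputs.
Train the network so that the first output is 1 when the input belongs to class 1, and the second output is 1 when the input belongs to class 2.
Then, when a new input arrives, classify it as the class of the neuron with the larger output.
For a problem that separates 2 classes, the implementation is arranged to have 2 outputs.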

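$X = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix}$
$T = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$
The first three rows of $T$ are class 1 and the last row is class 2.
Two perceptrons are implemented: the first column of $T$ is the target for the first perceptron's output, and the second column is the target for the second perceptron's output.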
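A minimal sketch of the two-output arrangement, using the $X$ and $T$ given on this slide. Each output is trained as an independent linear unit with the per-sample update, and a new input is assigned to the class of the larger output. The learning rate, epoch count, seed, and function names are assumptions carried over from the single-output C example.

```python
import random

# Inputs and one-hot targets from the slide.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
T = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]


def forward(W, x):
    """Outputs of both linear units; W[k] = [w0, w1, w2] for unit k."""
    return [w[0] + w[1] * x[0] + w[2] * x[1] for w in W]


def train(eta=0.05, max_epoch=500, seed=0):
    """Train the two units independently, updating after each sample."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.25, 0.25) for _ in range(3)] for _ in range(2)]
    for _ in range(max_epoch):
        for x, t in zip(X, T):
            o = forward(W, x)
            for k in range(2):
                err = t[k] - o[k]
                W[k][0] += eta * err * 1.0
                W[k][1] += eta * err * x[0]
                W[k][2] += eta * err * x[1]
    return W


def classify(W, x):
    """Return 1 or 2: the class whose unit gives the larger output."""
    o = forward(W, x)
    return 1 if o[0] >= o[1] else 2


W = train()
print([classify(W, x) for x in X])
```

Comparing the two outputs replaces the fixed threshold of the single-output version: no threshold value has to be chosen.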

