CNN_training test dataset, learning rate

황TL 2017. 8. 5. 01:20

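This part covers the learning rate.

In summary:

1) The smaller the rate, the more slowly the gradient descent algorithm moves down the cost surface.

2) The larger the rate, the faster it moves down.

3) If the rate is too large, each step is more likely to overshoot the minimum, so the cost can oscillate or diverge instead of converging.

4) Finding an appropriate rate is important.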
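
The effect of the rate on gradient descent can be tried on a toy problem. The following is a minimal sketch, not the lecture's code: it assumes a one-variable cost, cost(w) = w**2, and the function name gradient_descent, the starting point 5.0, and the three rates are illustrative choices only.

```python
def gradient_descent(learning_rate, steps=20, w0=5.0):
    """Minimize cost(w) = w**2 with plain gradient descent; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w               # derivative of w**2 with respect to w
        w = w - learning_rate * grad
    return w


# Hypothetical rates, chosen only to illustrate three regimes.
w_small = gradient_descent(0.001)    # tiny steps: barely moves away from 5.0
w_moderate = gradient_descent(0.1)   # reasonable steps: approaches the minimum at 0
w_large = gradient_descent(1.5)      # oversized steps: jumps past 0 and keeps growing

for name, w in [("small", w_small), ("moderate", w_moderate), ("large", w_large)]:
    print(name, "final w =", w, "cost =", w * w)
```

In this toy setting each step multiplies w by (1 - 2 * learning_rate), so a rate of 0.001 shrinks w very slowly, 0.1 shrinks it quickly, and 1.5 multiplies it by -2 at every step, which is why it diverges.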

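This part covers training data and test data.

The points made in the lecture are:

1) Train on the training data, then measure accuracy on the test data.

2) Measuring accuracy on the same training data that the model was trained on is not a valid evaluation.

Summary: giving a machine some data to learn from and then asking it to predict that same data is little more than recalling stored information.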
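
The "recalling stored information" point can be made concrete with a deliberately bad model. This is a sketch under stated assumptions: the lookup-table "model", the predict and accuracy helpers, and the default class 0 are hypothetical and are not part of the lab code. The inputs are the post's x_data and x_test, and the training labels are the class indices that the post says give accuracy 1.0.

```python
# Inputs from this post; labels are written as class indices.
x_train = [(1, 2, 1), (1, 3, 2), (1, 3, 4), (1, 5, 5),
           (1, 7, 5), (1, 2, 5), (1, 6, 6), (1, 7, 7)]
y_train = [2, 2, 2, 1, 1, 1, 0, 0]

x_test = [(2, 1, 1), (3, 1, 2), (3, 3, 4)]
y_test = [2, 2, 2]

# Hypothetical "model" that only memorizes: a lookup table from input to label.
memory = dict(zip(x_train, y_train))


def predict(x, default_class=0):
    """Return the memorized label, or a fixed default for unseen inputs."""
    return memory.get(x, default_class)


def accuracy(xs, ys):
    """Fraction of inputs whose predicted class equals the true class."""
    hits = sum(1 for x, y in zip(xs, ys) if predict(x) == y)
    return hits / len(ys)


train_accuracy = accuracy(x_train, y_train)
test_accuracy = accuracy(x_test, y_test)
print("train accuracy:", train_accuracy)
print("test accuracy:", test_accuracy)
```

The memorizing model looks perfect when it is scored on the data it has already seen and fails on every unseen test row, which is why accuracy should be reported on a separate test set.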

# Lab 7 Learning rate and Evaluation
import tensorflow as tf

tf.set_random_seed(777)  # for reproducibility

# Training dataset
x_data = [[1, 2, 1],
          [1, 3, 2],
          [1, 3, 4],
          [1, 5, 5],
          [1, 7, 5],
          [1, 2, 5],
          [1, 6, 6],
          [1, 7, 7]]

# One-hot labels, one row for each of the 8 rows in x_data
y_data = [[0, 0, 1],
          [0, 0, 1],
          [0, 0, 1],
          [0, 1, 0],
          [0, 1, 0],
          [0, 1, 0],
          [1, 0, 0],
          [1, 0, 0]]

# Evaluate our model using this test dataset
# Test dataset
x_test = [[2, 1, 1],
          [3, 1, 2],
          [3, 3, 4]]

y_test = [[0, 0, 1],
          [0, 0, 1],
          [0, 0, 1]]

# Placeholders (nodes) with N rows and 3 columns
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])

# W with 3 rows and 3 columns
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))

# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
# Hypothesis function using softmax
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)

# Cross entropy cost/loss
# Cost function for the softmax hypothesis
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

# Try to change learning_rate to small numbers
# Minimize the cost with gradient descent
optimizer = tf.train.GradientDescentOptimizer(
    learning_rate=1e-0).minimize(cost)

# Correct prediction Test model
# arg_max picks the index of the largest hypothesis value, i.e. the predicted class
# (tf.arg_max is the older name of tf.argmax).
# Check whether prediction matches the class index of the one-hot Y data.
# Cast the matches to floats and average them to get the accuracy,
# the same approach as in logistic regression.
prediction = tf.arg_max(hypothesis, 1)
is_correct = tf.equal(prediction, tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())

    # Run one optimizer step per iteration and print the step and hypothesis values.
    for step in range(201):
        hy_val, _ = sess.run(
            [hypothesis, optimizer], feed_dict={X: x_data, Y: y_data})
        print(step, hy_val)

    # predict_training
    # Compute prediction and accuracy on the training dataset.
    print("Prediction:", sess.run(prediction, feed_dict={X: x_data}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_data, Y: y_data}))

    # predict_test
    # Compute prediction and accuracy on the test dataset.
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))
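
The hypothesis and cost lines in the listing can be checked by hand on a single row. The following is a rough sketch in plain Python rather than TensorFlow: the logits [2.0, 1.0, 0.1] are made-up numbers, and the helper names softmax and cross_entropy are chosen here for illustration.

```python
import math


def softmax(logits):
    """exp(logit) / sum(exp(logits)), computed for one row of logits."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift by the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, probs):
    """-sum(Y * log(hypothesis)) for one row."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))


logits = [2.0, 1.0, 0.1]                        # hypothetical output of X*W + b
probs = softmax(logits)                          # probabilities that sum to 1
cost_if_class0 = cross_entropy([1, 0, 0], probs)  # label agrees with the largest probability
cost_if_class2 = cross_entropy([0, 0, 1], probs)  # label points at the smallest probability

print("probabilities:", probs)
print("cost when the label is class 0:", cost_if_class0)
print("cost when the label is class 2:", cost_if_class2)
```

The cost is small when the label matches the class with the highest probability and large when it does not. The TensorFlow version computes the same per-row value for every row and then averages the rows with tf.reduce_mean.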

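For example, if the prediction on the training dataset comes out as [2,2,2,2,2,2,2,2], the accuracy is 0.375, because only 3 of the 8 training labels are class 2.

-> The prediction has to be [2,2,2,1,1,1,0,0] for the accuracy to be 1.0.

-> See the Y data of the training dataset.

Likewise, if the prediction on the test dataset comes out as [0,0,2], the accuracy is 0.333, because only 1 of the 3 predictions matches its label.

-> The prediction has to be [2,2,2] for the accuracy to be 1.0.

-> See the Y data of the test dataset.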
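
These accuracy figures can be reproduced without TensorFlow. The sketch below mirrors the three graph operations (arg_max on Y, equal, mean of the cast) in plain Python. The helper names argmax and accuracy are chosen here for illustration, and y_data is written with all 8 one-hot rows implied by the prediction [2,2,2,1,1,1,0,0].

```python
def argmax(values):
    """Zero-based index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, one_hot_labels):
    labels = [argmax(row) for row in one_hot_labels]             # like tf.arg_max(Y, 1)
    is_correct = [p == t for p, t in zip(predictions, labels)]   # like tf.equal
    return sum(1.0 for c in is_correct if c) / len(is_correct)   # like the mean of the cast


y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1],
          [0, 1, 0], [0, 1, 0], [0, 1, 0],
          [1, 0, 0], [1, 0, 0]]
y_test = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]

train_all_twos = accuracy([2, 2, 2, 2, 2, 2, 2, 2], y_data)
train_perfect = accuracy([2, 2, 2, 1, 1, 1, 0, 0], y_data)
test_one_hit = accuracy([0, 0, 2], y_test)

print(train_all_twos)   # 3 of 8 correct
print(train_perfect)    # 8 of 8 correct
print(test_one_hit)     # 1 of 3 correct
```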

* About arg_max

(Incorrect understanding)

1) Suppose the hypothesis produces the values a, b, c.

2) softmax gives the probability of each one.

3) The probabilities are ranked in order (arg_max)

a -> 0.6 -> 2

b -> 0.3 -> 1

c -> 0.1 -> 0

(Correct understanding)

1) Suppose the hypothesis produces the values a, b, c.

2) softmax gives the probability of each one.

3) The entry with the largest probability is selected (arg_max)

a -> 0.6 -> 1

b -> 0.3 -> 0

c -> 0.1 -> 0

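When working through examples, arg_max sometimes returns numbers other than 0 and 1, such as 2 to 9 (the MNIST example).

To see why, suppose the hypothesis is [0.2, 0.15, 0.2, 0.15, 0.3].

Taking arg_max gives 4: the largest value, 0.3, is in the fifth position, and positions are counted from 0.

In other words, arg_max can be thought of as marking the largest softmax output (hypothesis) with 1 and all the others with 0, and then returning the zero-based position of that 1.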
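
The difference between the two readings of arg_max can be seen in a few lines of plain Python. This is an illustrative sketch: the helpers argmax, one_hot, and ranks are defined here and are not TensorFlow functions, and ranks implements the incorrect reading only for contrast.

```python
def argmax(values):
    """Zero-based index of the largest value (what arg_max returns for one row)."""
    return max(range(len(values)), key=lambda i: values[i])


def one_hot(index, size):
    """Vector with 1 at the given index and 0 everywhere else."""
    return [1 if i == index else 0 for i in range(size)]


def ranks(values):
    """Rank of every value, 0 for the smallest. This is NOT what arg_max does."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


probs3 = [0.6, 0.3, 0.1]
probs5 = [0.2, 0.15, 0.2, 0.15, 0.3]

print(ranks(probs3))                          # incorrect reading: one rank per entry
print(one_hot(argmax(probs3), len(probs3)))   # correct reading: only the largest is marked
print(argmax(probs3))                         # arg_max itself returns the position
print(argmax(probs5))                         # position of 0.3, counted from 0
```

With ten classes, as in MNIST, the returned position can be anything from 0 to 9, which is why values other than 0 and 1 show up.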
