
Softmax in TensorFlow

The softmax function converts its inputs, known as logits or logit scores, to values between 0 and 1, and also normalizes the outputs so that they all sum up to 1. In other words, the softmax function turns your logits into probabilities. Mathematically, the softmax function is defined as follows:

softmax(x)_i = exp(x_i) / Σ_j exp(x_j)

TensorFlow implements this as tf.nn.softmax. It takes logits and returns softmax activations that have the same type and shape as the input logits.

The following code is used to implement this:

logit_data = [2.0, 1.0, 0.1]
logits = tf.placeholder(tf.float32)
softmax = tf.nn.softmax(logits)

with tf.Session() as sess:
    output = sess.run(softmax, feed_dict={logits: logit_data})
    print(output)
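The same computation can be checked with plain NumPy, independent of TensorFlow (a minimal sketch of the formula above):

```python
import numpy as np

def softmax(z):
    # Exponentiate each logit, then normalize so the outputs sum to 1
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

logit_data = np.array([2.0, 1.0, 0.1])
probs = softmax(logit_data)
print(probs)        # the largest logit gets the largest probability
print(probs.sum())  # 1.0
```

Note that the largest logit (2.0) maps to the largest probability, and the ordering of the logits is preserved.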

The way we represent labels mathematically is often called one-hot encoding. Each label is represented by a vector that has 1.0 for the correct label and 0.0 for everything else. This works well for most problems. However, when a problem has millions of labels, one-hot encoding is not efficient, since most of the vector elements are zeros. To compare a predicted probability vector S with a label vector L, we measure the distance between them using cross-entropy, denoted by D:

D(S, L) = -Σ_i L_i log(S_i)
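One-hot encoding can be sketched in a couple of lines of NumPy; the integer labels here are hypothetical:

```python
import numpy as np

labels = [2, 0, 1]   # hypothetical integer class labels
num_classes = 3

# Each row of the identity matrix is a one-hot vector,
# so indexing np.eye with the labels builds the encoding.
one_hot = np.eye(num_classes)[labels]
print(one_hot)
# [[0. 0. 1.]
#  [1. 0. 0.]
#  [0. 1. 0.]]
```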

Cross-entropy is not symmetric. That means: D(S,L) != D(L,S)
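The asymmetry is easy to demonstrate numerically. The sketch below uses two hypothetical soft distributions (soft labels, so both directions of D are well defined):

```python
import numpy as np

def D(s, l):
    # Cross-entropy D(S, L) = -sum(L * log(S))
    return -np.sum(l * np.log(s))

S = np.array([0.7, 0.2, 0.1])  # predicted distribution
L = np.array([0.5, 0.4, 0.1])  # target distribution
print(D(S, L))  # not equal to the value below
print(D(L, S))  # swapping the arguments gives a different number
```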

In machine learning, we usually define what it means for a model to be bad with a mathematical function. This function is called the loss, cost, or objective function. One very common function used to measure the loss of a model is the cross-entropy loss. The concept comes from information theory (for more on this, please refer to Visual Information Theory at https://colah.github.io/posts/2015-09-Visual-Information/). Intuitively, the loss will be high if the model does a poor job of classifying the training data, and it will be low otherwise, as shown here:

Cross-entropy loss function

In TensorFlow, we can write a cross-entropy function using tf.reduce_sum(); it takes an array of numbers and returns its sum as a tensor (see the following code block):

x = tf.constant([[1,1,1], [1,1,1]])
with tf.Session() as sess:
    print(sess.run(tf.reduce_sum([1,2,3])))  # returns 6
    print(sess.run(tf.reduce_sum(x, 0)))     # sum along axis 0, prints [2 2 2]

But in practice, while computing the softmax function, the intermediate exponentials can become very large, and dividing such large numbers is numerically unstable. We should therefore use TensorFlow's built-in softmax and cross-entropy loss APIs. The following code snippet manually calculates the cross-entropy loss and also prints the same value using the TensorFlow API:

import tensorflow as tf

softmax_data = [0.1, 0.5, 0.4]
onehot_data = [0.0, 1.0, 0.0]

softmax = tf.placeholder(tf.float32)
onehot_encoding = tf.placeholder(tf.float32)

# Manual cross-entropy: D(S, L) = -sum(L * log(S))
cross_entropy = -tf.reduce_sum(tf.multiply(onehot_encoding, tf.log(softmax)))

# The API expects logits, so we pass tf.log(softmax);
# applying softmax to these logits recovers the original probabilities
cross_entropy_loss = tf.nn.softmax_cross_entropy_with_logits(logits=tf.log(softmax), labels=onehot_encoding)

with tf.Session() as session:
    print(session.run(cross_entropy, feed_dict={softmax: softmax_data, onehot_encoding: onehot_data}))
    print(session.run(cross_entropy_loss, feed_dict={softmax: softmax_data, onehot_encoding: onehot_data}))

Both lines print the same value, -log(0.5) ≈ 0.6931, since the correct class was assigned probability 0.5.
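The standard trick TensorFlow's fused API uses to avoid the overflow mentioned above is to subtract the maximum logit before exponentiating. A NumPy sketch of this stable softmax:

```python
import numpy as np

def stable_softmax(z):
    # Subtracting max(z) leaves the result unchanged (numerator and
    # denominator are both scaled by exp(-max(z))), but keeps exp()
    # from overflowing when the logits are large.
    shifted = z - np.max(z)
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()

big_logits = np.array([1000.0, 1001.0, 1002.0])
print(stable_softmax(big_logits))  # well-defined probabilities
# a naive np.exp(1000.0) would overflow to inf and yield nan probabilities
```

Because softmax only depends on differences between logits, the shifted version is mathematically identical to the naive one, just computed on safe magnitudes.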
