- Neural Network Programming with TensorFlow
- Manpreet Singh Ghotra, Rajdeep Dua
Optimizers
We will study AdamOptimizer here; TensorFlow's AdamOptimizer implements Kingma and Ba's Adam algorithm, which adapts the effective learning rate during training. Adam has several advantages over the simple GradientDescentOptimizer. The main one is that it keeps exponential moving averages of the gradients and their squares, which lets Adam use a larger effective step size and converge to it without fine-tuning.
The disadvantage of Adam is that it requires extra computation, and extra state for the moving averages, for each parameter in each training step. GradientDescentOptimizer can be used instead, but it would require more hyperparameter tuning, notably of the learning rate, before it converged as quickly.
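As a minimal sketch of this trade-off (the learning rate shown for gradient descent is illustrative, not a tuned value):
adam = tf.train.AdamOptimizer()  # default learning_rate=0.001 usually works untuned
sgd = tf.train.GradientDescentOptimizer(learning_rate=0.01)  # needs an explicit, well-chosen rate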
The following example shows how to use AdamOptimizer:
- tf.train.Optimizer is the base class; instantiating a concrete subclass such as tf.train.AdamOptimizer creates an optimizer
- tf.train.Optimizer.minimize(loss, var_list) adds an optimization operation to the computation graph that minimizes loss by updating the variables in var_list
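Under the hood, minimize() combines two finer-grained calls, compute_gradients() and apply_gradients(); the following sketch shows the equivalent two-step form:
optimizer = tf.train.AdamOptimizer()
# equivalent to optimizer.minimize(loss):
gradsAndVars = optimizer.compute_gradients(loss)   # list of (gradient, variable) pairs
trainOp = optimizer.apply_gradients(gradsAndVars)  # adds the update operation to the graph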
Here, automatic differentiation computes gradients without user input:
import numpy as np
import seaborn
import matplotlib.pyplot as plt
import tensorflow as tf

# input dataset
xData = np.arange(100, step=.1)
yData = xData + 20 * np.sin(xData/10)

# scatter plot of the input data
plt.scatter(xData, yData)
plt.show()

# data size and batch size
nSamples = 1000
batchSize = 100

# reshape into column vectors
xData = np.reshape(xData, (nSamples, 1))
yData = np.reshape(yData, (nSamples, 1))

# input placeholders
x = tf.placeholder(tf.float32, shape=(batchSize, 1))
y = tf.placeholder(tf.float32, shape=(batchSize, 1))

# initialize weight and bias
with tf.variable_scope("linearRegression"):
    W = tf.get_variable("weights", (1, 1), initializer=tf.random_normal_initializer())
    b = tf.get_variable("bias", (1,), initializer=tf.constant_initializer(0.0))

    y_pred = tf.matmul(x, W) + b
    loss = tf.reduce_sum((y - y_pred)**2 / nSamples)

# optimizer
opt = tf.train.AdamOptimizer().minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # gradient descent loop for 500 steps
    for _ in range(500):
        # pick a random minibatch
        indices = np.random.choice(nSamples, batchSize)
        X_batch, y_batch = xData[indices], yData[indices]
        # one gradient descent step
        _, loss_val = sess.run([opt, loss], feed_dict={x: X_batch, y: y_batch})
Here is the scatter plot for the dataset:
[Figure: scatter plot of xData versus yData]
This is the plot of the learned model on the data:
[Figure: fitted regression line over the scattered data]
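The original figures are not reproduced here. As a sketch, the second plot can be regenerated once training finishes (run inside the with tf.Session() as sess: block, after the training loop; variable names follow the example above):
W_val, b_val = sess.run([W, b])  # fetch the learned parameters
plt.scatter(xData, yData, label="data")
plt.plot(xData, xData.dot(W_val) + b_val, color="red", label="learned model")
plt.legend()
plt.show()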