
Analyzing the Iris dataset

Let's look at a feedforward example using the Iris dataset.

We will use all 150 rows of the Iris dataset, made up of 50 samples from each of three Iris species: Iris setosa, Iris virginica, and Iris versicolor.

Figure: Petal geometry compared across the three Iris species: Iris setosa, Iris virginica, and Iris versicolor.

In the dataset, each row contains the data for one flower sample: sepal length, sepal width, petal length, petal width, and flower species. Flower species are stored as integers, with 0 denoting Iris setosa, 1 denoting Iris versicolor, and 2 denoting Iris virginica.

First, we will create a run() function that takes three parameters: the hidden layer size h_size, the standard deviation for the weights stddev, and the step size for stochastic gradient descent sgd_step:

def run(h_size, stddev, sgd_step):
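
For illustration, run() could later be invoked as follows; the hyperparameter values here are examples, not values prescribed by the chapter:

run(128, 0.1, 0.01)  # 128 hidden neurons, weight stddev 0.1, SGD step 0.01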

Input data is loaded using the genfromtxt function in NumPy. The loaded Iris data has a shape of L: 150 and W: 4 and is stored in the all_X variable. Target labels are loaded from target.csv into all_y, one-hot encoded to a shape of L: 150, W: 3:

import numpy as np
from numpy import genfromtxt
from sklearn.model_selection import train_test_split

def load_iris_data():
    # Load the feature matrix and integer labels from CSV files
    data = genfromtxt('iris.csv', delimiter=',')
    target = genfromtxt('target.csv', delimiter=',').astype(int)
    # Prepend the column of 1s for bias
    L, W = data.shape
    all_X = np.ones((L, W + 1))
    all_X[:, 1:] = data
    # One-hot encode the integer labels
    num_labels = len(np.unique(target))
    all_y = np.eye(num_labels)[target]
    # RANDOMSEED is a seed constant assumed to be defined earlier in the chapter
    return train_test_split(all_X, all_y, test_size=0.33,
                            random_state=RANDOMSEED)
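
The np.eye(num_labels)[target] line one-hot encodes the integer labels: indexing the identity matrix with the label array selects one row per label. A standalone illustration:

import numpy as np

target = np.array([0, 1, 2, 1])  # integer species labels
one_hot = np.eye(3)[target]      # row i of the identity matrix is the one-hot vector for class i
print(one_hot)
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]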

Once the data is loaded, we initialize the weight matrices based on x_size, y_size, and h_size, with the standard deviation passed to the run() method:

  • x_size = 5
  • y_size = 3
  • h_size = 128 (or any other number of neurons chosen for the hidden layer)
# Size of layers
x_size = train_x.shape[1]  # Input nodes: 4 features and 1 bias
y_size = train_y.shape[1]  # Outcomes (3 Iris species)
# Placeholders for inputs/targets and the two weight matrices
X = tf.placeholder("float", shape=[None, x_size])
y = tf.placeholder("float", shape=[None, y_size])
weights_1 = initialize_weights((x_size, h_size), stddev)
weights_2 = initialize_weights((h_size, y_size), stddev)
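
The initialize_weights() helper is referenced but not defined in this excerpt; a minimal sketch, assuming it samples initial values from a normal distribution with the standard deviation passed to run():

import tensorflow as tf

def initialize_weights(shape, stddev):
    # Assumed implementation: draw initial weights from a normal
    # distribution with the given standard deviation
    weights = tf.random_normal(shape, stddev=stddev)
    return tf.Variable(weights)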

Next, we make the prediction using sigmoid as the activation function, defined in the forward_propagation() function:

def forward_propagation(X, weights_1, weights_2):
    # Hidden layer: sigmoid activation over the first affine transform
    sigmoid = tf.nn.sigmoid(tf.matmul(X, weights_1))
    # Output layer: raw logits from the second affine transform
    y = tf.matmul(sigmoid, weights_2)
    return y

First, the sigmoid output is calculated from the input X and weights_1. It is then used to calculate y as the matrix product of the sigmoid output and weights_2:

y_pred = forward_propagation(X, weights_1, weights_2)
predict = tf.argmax(y_pred, axis=1)
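
tf.argmax picks, for each row of logits, the index of the largest value, which serves as the predicted class. A toy NumPy illustration with the same semantics:

import numpy as np

logits = np.array([[0.1, 2.3, 0.4],   # largest value at index 1
                   [3.0, 0.2, 0.1]])  # largest value at index 0
print(np.argmax(logits, axis=1))      # [1 0] -> versicolor, setosa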

Next, we define the cost function and optimize it using gradient descent. Let's look at the GradientDescentOptimizer being used: it is defined in the tf.train.GradientDescentOptimizer class and implements the gradient descent algorithm.

To construct an instance, we use the following constructor and pass sgd_step as a parameter:

# Constructor for GradientDescentOptimizer
__init__(
    learning_rate,
    use_locking=False,
    name='GradientDescent'
)

Arguments passed are explained here:

  • learning_rate: A tensor or a floating point value. The learning rate to use.
  • use_locking: If True, use locks for update operations.
  • name: Optional name prefix for the operations created when applying gradients. The default name is "GradientDescent".

The following code implements the cost function and the gradient descent update:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=y_pred))
updates_sgd = tf.train.GradientDescentOptimizer(sgd_step).minimize(cost)

Next, we will implement the following steps:

  1. Initialize the TensorFlow session:
sess = tf.Session()
  2. Initialize all the variables using tf.global_variables_initializer() (tf.initialize_all_variables() in older TensorFlow releases); the returned operation is run inside the session to initialize the variables.
  3. Iterate over the training steps (1 to 50).
  4. In each step, for every sample in train_x and train_y, execute updates_sgd.
  5. Calculate the train_accuracy and test_accuracy.

We store the accuracy for each step in lists so that we can plot a graph:

init = tf.global_variables_initializer()  # tf.initialize_all_variables() in older TensorFlow releases
steps = 50
sess.run(init)
x = np.arange(steps)
test_acc = []
train_acc = []
print("Step, train accuracy, test accuracy")
for step in range(steps):
    # Train with each example
    for i in range(len(train_x)):
        sess.run(updates_sgd, feed_dict={X: train_x[i: i + 1], y: train_y[i: i + 1]})

    # Accuracy over the full train and test splits after this pass
    train_accuracy = np.mean(np.argmax(train_y, axis=1) ==
                             sess.run(predict, feed_dict={X: train_x, y: train_y}))
    test_accuracy = np.mean(np.argmax(test_y, axis=1) ==
                            sess.run(predict, feed_dict={X: test_x, y: test_y}))

    print("%d, %.2f%%, %.2f%%"
          % (step + 1, 100. * train_accuracy, 100. * test_accuracy))

    test_acc.append(100. * test_accuracy)
    train_acc.append(100. * train_accuracy)
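
The accuracy lists can then be plotted against the step index x; a minimal matplotlib sketch (the plotting code is an assumption, not shown in this excerpt):

import matplotlib.pyplot as plt

plt.plot(x, train_acc, label='Train accuracy')  # x = np.arange(steps), defined above
plt.plot(x, test_acc, label='Test accuracy')
plt.xlabel('SGD step')
plt.ylabel('Accuracy (%)')
plt.legend()
plt.show()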