
Gradient descent

An SGD implementation of gradient descent uses a simple distributed sampling of the data examples. The loss part of the optimization problem is $\frac{1}{n} \sum_{i=1}^{n} L(w; x_i, y_i)$, and therefore $\frac{1}{n} \sum_{i=1}^{n} L'_{w,i}$ is a true (sub)gradient.

Computing this requires access to the full dataset, which is not optimal.

The parameter miniBatchFraction specifies the fraction of the full data to use instead. The average of the gradients over this subset, $\frac{1}{|S|} \sum_{i \in S} L'_{w,i}$, is a stochastic gradient; here, $S$ is a sampled subset of size $|S| = \text{miniBatchFraction} \cdot n$.
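The mini-batch averaging above can be sketched in plain Scala (no Spark) for a least-squares loss $L_i(w) = \frac{1}{2}(w \cdot x_i - y_i)^2$. This is an illustrative sketch only; the names here (MiniBatchGradientSketch, miniBatchGradient) are hypothetical and are not MLlib APIs.

```scala
import scala.util.Random

// Illustrative sketch (not an MLlib API) of the stochastic gradient
// (1/|S|) * sum_{i in S} L'_{w,i} for least-squares loss,
// where S is sampled with probability miniBatchFraction per example.
object MiniBatchGradientSketch {
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  def miniBatchGradient(w: Array[Double],
                        xs: Array[Array[Double]],
                        ys: Array[Double],
                        miniBatchFraction: Double,
                        rng: Random): Array[Double] = {
    // Sample the subset S: each example is kept with probability miniBatchFraction.
    val s = xs.indices.filter(_ => rng.nextDouble() < miniBatchFraction)
    val grad = Array.fill(w.length)(0.0)
    // Per-example gradient of (1/2)(w.x_i - y_i)^2 is (w.x_i - y_i) * x_i.
    for (i <- s) {
      val err = dot(w, xs(i)) - ys(i)
      for (j <- w.indices) grad(j) += err * xs(i)(j)
    }
    // Average over the sampled subset.
    if (s.nonEmpty) grad.map(_ / s.size) else grad
  }
}
```

With miniBatchFraction = 1.0, $S$ is the full dataset and the result is the true gradient; smaller fractions trade accuracy of each step for cheaper iterations.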

In the following code, we show how to use stochastic gradient descent on a mini-batch to calculate the weights and the loss. The output of this program is a vector of weights and the loss.

import scala.util.Random

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{GradientDescent, LogisticGradient, SquaredL2Updater}

object SparkSGD {
  def main(args: Array[String]): Unit = {
    val m = 4       // number of data points
    val n = 200000  // number of features
    val sc = new SparkContext("local[2]", "")
    // Generate m random points with label 1.0, spread over 2 partitions.
    val points = sc.parallelize(0 until m, 2).mapPartitionsWithIndex { (idx, iter) =>
      val random = new Random(idx)
      iter.map(i => (1.0, Vectors.dense(Array.fill(n)(random.nextDouble()))))
    }.cache()
    // Run mini-batch SGD with a logistic gradient and L2 regularization.
    val (weights, loss) = GradientDescent.runMiniBatchSGD(
      points,
      new LogisticGradient,
      new SquaredL2Updater,
      0.1, // stepSize
      2,   // numIterations
      1.0, // regParam
      1.0, // miniBatchFraction (1.0 = use the full dataset)
      Vectors.dense(new Array[Double](n))) // initial weights
    println("w:" + weights(0))
    println("loss:" + loss(0))
    sc.stop()
  }
}