
Gradient descent

The SGD implementation of gradient descent uses simple distributed sampling of the data examples. Loss is a part of the optimization problem, and therefore a true sub-gradient of the objective at the current weights $w$ is the average of the per-example sub-gradients:

$$f'_w = \frac{1}{n} \sum_{i=1}^{n} L'_{w,i}$$

Computing this requires access to the full dataset, which is not optimal. The parameter miniBatchFraction specifies the fraction of the full data to use instead. The average of the gradients over this subset,

$$\frac{1}{|S|} \sum_{i \in S} L'_{w,i}$$

is a stochastic gradient, where $S$ is a sampled subset of size $|S| = \text{miniBatchFraction} \cdot n$.
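To make the sampling concrete, here is a minimal sketch in plain Scala (independent of Spark; the pointGradient helper and all values are illustrative assumptions, not MLlib code) that forms a mini-batch stochastic gradient for logistic loss by averaging per-example gradients over a sampled subset S:

object MiniBatchGradientSketch {
  import scala.util.Random

  // Hypothetical per-example logistic-loss gradient for a label y in {0, 1}:
  // grad = (sigmoid(w . x) - y) * x
  def pointGradient(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
    val margin = w.zip(x).map { case (wi, xi) => wi * xi }.sum
    val multiplier = 1.0 / (1.0 + math.exp(-margin)) - y
    x.map(_ * multiplier)
  }

  def main(args: Array[String]): Unit = {
    val random = new Random(0)
    val n = 5 // feature dimension
    // 100 synthetic examples of the form (label, features)
    val data = Seq.fill(100)(
      (random.nextInt(2).toDouble, Array.fill(n)(random.nextDouble())))
    val w = new Array[Double](n) // current weights (all zeros)
    val miniBatchFraction = 0.1

    // Sample a subset S of size |S| = miniBatchFraction * (number of examples).
    val sampleSize = math.max(1, (miniBatchFraction * data.length).toInt)
    val sample = random.shuffle(data).take(sampleSize)

    // Stochastic gradient: (1 / |S|) * sum over i in S of L'_{w,i}.
    val stochasticGradient = sample
      .map { case (y, x) => pointGradient(w, x, y) }
      .transpose
      .map(_.sum / sampleSize)
    println(stochasticGradient.mkString(", "))
  }
}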

The following code shows how to use mini-batch stochastic gradient descent to compute the weights and the loss. The program outputs a vector of weights and the per-iteration loss.

import scala.util.Random

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{GradientDescent, LogisticGradient, SquaredL2Updater}

object SparkSGD {
  def main(args: Array[String]): Unit = {
    val m = 4      // number of examples
    val n = 200000 // number of features per example
    val sc = new SparkContext("local[2]", "SparkSGD")
    // Generate m labeled points with n random features each,
    // seeding the generator per partition for reproducibility.
    val points = sc.parallelize(0 until m, 2).mapPartitionsWithIndex { (idx, iter) =>
      val random = new Random(idx)
      iter.map(i => (1.0, Vectors.dense(Array.fill(n)(random.nextDouble()))))
    }.cache()
    // Run mini-batch SGD with logistic loss and L2 regularization.
    val (weights, loss) = GradientDescent.runMiniBatchSGD(
      points,
      new LogisticGradient,
      new SquaredL2Updater,
      0.1, // stepSize
      2,   // numIterations
      1.0, // regParam
      1.0, // miniBatchFraction
      Vectors.dense(new Array[Double](n))) // initial weights (zero vector)
    println("w:" + weights(0))
    println("loss:" + loss(0))
    sc.stop()
  }
}
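The pair returned by runMiniBatchSGD is the final weight vector and the history of the loss computed at each iteration, so weights(0) prints the first weight and loss(0) prints the loss recorded in the first of the two iterations. Note that with miniBatchFraction set to 1.0, every iteration samples the entire dataset, so this particular run behaves as batch gradient descent; lowering the fraction yields the stochastic variant described above.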