
Gradient descent

The SGD implementation of gradient descent uses a simple distributed sampling of the data examples. The loss is part of the optimization problem, and therefore is a true sub-gradient.

Computing a true sub-gradient requires access to the full dataset, which is not optimal.

The parameter miniBatchFraction specifies the fraction of the full data to use. Let S be a subset sampled from the n examples, with expected size |S| = miniBatchFraction * n. The average of the per-example gradients over this subset, (1/|S|) Σ_{i∈S} ∇L_i(w), is a stochastic gradient of the loss.
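To make the formula concrete, here is a minimal, Spark-free sketch of mini-batch SGD on synthetic data. The object name, squared loss, step size, and data-generating model are all assumptions made for this illustration; it is not the MLlib implementation.

```scala
import scala.util.Random

// Hypothetical sketch of mini-batch SGD (not the MLlib code): each step
// samples a subset S with expected size miniBatchFraction * n and descends
// along the averaged per-example gradient of a squared loss.
object MiniBatchSketch {
  // Fit a single weight w for the model y = w * x; the true weight is 3.0.
  def fit(miniBatchFraction: Double = 0.1,
          stepSize: Double = 0.1,
          iterations: Int = 100): Double = {
    val rng = new Random(42)
    val n = 1000
    // Synthetic data: labels generated exactly by y = 3 * x
    val data = Array.fill(n) { val x = rng.nextGaussian(); (3.0 * x, x) }
    var w = 0.0
    for (_ <- 1 to iterations) {
      // S: Bernoulli sample of the data, expected size miniBatchFraction * n
      val s = data.filter(_ => rng.nextDouble() < miniBatchFraction)
      if (s.nonEmpty) {
        // Stochastic gradient: (1/|S|) * sum of per-example gradients
        val g = s.map { case (y, x) => (w * x - y) * x }.sum / s.length
        w -= stepSize * g
      }
    }
    w
  }

  def main(args: Array[String]): Unit =
    println(f"estimated w = ${fit()}%.4f") // moves toward the true weight 3.0
}
```

Because each mini-batch gradient still points downhill on average, the estimate converges toward the true weight even though every step sees only a fraction of the data.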

In the following code, we show how to use stochastic gradient descent on a mini-batch to calculate the weights and the loss. The output of this program is a vector of weights and the loss.

import scala.util.Random

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{GradientDescent, LogisticGradient, SquaredL2Updater}

object SparkSGD {
  def main(args: Array[String]): Unit = {
    val m = 4       // number of examples
    val n = 200000  // number of features
    val sc = new SparkContext("local[2]", "")
    // Generate m random examples of dimension n, all labeled 1.0,
    // across 2 partitions (one RNG seed per partition).
    val points = sc.parallelize(0 until m,
      2).mapPartitionsWithIndex { (idx, iter) =>
      val random = new Random(idx)
      iter.map(i => (1.0,
        Vectors.dense(Array.fill(n)(random.nextDouble()))))
    }.cache()
    val (weights, loss) = GradientDescent.runMiniBatchSGD(
      points,
      new LogisticGradient,
      new SquaredL2Updater,
      0.1,  // stepSize
      2,    // numIterations
      1.0,  // regParam
      1.0,  // miniBatchFraction
      Vectors.dense(new Array[Double](n)))  // initial weights (all zeros)
    println("w:" + weights(0))
    println("loss:" + loss(0))
    sc.stop()
  }
}