
Gradient descent

An SGD implementation of gradient descent uses a simple distributed sampling of the data examples. The loss part of the optimization problem is the average L(w) := (1/n) Σ_{i=1}^{n} L(w; x_i, y_i), and therefore its gradient, (1/n) Σ_{i=1}^{n} ∂L(w; x_i, y_i)/∂w, is a true sub-gradient.

This requires access to the full dataset, which is not optimal.

The parameter miniBatchFraction specifies the fraction of the full data to use. The average of the gradients over this subset, (1/|S|) Σ_{i∈S} ∂L(w; x_i, y_i)/∂w, is a stochastic gradient. Here, S is a sampled subset of size |S| = miniBatchFraction · n.
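To make the mini-batch estimate above concrete, the following standalone sketch (no Spark; the least-squares loss, synthetic data, and all names are illustrative assumptions, not MLlib internals) samples a fraction of the examples each iteration and averages their per-example gradients:

```scala
import scala.util.Random

// A minimal sketch of mini-batch SGD under a squared loss on
// synthetic, noise-free data; constants here are illustrative.
object MiniBatchSketch {
  // Gradient of the squared loss for one example: (w.x - y) * x
  def pointGradient(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
    val err = w.zip(x).map { case (wi, xi) => wi * xi }.sum - y
    x.map(_ * err)
  }

  def fit(): Array[Double] = {
    val rng = new Random(42)
    // Synthetic data with true weights (2.0, 3.0): y = 2*x0 + 3*x1
    val data = Array.fill(1000) {
      val x = Array(rng.nextDouble(), rng.nextDouble())
      (x, 2.0 * x(0) + 3.0 * x(1))
    }
    val miniBatchFraction = 0.1
    val stepSize = 0.5
    var w = Array(0.0, 0.0)
    for (_ <- 1 to 500) {
      // Sample a subset S with expected size |S| = miniBatchFraction * n
      val batch = data.filter(_ => rng.nextDouble() < miniBatchFraction)
      if (batch.nonEmpty) {
        // Stochastic gradient: average of per-example gradients over S
        val grad = batch
          .map { case (x, y) => pointGradient(w, x, y) }
          .reduce((a, b) => a.zip(b).map { case (p, q) => p + q })
          .map(_ / batch.length)
        w = w.zip(grad).map { case (wi, gi) => wi - stepSize * gi }
      }
    }
    w
  }

  def main(args: Array[String]): Unit = {
    val w = fit()
    println(f"w: ${w(0)}%.3f, ${w(1)}%.3f")
  }
}
```

Because each batch is a uniform sample, the averaged gradient is an unbiased estimate of the full gradient, and the weights converge toward the true values despite each step using only a tenth of the data.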

In the following code, we show how to use stochastic gradient descent on a mini-batch to compute the weights and the loss. The output of this program is a vector of weights and the per-iteration loss history.

import scala.util.Random
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{GradientDescent, LogisticGradient, SquaredL2Updater}

object SparkSGD {
  def main(args: Array[String]): Unit = {
    val m = 4       // number of data points
    val n = 200000  // number of features per point
    val sc = new SparkContext("local[2]", "SparkSGD")
    // Generate m labeled points, each with n random features
    val points = sc.parallelize(0 until m, 2).mapPartitionsWithIndex { (idx, iter) =>
      val random = new Random(idx)
      iter.map(i => (1.0, Vectors.dense(Array.fill(n)(random.nextDouble()))))
    }.cache()
    val (weights, loss) = GradientDescent.runMiniBatchSGD(
      points,
      new LogisticGradient,
      new SquaredL2Updater,
      0.1,  // stepSize
      2,    // numIterations
      1.0,  // regParam
      1.0,  // miniBatchFraction
      Vectors.dense(new Array[Double](n)))  // initialWeights
    println("w:" + weights(0))
    println("loss:" + loss(0))
    sc.stop()
  }
}
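The LogisticGradient passed to runMiniBatchSGD computes the gradient of the logistic loss, which for a single example with label y in {0, 1} is (sigmoid(w·x) - y)·x. The following standalone sketch of that formula (the object and helper names are our own, not MLlib's) makes the computation explicit:

```scala
// A sketch of the logistic-loss gradient for one labeled point;
// names here are illustrative, not taken from MLlib.
object LogisticGradientSketch {
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Gradient of the logistic loss: (sigmoid(w.x) - y) * x
  def gradient(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
    val margin = dot(w, x)
    val multiplier = 1.0 / (1.0 + math.exp(-margin)) - y
    x.map(_ * multiplier)
  }
}
```

At w = 0 the sigmoid evaluates to 0.5, so for a positive example (y = 1) the multiplier is -0.5 and the gradient points along -x/2, pushing w·x upward on the next update.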