
Vectors in Spark

Spark MLlib uses Breeze and JBlas for internal linear algebraic operations. It represents vectors with its own org.apache.spark.mllib.linalg.Vector type, and instances are created through the org.apache.spark.mllib.linalg.Vectors factory. A local vector has integer-typed, 0-based indices and double-typed values. It is stored on a single machine and cannot be distributed. Spark MLlib supports two types of local vectors, dense and sparse, both created using factory methods. A dense vector stores every entry in a double array, while a sparse vector stores its size together with parallel arrays holding the indices and values of its nonzero entries.
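
To make the difference between the dense and sparse layouts concrete, here is a minimal plain-Scala sketch. It is not Spark's implementation: the object SparseLayoutSketch and the helper toDenseArray are hypothetical names introduced only for illustration, and the code assumes nothing beyond the Scala standard library.

```scala
// Hypothetical illustration only; not part of Spark MLlib.
object SparseLayoutSketch {

  // Expands a sparse (size, indices, values) description into the
  // equivalent dense array. Entries not listed in `indices` stay 0.0.
  def toDenseArray(size: Int,
                   indices: Array[Int],
                   values: Array[Double]): Array[Double] = {
    require(indices.length == values.length,
      "indices and values must have the same length")
    val dense = Array.fill(size)(0.0)
    indices.zip(values).foreach { case (i, v) => dense(i) = v }
    dense
  }

  def main(args: Array[String]): Unit = {
    // Size 4, nonzero entries at positions 0, 2 and 3.
    val dense = toDenseArray(4, Array(0, 2, 3), Array(1.0, 2.0, 3.0))
    println(dense.mkString("[", ",", "]")) // prints [1.0,0.0,2.0,3.0]
  }
}
```

The sparse form needs only the three nonzero values and their positions, which is why it tends to pay off when most entries are zero.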

The following code snippet shows how to create basic sparse and dense vectors in Spark:

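import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Dense vector (1.0, 0.0, 2.0).
val dVectorOne: Vector = Vectors.dense(1.0, 0.0, 2.0)
println("dVectorOne:" + dVectorOne)
// Sparse vector (1.0, 0.0, 2.0, 3.0), specified by its size and by the
// indices and values of its nonzero entries.
val sVectorOne: Vector = Vectors.sparse(4, Array(0, 2, 3),
  Array(1.0, 2.0, 3.0))
println("sVectorOne:" + sVectorOne)
// The same sparse vector (1.0, 0.0, 2.0, 3.0), specified as
// (index, value) pairs for its nonzero entries.
val sVectorTwo: Vector = Vectors.sparse(4, Seq((0, 1.0), (2, 2.0),
  (3, 3.0)))
println("sVectorTwo:" + sVectorTwo)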

The preceding code produces the following output:

dVectorOne:[1.0,0.0,2.0]
sVectorOne:(4,[0,2,3],[1.0,2.0,3.0])
sVectorTwo:(4,[0,2,3],[1.0,2.0,3.0])

Spark exposes several methods for inspecting a vector and its values, as shown next:

val sVectorOneMax = sVectorOne.argmax // index of the largest value
val sVectorOneNumNonZeros = sVectorOne.numNonzeros
val sVectorOneSize = sVectorOne.size
val sVectorOneArray = sVectorOne.toArray // values as an Array[Double]
val sVectorOneJson = sVectorOne.toJson

println("sVectorOneMax:" + sVectorOneMax)
println("sVectorOneNumNonZeros:" + sVectorOneNumNonZeros)
println("sVectorOneSize:" + sVectorOneSize)
// Concatenating an Array prints its JVM reference, not its contents.
println("sVectorOneArray:" + sVectorOneArray)
println("sVectorOneJson:" + sVectorOneJson)
val dVectorOneToSparse = dVectorOne.toSparse
println("dVectorOneToSparse:" + dVectorOneToSparse)

The preceding code produces the following output:

sVectorOneMax:3
sVectorOneNumNonZeros:3
sVectorOneSize:4
sVectorOneArray:[D@38684d54
sVectorOneJson:{"type":0,"size":4,"indices":[0,2,3],"values":[1.0,2.0,3.0]}

dVectorOneToSparse:(3,[0,2],[1.0,2.0])
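
The sVectorOneArray line in the output above shows a JVM array reference rather than the vector's values. The following sketch shows one way to print the contents and to convert a sparse vector to its dense form. It assumes the spark-mllib artifact is on the classpath; the object name VectorInspectionSketch is hypothetical and used only for illustration.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical wrapper object; only the Vector and Vectors calls are Spark APIs.
object VectorInspectionSketch {
  def main(args: Array[String]): Unit = {
    val sparse: Vector =
      Vectors.sparse(4, Array(0, 2, 3), Array(1.0, 2.0, 3.0))

    // toArray returns an Array[Double]; mkString renders its elements
    // instead of the array's object reference.
    println("asArray:" + sparse.toArray.mkString("[", ",", "]"))

    // toDense materializes every entry, including the zeros.
    val dense = sparse.toDense
    println("asDense:" + dense)
  }
}
```

Converting in the other direction uses toSparse, as in the preceding snippet.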