
SparkSession

SparkSession is the single point of entry for programming with the DataFrame and Dataset APIs.

First, we need to create an instance of the SparkConf class and use it to create the SparkSession instance. Consider the following example:

val spConfig = (new SparkConf).setMaster("local").setAppName("SparkApp")
val spark = SparkSession
  .builder()
  .appName("SparkUserData")
  .config(spConfig)
  .getOrCreate()

Next, we can use the spark object to create a DataFrame:

val user_df = spark.read.format("com.databricks.spark.csv")
  .option("delimiter", "|")
  .schema(customSchema)
  .load("/home/ubuntu/work/ml-resources/spark-ml/data/ml-100k/u.user")
val first = user_df.first()
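Note that customSchema is not defined in the snippet above; it describes the columns of the pipe-delimited u.user file. As a minimal sketch of that record layout, assuming the standard MovieLens 100k column names (user id, age, gender, occupation, zip code), each line can be parsed like this in plain Scala:

```scala
// Sketch of the record layout that customSchema must describe.
// Column names are assumptions based on the MovieLens 100k u.user format.
case class UserRecord(userId: Int, age: Int, gender: String,
                      occupation: String, zipCode: String)

// Split one pipe-delimited line into its five fields, matching the
// .option("delimiter", "|") setting used when reading the file.
def parseUser(line: String): UserRecord = {
  val f = line.split('|')
  UserRecord(f(0).toInt, f(1).toInt, f(2), f(3), f(4))
}

// Illustrative row in the u.user format (not necessarily from the real file)
val sample = "1|24|M|technician|85711"
val user = parseUser(sample)
```

A corresponding customSchema would then declare the same five columns with matching types (for example, IntegerType for the user id and age, StringType for the rest) via Spark's StructType.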