
SparkSession

SparkSession is the single entry point for programming with the DataFrame and Dataset APIs.

First, we create an instance of the SparkConf class and use it to build the SparkSession instance. Consider the following example:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val spConfig = new SparkConf().setMaster("local").setAppName("SparkApp")
val spark = SparkSession
  .builder()
  .appName("SparkUserData")
  .config(spConfig)
  .getOrCreate()

Next, we can use the spark object to create a DataFrame:

val user_df = spark.read.format("com.databricks.spark.csv")
  .option("delimiter", "|")
  .schema(customSchema)
  .load("/home/ubuntu/work/ml-resources/spark-ml/data/ml-100k/u.user")
val first = user_df.first()
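The customSchema passed to the reader above is not defined in this excerpt. A minimal sketch of what such a schema could look like for the pipe-delimited MovieLens u.user file is shown below; the field names are illustrative assumptions, not taken from the original text:

```scala
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical schema for the u.user file: one StructField per
// pipe-separated column, in file order. The names are assumptions.
val customSchema = StructType(Array(
  StructField("no", IntegerType, nullable = true),
  StructField("age", StringType, nullable = true),
  StructField("gender", StringType, nullable = true),
  StructField("occupation", StringType, nullable = true),
  StructField("zipCode", StringType, nullable = true)
))
```

Declaring an explicit schema like this avoids a pass over the data for schema inference and guarantees the column names and types the rest of the code expects.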