Authors: Rajdeep Dua, Manpreet Singh Ghotra, Nick Pentreath
SparkSession
SparkSession is the single entry point for programming with the DataFrame and Dataset APIs.
First, we need to create an instance of the SparkConf class and use it to create the SparkSession instance. Consider the following example:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val spConfig = (new SparkConf).setMaster("local").setAppName("SparkApp")
val spark = SparkSession
  .builder()
  .appName("SparkUserData")
  .config(spConfig)
  .getOrCreate()
Next, we can use the spark object to create a DataFrame:
// customSchema is a StructType describing the columns of the u.user file
val user_df = spark.read.format("com.databricks.spark.csv")
  .option("delimiter", "|")
  .schema(customSchema)
  .load("/home/ubuntu/work/ml-resources/spark-ml/data/ml-100k/u.user")
val first = user_df.first()
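The customSchema value referenced above is not defined in this snippet. One way to define it, assuming the standard MovieLens 100k u.user layout (pipe-delimited columns: user id | age | gender | occupation | zip code), is a sketch like the following; the column names chosen here are illustrative:

```scala
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical schema for the MovieLens 100k u.user file, whose
// pipe-delimited columns are: user id | age | gender | occupation | zip code.
val customSchema = StructType(Array(
  StructField("user_id", IntegerType, nullable = true),
  StructField("age", IntegerType, nullable = true),
  StructField("gender", StringType, nullable = true),
  StructField("occupation", StringType, nullable = true),
  StructField("zip_code", StringType, nullable = true)
))
```

With this schema in scope, user_df.first() returns a Row whose fields follow the declared column types, so age and user_id are parsed as integers rather than strings.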