
SparkSession

SparkSession is the single point of entry for programming with the DataFrame and Dataset APIs.

First, we create an instance of the SparkConf class and use it to build the SparkSession instance. Consider the following example:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val spConfig = new SparkConf().setMaster("local").setAppName("SparkApp")
val spark = SparkSession
  .builder()
  .appName("SparkUserData")
  .config(spConfig)
  .getOrCreate()

Next, we can use the spark object to create a DataFrame:

val user_df = spark.read.format("com.databricks.spark.csv")
  .option("delimiter", "|")
  .schema(customSchema)
  .load("/home/ubuntu/work/ml-resources/spark-ml/data/ml-100k/u.user")
val first = user_df.first()
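The customSchema object is not defined in this excerpt. Since the MovieLens 100k u.user file is pipe-delimited with the fields user id, age, gender, occupation, and zip code, one possible definition (the column names here are assumptions, not taken from the original) could be sketched as:

```scala
import org.apache.spark.sql.types.{StructType, StructField, IntegerType, StringType}

// Assumed schema for the MovieLens 100k u.user file; adjust the column
// names to match whatever the rest of the application expects.
val customSchema = StructType(Array(
  StructField("id", IntegerType, nullable = true),
  StructField("age", IntegerType, nullable = true),
  StructField("gender", StringType, nullable = true),
  StructField("occupation", StringType, nullable = true),
  StructField("zipCode", StringType, nullable = true)
))
```

Defining the schema up front avoids a schema-inference pass over the file and guarantees the column types instead of leaving them to the CSV reader's guesses.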