- Hands-On Big Data Analytics with PySpark
- Rudy Lai, Bartłomiej Potaczek
Getting Your Big Data into the Spark Environment Using RDDs
Primarily, this chapter will provide a brief overview of how to get your big data into the Spark environment using resilient distributed datasets (RDDs). We will be using a wide array of tools to interact with and modify this data so that useful insights can be extracted. We will first load the data into Spark RDDs and then carry out parallelization with Spark RDDs.
In this chapter, we will cover the following topics:
- Loading data onto Spark RDDs
- Parallelization with Spark RDDs
- Basics of RDD operation
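As a quick preview of the first two topics, here is a minimal sketch of creating RDDs in PySpark, assuming a local Spark installation; the application name and the file path are illustrative placeholders, not values from the book.

```python
from pyspark import SparkContext

# Create a SparkContext running locally on all available cores.
sc = SparkContext("local[*]", "RDDBasicsSketch")

# Parallelization: distribute an in-memory Python collection as an RDD.
numbers = sc.parallelize([1, 2, 3, 4, 5])
print(numbers.map(lambda x: x * 2).collect())  # [2, 4, 6, 8, 10]

# Loading data: read an external text file into an RDD.
# The path below is a placeholder; point it at a real file before running.
# lines = sc.textFile("data/sample.csv")
# print(lines.take(5))

sc.stop()
```

The same `SparkContext` is used throughout the chapter, whether the data starts as a local collection (`parallelize`) or as an external file (`textFile`).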