Chapter 3. ETL with Spark

So far, we have gone through the architecture of Spark and discussed RDDs in some detail. By the end of Chapter 2, Transformations and Actions with Spark RDDs, we had focused on PairRDDs and some of the transformations.

This chapter focuses on doing ETL with Apache Spark. We'll cover the following topics, which should help you take the next step with Apache Spark:

  • Understanding the ETL process
  • Commonly supported file formats
  • Commonly supported filesystems
  • Working with NoSQL databases

Let's get started!
