RDDs versus DataFrames versus Datasets

To be clear, we discourage using RDDs unless you have a strong reason to, for the following reasons:

  • In terms of abstraction level, RDDs are to Spark what assembler or machine code is to system programming
  • RDDs express how to do something and not what is to be achieved, leaving no room for optimizers
  • RDDs use a Spark-specific API; SQL is far more widely known

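Whenever possible, use Datasets: their static typing lets the compiler catch mistakes such as a misspelled field name before the job runs, and they still go through the same optimizer as DataFrames. Datasets are only available in statically typed languages such as Java and Scala. In other languages, such as Python or R, you have to stick with DataFrames.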
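To make the comparison concrete, the following is a minimal sketch of the three APIs side by side. It assumes Spark 2.x or later running in local mode; the `Person` case class, the sample rows, and the object name are hypothetical and exist only for this illustration. The RDD version spells out how the average is computed, while the DataFrame version only states what is wanted and leaves the execution strategy to the optimizer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

// Hypothetical record type, introduced only for this illustration.
case class Person(name: String, dept: String, age: Int)

object RddVsDataFrameVsDataset {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-vs-dataframe-vs-dataset")
      .master("local[*]") // assumption: local mode, for experimentation only
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data.
    val people = Seq(
      Person("Ann", "eng", 30),
      Person("Bob", "eng", 40),
      Person("Cid", "ops", 50)
    )

    // RDD: we describe HOW to compute the average age per department
    // (build sum/count pairs, merge them, divide). Spark runs exactly this.
    val rddAvg = spark.sparkContext.parallelize(people)
      .map(p => (p.dept, (p.age.toLong, 1L)))
      .reduceByKey { case ((s1, c1), (s2, c2)) => (s1 + s2, c1 + c2) }
      .mapValues { case (sum, count) => sum.toDouble / count }
    rddAvg.collect().foreach(println)

    // DataFrame: we describe WHAT we want; the optimizer picks the plan.
    val df = people.toDF()
    val dfAvg = df.groupBy("dept").agg(avg("age").as("avg_age"))
    dfAvg.explain() // prints the physical plan chosen by the optimizer
    dfAvg.show()

    // Dataset: same engine, but fields are checked by the Scala compiler.
    // Writing p.agee here would fail at compile time, whereas a misspelled
    // column name in a DataFrame expression only fails when the job runs.
    val ds = people.toDS()
    val seniorNames = ds.filter(p => p.age >= 40).map(p => p.name)
    seniorNames.show()

    spark.stop()
  }
}
```

Calling `explain()` on the DataFrame result is a convenient way to see what the optimizer did with the declarative query; the RDD pipeline has no equivalent because it is executed as written.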