RDDs versus DataFrames versus Datasets

To be clear, we discourage using RDDs unless you have a strong reason to do so. Our reasons are as follows:

  • In terms of abstraction level, RDDs are to Spark what assembler or machine code is to systems programming
  • RDDs express how to do something rather than what is to be achieved, which leaves an optimizer no room to improve the execution
  • RDDs have a Spark-specific API, whereas SQL is far more widely known
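
The "how versus what" point in the list above can be illustrated with a minimal sketch that computes the same per-key mean twice. The sensor names, column names, and the local master setting are made up for illustration and are not taken from the text.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

object RddVersusDataFrame {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and data are hypothetical.
    val spark = SparkSession.builder()
      .appName("rdd-vs-dataframe")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val readings = Seq(("sensor-a", 20.0), ("sensor-a", 22.0), ("sensor-b", 30.0))

    // RDD: we spell out HOW to compute the mean by carrying (sum, count) by hand.
    val rddMeans = spark.sparkContext
      .parallelize(readings)
      .mapValues(v => (v, 1L))
      .reduceByKey((a, b) => (a._1 + b._1, a._2 + b._2))
      .mapValues { case (sum, count) => sum / count }
      .collect()
      .toMap

    // DataFrame: we state WHAT we want and leave the planning to the optimizer.
    val dfMeans = readings.toDF("sensor", "temperature")
      .groupBy("sensor")
      .agg(avg("temperature").as("mean_temperature"))

    dfMeans.explain() // prints the physical plan Spark chose for the query
    dfMeans.show()
    println(rddMeans)

    spark.stop()
  }
}
```

The RDD version fixes the execution strategy in user code, while the DataFrame version only names the aggregation, so Spark is free to decide how to carry it out.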

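Whenever possible, use Datasets: their static typing catches type errors at compile time, and their encoders let Spark work on a compact binary representation of the data instead of generic JVM objects. The Dataset API is available only in statically typed languages such as Java and Scala. In other languages, such as Python or R, you have to stick with DataFrames.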
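
As a rough sketch of what static typing buys you, the following compares a typed Dataset filter with the equivalent untyped DataFrame filter. The `Reading` case class and its fields are hypothetical names chosen for this example.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type, defined at top level so Spark can derive an encoder.
case class Reading(sensor: String, temperature: Double)

object DatasetVersusDataFrame {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("dataset-vs-dataframe")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val ds: Dataset[Reading] =
      Seq(Reading("sensor-a", 20.0), Reading("sensor-b", 30.0)).toDS()

    // Typed access: a misspelled field such as r.temprature would not compile.
    val warm: Dataset[Reading] = ds.filter(r => r.temperature > 25.0)

    // Untyped access: column names are strings, so a typo is only detected
    // when Spark analyzes the query at runtime.
    val df: DataFrame = ds.toDF()
    val warmDf: DataFrame = df.filter($"temperature" > 25.0)

    warm.show()
    warmDf.show()

    spark.stop()
  }
}
```

Both filters select the same rows here. The difference is when a mistake in a field name or type surfaces: at compile time for the Dataset, at runtime for the DataFrame.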
