
RDDs versus DataFrames versus Datasets

To be clear, we discourage using RDDs unless you have a strong reason to do so, for the following reasons:

  • In terms of abstraction level, RDDs are to Spark what assembler or machine code is to systems programming
  • RDDs express how to do something rather than what is to be achieved, leaving no room for optimizers
  • RDDs use a Spark-specific API, whereas SQL is far more widely known

Whenever possible, use Datasets: their static typing catches type errors at compile time rather than at run time. Datasets are available only in statically typed languages such as Java and Scala; in other languages, such as Python or R, you have to stick with DataFrames.
