Preface

As data scientists and machine learning professionals, our job is to build models that detect fraud, predict customer churn, or, more broadly, turn data into insights. This sometimes requires us to process huge amounts of data and handle complicated computations. We are therefore always excited to see new computing tools, such as Spark, and we spend a lot of time learning about them. Plenty of learning material is available for these new tools, but most of it takes a computing perspective and is often written by computer scientists.

We, the data scientists and machine learning professionals who use Spark, care more about how the new systems can help us build models with better predictive accuracy and how they can make data processing and coding easier for us. This is the main reason this book was developed, and why it was written by a data scientist.

At the same time, we, as data scientists and machine learning professionals, have already developed our own frameworks and processes, and we already use some good model-building tools, such as R and SPSS. We understand that some of the new tools, such as Spark's MLlib, may replace certain old tools, but not all of them. Using Spark together with our existing tools is therefore essential to us as Spark users. It is one of the main focuses of this book, and one of the critical elements that make it different from other Spark books.

Overall, this is a Spark book written by a data scientist for data scientists and machine learning professionals, with the aim of making machine learning with Spark easy for us.
