Preface

As data scientists and machine learning professionals, our job is to build models for detecting fraud, predicting customer churn, or, in a broad sense, turning data into insights. To do this, we sometimes need to process huge amounts of data and handle complicated computations. We are therefore always excited to see new computing tools, such as Spark, and we spend a lot of time learning about them. Plenty of learning materials are available for these new tools, but most take a computing perspective and are often written by computer scientists.

As users of Spark, we data scientists and machine learning professionals are more concerned with how the new systems can help us build models with better predictive accuracy and how they can make data processing and coding easier for us. This is the main reason why this book was developed and why it was written by a data scientist.

At the same time, we have already developed our own frameworks and processes and have been using some good model-building tools, such as R and SPSS. We understand that some of the new tools, such as Spark's MLlib, may replace certain old tools, but not all of them. Using Spark together with our existing tools is therefore essential to us as Spark users. It is one of the main focuses of this book and one of the critical elements that make it different from other Spark books.

Overall, this is a Spark book written by a data scientist for data scientists and machine learning professionals, with the aim of making machine learning with Spark easy for us.
