- Mastering Machine Learning with Spark 2.x
- Alex Tellez Max Pumperla Michal Malohlava
- 2021-07-02 18:46:06
Data science - an iterative process
The process flow of many big data projects is iterative: there is a lot of back-and-forth in testing new ideas, adding new features, tweaking hyper-parameters, and so on, all with a fail-fast attitude. The end result of these projects is usually a model that can answer the question being posed. Notice that we didn't say accurately answer the question being posed! A common pitfall for data scientists these days is failing to build a model that generalizes to new data: they overfit the training data, so the model gives poor results on data it has not seen. Accuracy is extremely task-dependent and is usually dictated by business needs, with some sensitivity analysis done to weigh the costs and benefits of the model outcomes. However, there are a few standard accuracy measures that we will go over throughout this book so that you can compare various models and see how changes to a model impact the result.
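To make the overfitting pitfall concrete, here is a minimal sketch in plain Python rather than Spark. Everything in it is an assumption made for illustration and is not taken from the book: the toy data (points in [0, 1) labeled by `x > 0.5` with 20% of labels flipped), the helper names `make_data`, `fit_memorizer`, and `fit_threshold`, and the use of plain classification accuracy as the measure. It compares a model that memorizes the training set with a much simpler one, scoring both on the training data and on held-out data.

```python
import random

def make_data(n, noise, rng):
    """Toy data: x in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label  # simulate a noisy, imperfect label
        data.append((x, label))
    return data

def accuracy(predict, data):
    """Fraction of examples whose predicted label matches the recorded label."""
    return sum(predict(x) == y for x, y in data) / len(data)

def fit_memorizer(train):
    """1-nearest-neighbour: predicts the label of the closest training point."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict

def fit_threshold(train):
    """Pick the single cut-off t that maximizes training accuracy of the rule x > t."""
    candidates = [i / 100 for i in range(101)]
    best = max(candidates, key=lambda t: accuracy(lambda x: x > t, train))
    return lambda x: x > best

rng = random.Random(42)             # fixed seed so runs are repeatable
train = make_data(200, 0.2, rng)    # data the models are fitted on
test = make_data(2000, 0.2, rng)    # new data the models have never seen

memorizer = fit_memorizer(train)
threshold = fit_threshold(train)

mem_train, mem_test = accuracy(memorizer, train), accuracy(memorizer, test)
thr_train, thr_test = accuracy(threshold, train), accuracy(threshold, test)

print(f"memorizer: train={mem_train:.3f}  held-out={mem_test:.3f}")
print(f"threshold: train={thr_train:.3f}  held-out={thr_test:.3f}")
```

The memorizer scores perfectly on the data it was fitted on because it has also memorized the flipped labels, and that is exactly what hurts it on held-out data. The threshold rule cannot fit the noise, so its training and held-out scores stay close together. Comparing the two scores for each candidate model is one simple way to check, on every iteration, whether a change improved the model or merely overfit it.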