What this book covers

Chapter 1, Data Science Overview, covers what the term data science means, the need for data science, the difference compared with traditional BI/DWH, and the competencies and knowledge required in order to be a data scientist.

Chapter 2, SQL Server 2017 as a Data Science Platform, explains the architecture of SQL Server from a data science perspective: In-Memory OLTP for data acquisition; Integration Services as a transformation feature set; Reporting Services for visualization of input as well as output data; and, probably most importantly of all, T-SQL as a language for data exploration and transformation, and Machine Learning Services for building the models themselves.

Chapter 3, Data Sources for Analytics, covers relational databases and NoSQL concepts side by side as valuable sources of data that call for different approaches. It also provides an overview of technologies such as HDInsight, Apache Hadoop, and Cosmos DB, and of querying against such data sources.

Chapter 4, Data Transforming and Cleaning with T-SQL, demonstrates T-SQL techniques that are useful for making data consumable and complete for further utilization in data science, along with database architectures that are useful for transform/cleansing tasks.

Chapter 5, Data Exploration and Statistics with T-SQL, takes a deep dive into T-SQL capabilities, including common grouping and aggregations, framing/windowing, running aggregates, and (if needed) features such as custom CLR aggregates (with performance considerations).

Chapter 6, Custom Aggregations on SQL Server, explains how to create your own aggregations in order to enhance core T-SQL functionality.

Chapter 7, Data Visualization, explains the importance of visualizing data to reveal hidden patterns in it, along with examples using Reporting Services, Power View, and Power BI. As an alternative, an overview of R/Python visualization features is also provided (as these languages will play a vital role later in the book).

Chapter 8, Data Transformations with Other Tools, explains how to use Integration Services, and possibly R or Python, to transform data into a useful format. It covers replacing missing values, detecting mistakes in datasets, normalization and its purpose, categorization, and finally data denormalization using views for better analytic purposes.

Chapter 9, Predictive Model Training and Evaluation, concerns a wide set of predictive models (clustering, N-point Bayes machines, recommenders) and their implementations via Machine Learning Studio, R, or Python.

Chapter 10, Making Predictions, explains how to use the models created, evaluated, and scored in previous chapters. We will also learn how to make a model self-learning from the predictions it makes.

Chapter 11, Getting It All Together – a Real-World Example, demonstrates how to use certain features to grab, transform, and analyze data for a successful data science case.

Chapter 12, Next Steps with Data Science and SQL, summarizes the main points of all the preceding chapters and draws conclusions from them. The chapter also provides ideas on how to continue working with data science, which trends can be expected in the future, and which other technologies will play strong roles in data science.
