- Mastering Java for Data Science
- Alexey Grigorev
- 2021-07-02 23:44:31
Data science
Data science is the discipline of extracting actionable knowledge from data of various forms. The name data science emerged quite recently: it is commonly credited to DJ Patil and Jeff Hammerbacher and was popularized by the 2012 article Data Scientist: The Sexiest Job of the 21st Century. The discipline itself, however, had existed for quite a while before that and was previously known by other names, such as data mining or predictive analytics. Data science, like its predecessors, is built on statistics and machine learning algorithms for knowledge extraction and model building.
The science part of the term data science is no coincidence. If we look up science, its definition can be summarized as the systematic organization of knowledge in terms of testable explanations and predictions. This is exactly what data scientists do: by extracting patterns from available data, they make predictions about future, unseen data, and they make sure these predictions are validated beforehand.
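The idea of validating predictions beforehand can be illustrated with a small sketch. This is not code from the book: the class name `HoldoutSketch`, its methods, and the numbers are made up for illustration. It fits a straight line to part of the data and then measures the error on rows that were held out, which stand in for future, unseen data.

```java
import java.util.Arrays;

// Hypothetical illustration: fit a model on one part of the data and
// check its predictions on a held-out part before trusting it.
final class HoldoutSketch {

    /** Fits y = slope * x + intercept by least squares; returns {slope, intercept}. */
    static double[] fit(double[] x, double[] y) {
        double meanX = Arrays.stream(x).average().getAsDouble();
        double meanY = Arrays.stream(y).average().getAsDouble();
        double covXY = 0.0;
        double varX = 0.0;
        for (int i = 0; i < x.length; i++) {
            covXY += (x[i] - meanX) * (y[i] - meanY);
            varX += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = covXY / varX;
        return new double[] {slope, meanY - slope * meanX};
    }

    /** Average absolute difference between the model's predictions and the true values. */
    static double meanAbsoluteError(double[] model, double[] x, double[] y) {
        double total = 0.0;
        for (int i = 0; i < x.length; i++) {
            double predicted = model[0] * x[i] + model[1];
            total += Math.abs(predicted - y[i]);
        }
        return total / x.length;
    }

    public static void main(String[] args) {
        // Made-up observations, used only to demonstrate the workflow.
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8};
        double[] y = {2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 15.9};

        // The first six rows are used for learning; the last two are held out.
        double[] model = fit(Arrays.copyOfRange(x, 0, 6), Arrays.copyOfRange(y, 0, 6));
        double error = meanAbsoluteError(
                model, Arrays.copyOfRange(x, 6, 8), Arrays.copyOfRange(y, 6, 8));

        System.out.printf("slope=%.3f intercept=%.3f held-out error=%.3f%n",
                model[0], model[1], error);
    }
}
```

A small held-out error suggests that the learned pattern carries over to data the model has not seen; a large one is a signal to revisit the model before relying on it.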
Nowadays, data science is used across many fields, including (but not limited to):
- Banking: Risk management (for example, credit scoring), fraud detection, trading
- Insurance: Claims management (for example, accelerating claim approval), risk and loss estimation, and fraud detection
- Health care: Predicting diseases (such as strokes, diabetes, cancer) and relapses
- Retail and e-commerce: Market basket analysis (identifying products that go well together), recommendation engines, product categorization, and personalized search
This book covers the following practical use cases:
- Predicting whether a URL is likely to appear on the first page of a search engine
- Predicting how fast an operation will be completed given the hardware specifications
- Ranking text documents for a search engine
- Checking whether there is a cat or a dog in a picture
- Recommending friends in a social network
- Processing large-scale textual data on a cluster of computers
In all these cases, we will use data science to learn from data and use the learned knowledge to solve a particular business problem.
We will also use a running example throughout the book: building a search engine. We will use it to illustrate many data science concepts, such as supervised machine learning, dimensionality reduction, text mining, and learning-to-rank models.