- Mastering Java for Data Science
- Alexey Grigorev
- 2021-07-02 23:44:28
Preface
Data science has become an important tool for organizations: they have collected large amounts of data, and to put it to good use they need data science, the discipline concerned with methods for extracting knowledge from data. Every day, more companies realize that they can benefit from data science and use the data they produce more effectively and more profitably.
This is especially true for IT companies, which already have the systems and the infrastructure for generating and processing data. These systems are often written in Java, the language of choice for many large and small companies across the world. This is no surprise: Java offers a solid and mature ecosystem of time-proven, reliable libraries, so many people trust Java and use it to build their applications.
Thus, Java is also a natural choice for many data processing applications. Since the existing systems are already written in Java, it makes sense to use the same technology stack for data science and to integrate machine learning models directly into the application's production code base.
This book covers exactly that. We will first see how to use Java's toolbox for processing small and large datasets, then look into initial exploratory data analysis. Next, we will review the Java libraries that implement common machine learning models for classification, regression, clustering, and dimensionality reduction problems. Then we will move on to more advanced techniques and discuss Information Retrieval and Natural Language Processing, XGBoost, deep learning, and large-scale tools for processing big datasets, such as Apache Hadoop and Apache Spark. Finally, we will look at how to evaluate and deploy the resulting models so that other services can use them.
We hope you will enjoy the book. Happy reading!