- Mastering Java for Data Science
- Alexey Grigorev
- 2021-07-02 23:44:33
Data Processing Toolbox
In the previous chapter, we discussed best practices for approaching data science problems. We looked at CRISP-DM, a methodology for data mining projects, in which one of the first steps is data preprocessing. In this chapter, we will take a closer look at how to do this in Java.
Specifically, we will cover the following topics:
- Standard Java library
- Extensions to the standard library
- Reading data from different sources such as text, HTML, JSON, and databases
- DataFrames for manipulating tabular data
Finally, we will put everything together to prepare the data for the search engine.
By the end of this chapter, you will be able to process data such that it can be used for machine learning and further analysis.