- Java Data Science Cookbook
- Rushdi Shams
- 2021-07-09 18:44:25
Chapter 1. Obtaining and Cleaning Data
In this chapter, we will cover the following recipes:
- Retrieving all file names from hierarchical directories using Java
- Retrieving all file names from hierarchical directories using Apache Commons IO
- Reading contents from text files all at once using Java 8
- Reading contents from text files all at once using Apache Commons IO
- Extracting PDF text using Apache Tika
- Cleaning ASCII text files using Regular Expressions
- Parsing Comma Separated Value files using Univocity
- Parsing Tab Separated Value files using Univocity
- Parsing XML files using JDOM
- Writing JSON files using JSON.simple
- Reading JSON files using JSON.simple
- Extracting web data from a URL using JSoup
- Extracting web data from a website using Selenium WebDriver
- Reading table data from MySQL database
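
As a taste of the first and third recipes in the list above, here is a minimal sketch that uses only the JDK (Java 8 or later). It is not the book's own code: the class name `FileRecipesSketch`, the helper names `listFileNames` and `readAll`, and the temporary directory layout in `main` are all assumptions made for illustration. It walks a directory tree with `Files.walk` to collect file names, then reads one file into a single string with `Files.readAllBytes`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, chosen for this sketch only.
class FileRecipesSketch {

    // Recursively collect the names of all regular files under rootDir, sorted.
    static List<String> listFileNames(Path rootDir) throws IOException {
        // Files.walk returns a lazy stream that holds open directories,
        // so it is closed with try-with-resources.
        try (Stream<Path> paths = Files.walk(rootDir)) {
            return paths.filter(Files::isRegularFile)
                        .map(p -> p.getFileName().toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    // Read the whole file into one String, decoding the bytes as UTF-8.
    static String readAll(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway directory tree (assumed layout) to run against.
        Path root = Files.createTempDirectory("recipes");
        Path nested = Files.createDirectories(root.resolve("a").resolve("b"));
        Files.write(root.resolve("top.txt"), "hello".getBytes(StandardCharsets.UTF_8));
        Files.write(nested.resolve("deep.txt"), "line1\nline2".getBytes(StandardCharsets.UTF_8));

        System.out.println(listFileNames(root));
        System.out.println(readAll(nested.resolve("deep.txt")));
    }
}
```

Reading a file all at once is convenient for small inputs; for very large files, a line-by-line approach such as `Files.lines` may be a better fit.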