Chapter 1. Obtaining and Cleaning Data

In this chapter, we will cover the following recipes:

  • Retrieving all file names from hierarchical directories using Java
  • Retrieving all file names from hierarchical directories using Apache Commons IO
  • Reading contents from text files all at once using Java 8
  • Reading contents from text files all at once using Apache Commons IO
  • Extracting PDF text using Apache Tika
  • Cleaning ASCII text files using regular expressions
  • Parsing Comma-Separated Values (CSV) files using Univocity
  • Parsing Tab-Separated Values (TSV) files using Univocity
  • Parsing XML files using JDOM
  • Writing JSON files using JSON.simple
  • Reading JSON files using JSON.simple
  • Extracting web data from a URL using jsoup
  • Extracting web data from a website using Selenium WebDriver
  • Reading table data from a MySQL database