- R for Data Science Cookbook
- Yu Wei Chiu (David Chiu)
- 2021-07-14 10:51:24
Introduction
Before data can be used to answer critical business questions, it must be prepared. Data is normally archived in files, which can be opened easily with Excel or a text editor. However, data can also reside in a range of other sources, such as databases, websites, and various file formats. Being able to import data from these sources is crucial.
This chapter covers four main types of data source: text files, Excel files, databases, and the web. Data recorded in text format is the simplest. As some users need to store data in a structured format, files with a .tab or .csv extension can be used to arrange data in a fixed number of columns. For many years, Excel has had a leading role in the field of data processing, and this software uses the .xls and .xlsx formats. Knowing how to read and manipulate data from databases is another crucial skill. Moreover, as most data is not stored in a database, one must know how to use web scraping to obtain data from the Internet. As part of this chapter, we introduce how to scrape data from the Internet using the rvest package.
Many experienced developers have already created packages that make it easier for beginners to obtain data, and we focus on leveraging these packages to perform data extraction, transformation, and loading. In this chapter, we first learn how to use R packages to read data in text format and to scan files line by line. We then move on to reading structured data from databases and Excel. Finally, we learn how to scrape Internet and social network data with R web scraping tools.
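As a small illustration of the first of these topics, the following sketch reads text-format data with base R only. The sample data, the file locations (temporary files), and the variable names are assumptions made for this example and do not come from the chapter's recipes.

```r
# Write a small sample file to a temporary location (hypothetical data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Ann,90.5", "2,Bob,78"), csv_path)

# Read comma-separated data into a data frame.
scores <- read.csv(csv_path, stringsAsFactors = FALSE)
str(scores)

# Save the same data as a tab-separated file and read it back.
tab_path <- tempfile(fileext = ".tab")
write.table(scores, tab_path, sep = "\t", row.names = FALSE, quote = FALSE)
scores_tab <- read.delim(tab_path, stringsAsFactors = FALSE)

# Scan the raw file line by line, without parsing it into columns.
raw_lines <- readLines(csv_path)
for (line in raw_lines) {
  cat("line:", line, "\n")
}
```

read.csv and read.delim are both wrappers around read.table with different default separators, so the same approach should carry over to other delimited text files by setting the sep argument.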