- R Web Scraping Quick Start Guide
- Olgun Aydin
- 2021-06-10 19:35:03
What this book covers
Chapter 1, Introduction to Web Scraping, introduces web scraping techniques, which are becoming increasingly popular now that data is considered as valuable as oil in the 21st century. This chapter gives detailed information about web scraping technologies and an overview of key languages for web scraping, such as XPath and regular expressions (regex). It also looks at web scraping libraries for R, such as rvest and RSelenium.
Chapter 2, Working with the XML Path Language and the Regular Expression Language, looks at XPath and regex rules, which are important to know when scraping a web page. This chapter provides useful information about both languages and gives you a chance to write XPath and regex rules from scratch.
Chapter 3, Web Scraping with rvest, covers the rvest library, developed by Hadley Wickham, which makes scraping a web page with R straightforward. This chapter offers tips and tricks for the library and shows how to write, from scratch, an R script that uses rvest to scrape a web page.
Chapter 4, Web Scraping with RSelenium, explores RSelenium. RSelenium is built on Selenium, a technology for testing web applications, but it's also useful for scraping web pages. This chapter gives an overview of Selenium and shows how to scrape a web page using the RSelenium library.
Chapter 5, Storing Data and Creating Cronjobs, deals with the matter of storage. After collecting data, you need to store the dataset somewhere, ideally in a cloud-based solution such as AWS (RDS or EC2), Google Cloud Platform, or Microsoft Azure. If you would like to schedule data collection, you can create a cron job to do so. This chapter gives an overview of databases and cloud platforms, and shows how to connect to databases and schedule cron jobs using R.