- Go Web Scraping Quick Start Guide
- Vincent Smith
- 2021-07-02 13:58:14
Search engines
One well-known use case for web scraping is indexing websites for the purpose of building a search engine. In this case, a web scraper would visit different websites and follow references to other websites in order to discover all of the content available on the internet. By collecting some of the content from the pages, you could respond to search queries by matching the terms to the contents of the pages you have collected. You could also suggest similar pages if you track how pages are linked together, and rank the most important pages by the number of connections they have to other sites.
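The matching and ranking ideas described above can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions, not how any real search engine works: the `Page` type, the `BuildIndex`, `InboundLinks`, and `Search` functions, and the `example` URLs are all hypothetical names invented here, pages are held in memory rather than scraped, and "importance" is reduced to a plain count of inbound links.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Page is a hypothetical record of what a scraper collected from one URL.
type Page struct {
	URL   string
	Text  string
	Links []string // outbound links found on the page
}

// BuildIndex maps each lowercase term to the set of URLs whose text contains it.
func BuildIndex(pages []Page) map[string]map[string]bool {
	index := make(map[string]map[string]bool)
	for _, p := range pages {
		for _, term := range strings.Fields(strings.ToLower(p.Text)) {
			if index[term] == nil {
				index[term] = make(map[string]bool)
			}
			index[term][p.URL] = true
		}
	}
	return index
}

// InboundLinks counts how many distinct other pages link to each URL.
func InboundLinks(pages []Page) map[string]int {
	counts := make(map[string]int)
	for _, p := range pages {
		seen := make(map[string]bool)
		for _, l := range p.Links {
			if l != p.URL && !seen[l] {
				seen[l] = true
				counts[l]++
			}
		}
	}
	return counts
}

// Search returns the URLs containing every query term, most linked-to first.
func Search(query string, index map[string]map[string]bool, inbound map[string]int) []string {
	terms := strings.Fields(strings.ToLower(query))
	if len(terms) == 0 {
		return nil
	}
	var results []string
	for url := range index[terms[0]] {
		match := true
		for _, t := range terms[1:] {
			if !index[t][url] {
				match = false
				break
			}
		}
		if match {
			results = append(results, url)
		}
	}
	// Rank by inbound link count; break ties alphabetically for stable output.
	sort.Slice(results, func(i, j int) bool {
		if inbound[results[i]] != inbound[results[j]] {
			return inbound[results[i]] > inbound[results[j]]
		}
		return results[i] < results[j]
	})
	return results
}

func main() {
	// Made-up pages standing in for scraped content.
	pages := []Page{
		{URL: "https://a.example", Text: "go web scraping guide", Links: []string{"https://b.example"}},
		{URL: "https://b.example", Text: "web scraping with go", Links: []string{"https://a.example"}},
		{URL: "https://c.example", Text: "cooking recipes", Links: []string{"https://b.example"}},
	}
	fmt.Println(Search("go scraping", BuildIndex(pages), InboundLinks(pages)))
}
```

Splitting on whitespace and counting raw links keeps the sketch short; a real index would also need to handle punctuation, common words, and links that point to pages never collected.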
Googlebot is the most famous example of a web scraper used to build a search engine. It is the first step in building the search engine: it downloads each page on a website so that the page can be indexed and ranked. It also follows links to other websites, which is how it is able to index a substantial portion of the internet. According to Googlebot's documentation, the scraper generally avoids accessing any single site more than once every few seconds on average. Even at that pace, estimates put the total at well into the billions of pages per day!
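Following links from page to page is, at its core, a graph traversal. The sketch below shows one common way to do it, a breadth-first crawl with a visited set. It makes no claim about how Googlebot itself is implemented. The `LinkFetcher` type, the `Crawl` function, and the `example` URLs are hypothetical, and an in-memory map stands in for downloading and parsing real pages.

```go
package main

import "fmt"

// LinkFetcher returns the links found on a page. In a real crawler this would
// download and parse the page; here it is a hypothetical stand-in.
type LinkFetcher func(url string) []string

// Crawl visits pages breadth-first starting at seed, following links until
// maxPages pages have been visited. It returns the URLs in visiting order.
func Crawl(seed string, fetch LinkFetcher, maxPages int) []string {
	visited := map[string]bool{seed: true} // URLs already queued or visited
	queue := []string{seed}
	var order []string
	for len(queue) > 0 && len(order) < maxPages {
		url := queue[0]
		queue = queue[1:]
		order = append(order, url)
		for _, link := range fetch(url) {
			if !visited[link] {
				visited[link] = true
				queue = append(queue, link)
			}
		}
	}
	return order
}

func main() {
	// A made-up link graph used in place of the network.
	web := map[string][]string{
		"https://a.example": {"https://b.example", "https://c.example"},
		"https://b.example": {"https://a.example", "https://d.example"},
		"https://c.example": {"https://d.example"},
		"https://d.example": {},
	}
	fetch := func(url string) []string { return web[url] }
	for _, u := range Crawl("https://a.example", fetch, 10) {
		fmt.Println(u)
	}
}
```

The visited set is what keeps the crawl from looping forever when pages link back to each other, and the `maxPages` limit is a simple guard for a crawl that would otherwise never run out of links.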
If your goal is to build a search engine, albeit on a much smaller scale, you will find enough tools in this book to collect the information you need. This book will not, however, cover indexing and ranking pages to provide relevant search results.