- Python Web Scraping Cookbook
- Michael Heydt
- 2021-06-30 18:44:03
Loading data in unicode / UTF-8
A document's encoding tells an application how the characters in the document are represented as bytes in the file. Essentially, the encoding maps each character to a sequence of one or more bytes. ASCII defines only 128 characters as 7-bit codes, each normally stored in a single 8-bit byte. HTML files are often encoded with one byte per character, but with the globalization of the internet, this is not always the case. Many HTML documents use a multi-byte encoding, either with a fixed width of 16 bits or more per character, or with a variable width in which a character can take anywhere from one to four bytes.
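To make the byte counts concrete, the following minimal sketch encodes a few sample characters and reports how many bytes each one needs. The helper name `encoded_lengths` and the sample characters are illustrative assumptions, not part of the recipe; only the built-in `str.encode` is used.

```python
# Illustrative sketch: how many bytes one character takes in different encodings.
# `encoded_lengths` is a hypothetical helper written for this example.

def encoded_lengths(text):
    """Return the number of bytes needed to store `text` in UTF-8 and UTF-16."""
    # "utf-16-le" is used so that no byte-order mark is added to the count.
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le")}

for ch in ["A", "é", "€", "😀"]:
    print(repr(ch), encoded_lengths(ch))

# Plain ASCII cannot represent characters outside its 128-character range.
try:
    "é".encode("ascii")
except UnicodeEncodeError as err:
    print("ASCII cannot encode 'é':", err.reason)
```

The ASCII letter takes one byte in UTF-8 and two in UTF-16, while the euro sign takes three bytes in UTF-8 and two in UTF-16, which is why the byte length of a page rarely equals its character count.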
A particularly common form of HTML document encoding is UTF-8. This is the encoding that we will examine.
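As a rough illustration of why the encoding matters when loading a page, the sketch below decodes the same UTF-8 bytes twice: once correctly, and once with the wrong encoding. The HTML snippet and the `decode_html` helper are assumptions made up for this example; a real scraper would receive the bytes from an HTTP response.

```python
# Illustrative sketch: decoding raw bytes with the right and the wrong encoding.
# `decode_html` and the sample markup are hypothetical, for demonstration only.

def decode_html(raw_bytes, encoding="utf-8"):
    """Decode raw page bytes into a str using the given encoding."""
    return raw_bytes.decode(encoding)

# Stand-in for the bytes a server would send for a UTF-8 encoded page.
raw = "<p>Café costs 3 €</p>".encode("utf-8")

text = decode_html(raw)                 # correct: the bytes are UTF-8
mojibake = decode_html(raw, "latin-1")  # wrong: each byte becomes one character

print(len(raw), "bytes,", len(text), "characters")
print(text)
print(repr(mojibake))
```

Decoding with the wrong encoding does not raise an error here; it silently produces garbled characters, so it is worth checking which encoding a page declares before decoding it.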