
How it works...

We begin by importing the PyGitHub library in Step 1 so that we can conveniently call the GitHub APIs, which allow us to scrape and explore the universe of repositories. We also import the base64 module to decode the base64-encoded files that we will be downloading from GitHub. Note that GitHub imposes a rate limit on the number of API calls a generic user can make; for this reason, if you attempt to download too many files in a short period, your script may not retrieve all of them. In Step 2, we supply our credentials to GitHub and specify that we are looking for repositories written in JavaScript, using the query='language:javascript' search query. We then enumerate the repositories matching this criterion, search each one for files ending in .js, and create local copies of those files (Steps 3 to 6). Since these files are base64-encoded, we make sure to decode them to plaintext in Step 7. Finally, we show you how to adjust the script to scrape other file types, such as Python and PowerShell (Step 8).
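The following is a minimal sketch of this workflow using PyGitHub, not the recipe's exact code; the token placeholder, the repository cap, and the output filename scheme are illustrative assumptions:

```python
from github import Github
import base64

# Hypothetical placeholder -- substitute a real personal access token (Step 2).
ACCESS_TOKEN = "<insert your token here>"
g = Github(ACCESS_TOKEN)

# Search for repositories whose primary language is JavaScript (Step 2).
repositories = g.search_repositories(query="language:javascript")

n_repos = 5  # assumed cap to help stay within the API rate limit
for repo in repositories[:n_repos]:
    # Walk the repository tree, collecting files that end in .js (Steps 3 to 6).
    contents = repo.get_contents("")
    while contents:
        item = contents.pop(0)
        if item.type == "dir":
            contents.extend(repo.get_contents(item.path))
        elif item.path.endswith(".js") and item.content:
            # GitHub returns file contents base64-encoded; decode back to
            # plaintext before writing a local copy (Step 7).
            data = base64.b64decode(item.content)
            with open(item.path.replace("/", "_"), "wb") as f:
                f.write(data)
```

Swapping the search query for language:python and the .js suffix check for .py (or language:powershell and .ps1) adapts the same loop to other file types, as described in Step 8.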
