How it works...

We start by importing the PyGitHub library in Step 1 so that we can conveniently call the GitHub APIs. These allow us to scrape and explore the universe of repositories. We also import the base64 module for decoding the base64-encoded files that we will be downloading from GitHub. Note that GitHub imposes a rate limit on the number of API calls an ordinary user can make. For this reason, if you attempt to download too many files in a short period, your script may not retrieve all of them. Our next step is to supply our credentials to GitHub (Step 2) and to specify that we are looking for repositories written in JavaScript, using the query='language:javascript' command. We enumerate the repositories matching our JavaScript criterion, search each one for files ending in .js, and create local copies of those files (Steps 3 to 6). Since these files are encoded in base64, we make sure to decode them to plaintext in Step 7. Finally, we show you how to adjust the script to scrape other file types, such as Python and PowerShell (Step 8).