
Downloading a page for offline analysis with Wget

Wget is part of the GNU project and is included in most major Linux distributions, including Kali Linux. It can recursively download a website for offline browsing, converting links and fetching non-HTML files along the way.

In this recipe, we will use Wget to download pages that are associated with an application in our vulnerable_vm.

Getting ready

All recipes in this chapter require vulnerable_vm to be running. In the particular scenario of this book, it has the IP address 192.168.56.102.

How to do it...

  1. Let's make the first attempt to download the page by calling Wget with a URL as the only parameter:
    wget http://192.168.56.102/bodgeit/
    

    Wget downloads only the index.html file, which is the start page of the application, and saves it to the current directory.

  2. We need some options to tell Wget to save the downloaded files to a specific directory and to copy all the files reachable from the URL that we pass as the parameter. Let's first create a directory to save the files:
    mkdir bodgeit_offline
    
  3. Now, we will recursively download all files in the application and save them in the corresponding directory:
    wget -r -P bodgeit_offline/ http://192.168.56.102/bodgeit/
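
With -r and -P, Wget normally places the files under a subdirectory named after the host, so the pages should end up in bodgeit_offline/192.168.56.102/bodgeit/. A minimal sketch for checking what was saved follows; list_mirror is a hypothetical helper name, not part of Wget:

```shell
# list_mirror: hypothetical helper that lists every file saved under a
# Wget directory prefix. $1 is the prefix (defaults to bodgeit_offline).
list_mirror() {
    if [ ! -d "${1:-bodgeit_offline}" ]; then
        echo "No mirror found at ${1:-bodgeit_offline}" >&2
        return 1
    fi
    # Print the downloaded files in a stable, sorted order.
    find "${1:-bodgeit_offline}" -type f | sort
}

# Usage, once step 3 has finished:
#   list_mirror bodgeit_offline | head
```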
    

How it works...

As mentioned earlier, Wget is a tool created to download HTTP content. With the -r parameter we made it act recursively, that is, follow all the links in every page it downloads and download those pages too. The -P option sets the directory prefix, which is the directory under which Wget saves the downloaded content; by default, it is the current directory.
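
The same command can also be written with Wget's long option names, which can be easier to read in scripts. The sketch below wraps it in a hypothetical mirror_site function; the WGET variable is an assumed hook of this example, not a Wget feature, that lets us substitute echo for a dry run:

```shell
# mirror_site: hypothetical wrapper around the command from step 3.
#   $1 = URL to start from
#   $2 = directory prefix where the files are saved
# WGET is an assumed override hook; set WGET=echo to print the arguments
# instead of downloading anything.
mirror_site() {
    # --recursive is the long form of -r, --directory-prefix of -P.
    ${WGET:-wget} --recursive --directory-prefix="$2" "$1"
}

# Dry run, then the real download against the lab VM:
#   WGET=echo mirror_site http://192.168.56.102/bodgeit/ bodgeit_offline
#   mirror_site http://192.168.56.102/bodgeit/ bodgeit_offline
```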

There's more...

There are some other useful options to be considered when using Wget:

  • -l: When downloading recursively, it may be necessary to limit how deep Wget goes when following links. This option, followed by the number of levels, sets that limit; the default maximum depth is 5.
  • -k: After files are downloaded, Wget modifies all the links to make them point to the corresponding local files, thus making it possible to browse the site locally.
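  • -p: This option makes Wget download all the files needed to display the page properly, such as images and stylesheets. Requisites hosted on other sites are fetched only if host spanning is also enabled with -H.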
  • -w: This option makes Wget wait the specified number of seconds between one download and the next. It's useful when the server has a mechanism to block automated browsing.
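
These options can be combined with the recursive download from the recipe. The following is only a sketch: mirror_site_limited is a hypothetical function name, the WGET variable is an assumed dry-run hook of this example, and the default depth of 2 and wait of 1 second are arbitrary values chosen for illustration:

```shell
# mirror_site_limited: hypothetical wrapper combining the options above.
#   $1 = URL to start from
#   $2 = directory prefix where the files are saved
#   $3 = recursion depth for -l (assumed default: 2)
#   $4 = seconds to wait between downloads for -w (assumed default: 1)
# WGET is an assumed override hook; set WGET=echo for a dry run.
mirror_site_limited() {
    # -r recurse, -l limit depth, -k convert links for local browsing,
    # -p fetch page requisites, -w pause between requests, -P set prefix.
    ${WGET:-wget} -r -l "${3:-2}" -k -p -w "${4:-1}" -P "$2" "$1"
}

# Usage against the lab VM, three levels deep with a 2-second pause:
#   mirror_site_limited http://192.168.56.102/bodgeit/ bodgeit_offline 3 2
```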