Internet Protocol (IP)
Internet Protocol (IP) was originally designed to operate on top of Version 2 Ethernet. The Compendium has a separate section to discuss Ethernet. Various components of the IP protocol family were differentiated by Ethertype number. IP is assigned Ethertype 0800 hex.
When the IEEE developed the 802.3 standards for Ethernet they, essentially, replaced the Ethertype number with a Service Access Point identifier. It was necessary to include an option for embedding the original Ethertype inside a newer 802.3 frame in order to allow access to an IP Subnet. This is why there is a Sub-Network Access Protocol (SNAP) header in most IP frames that aren’t using Version 2 Ethernet.
IP operates at OSI Layer 3 and provides the routing function in an IP network. Each communicating device is assigned an IP address. The address identifies the network (which may be divided into sub-networks) and the host. The term “host” refers to any communicating device in an IP network. Originally the term referred to a central host computer. Today it includes any PC, printer, gateway, file server, or other device that has an IP address and talks on an IP network.
The discussion of IP begins with a description of the addressing scheme, progresses through the routing function, and then expands on the addressing concepts used to create sub-networks. Troubleshooting IP is the process of troubleshooting routing on the network.
This topic describes the binary nature of the IP address and the structure of the address fields.
IP Address Construction: Dotted Decimal Notation
The Layer 3 address convention in the TCP/IP world uses a 32 bit binary number to logically identify each node on the network. The communicating nodes are referred to as hosts and the 32 bit binary address number is the IP Address. This 32 bit number is represented by breaking the 32 bits into four groups of eight bits each and representing each eight bit byte with the decimal value equivalent of the binary number.
This is referred to as Dotted-Decimal Notation or as an Octet String. For example, assume a station is assigned the following address:
10000010000001000010110000000001
To represent this address in Dotted-Decimal Notation, the address is first broken up into four bytes, as follows:
10000010 00000100 00101100 00000001
Next, each byte is converted from binary to decimal, as follows:
10000010 = 130
00000100 = 4
00101100 = 44
00000001 = 1
The result is written in Dotted-Decimal Notation as:
130.4.44.1
Although the representation of the address consists of four decimal numbers separated by dots, the underlying meaning of the address can only be understood by evaluating the 32-bit binary number. This 32-bit number is used to identify each communicating device on the network.
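The conversion steps above can be sketched in a few lines of Python. This is only an illustration; the function names are hypothetical and are not part of any IP software.

```python
def binary_to_dotted_decimal(bits: str) -> str:
    """Convert 32 binary digits (spaces allowed) to dotted-decimal notation."""
    bits = bits.replace(" ", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 32 binary digits")
    # Break the 32 bits into four groups of eight bits each.
    octets = [bits[i:i + 8] for i in range(0, 32, 8)]
    # Represent each eight-bit byte by its decimal value.
    return ".".join(str(int(octet, 2)) for octet in octets)


def dotted_decimal_to_binary(address: str) -> str:
    """Convert dotted-decimal notation back to four space-separated bytes."""
    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError("expected four decimal numbers separated by dots")
    return " ".join(format(int(octet), "08b") for octet in octets)


print(binary_to_dotted_decimal("10000010 00000100 00101100 00000001"))
print(dotted_decimal_to_binary("130.4.44.1"))
```

Running the sketch prints the dotted-decimal form of the example address and then converts it back to the four binary bytes.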
Flat Networks Versus Hierarchical Networks
In general there are two fundamental design relationships that can be identified in the construction of a network infrastructure. We can call these flat networks versus hierarchical networks. In a flat network every device is directly reachable by every other device. In a hierarchical network the world is divided into separate locations and devices are assigned to a specific location. The advantage of the hierarchical design is that the devices that interconnect the parts of the communications infrastructure need only know how to reach the intended destination location without having to keep track of the individual devices at each location. This device is a router. It makes a forwarding decision by looking at that part of the station address that identifies the location where the station resides.
All addressing in hierarchical networks may be considered to have two distinct parts. We might refer to these two parts as a locator portion and a node portion. The locator portion identifies the location at which the node resides. This “locator & node” address operates at Layer 3 of the OSI model, the Network Layer. Different vendors use different terms to refer to the locator and node portions of the address. For example, Novell NetWare calls them the “network” and “node” number. DECNet refers to the “area” and “node” number. TCP/IP uses two different terms to refer to the locator portion of the IP address. In a very fundamental sense these two terms are synonymous. They are the “network” and the “subnetwork”. While it is true that the concept being expressed in IP routing is that of a “network” which is sub-divided into smaller locations referred to as “subnetworks” it must be remembered that a “subnetwork” may be further subdivided into smaller locations which would each also be referred to as a “subnetwork”. The difference is mainly semantic in nature. Which of the following descriptions would you like to adopt?
- The world is divided into separate networks, interconnected with routers.
- A network is divided into subnetworks, interconnected with routers.
- A subnetwork is divided into smaller subnetworks, interconnected with routers.
Whichever description you choose to represent the interconnection between locations the basic properties of connectivity remain identical. A predefined location is connected to another predefined location using the decision making capabilities of a router. The terminology only helps to confuse the innocent (and that’s us!).
In the IP world there is a configuration parameter that defines which bits in a Layer 3 address are used to differentiate between the locator portion and the node portion. This parameter is called the “address mask” (also called the “Subnet Mask”). We’ll talk in-depth about the address mask under the SUBNET MASK topic (this is the Next Topic), but the simple explanation is that for each ‘1’ bit in the mask, the corresponding bit in the address is considered to be part of the locator portion. Whether you refer to this as a subnet or a “Class A” or “Class B” network; it makes no difference. (The ADDRESS CLASSES topic describes network classes.) To understand IP addressing you must understand this fundamental rule, again:
For each ‘1’ bit in the mask, the corresponding bit in the address is considered to be part of the locator portion of the address. The remaining bits are considered to identify a specific node at that location.
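The rule can be illustrated with a short Python sketch that splits an address into its locator and node portions. The function name is hypothetical; the example address 10.2.3.4 with mask 255.0.0.0 is chosen only for illustration.

```python
import ipaddress
from typing import Tuple


def split_locator_and_node(address: str, mask: str) -> Tuple[str, str]:
    """Return (locator portion, node portion) of an IPv4 address."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    # Bits under a '1' in the mask belong to the locator portion.
    locator = addr_bits & mask_bits
    # The remaining bits identify the node at that location.
    node = addr_bits & ~mask_bits & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(locator)), str(ipaddress.IPv4Address(node))


print(split_locator_and_node("10.2.3.4", "255.0.0.0"))
print(split_locator_and_node("10.2.3.4", "255.255.0.0"))
```

The same address yields a different locator and node when the mask changes, which is the whole point of making the mask a configuration parameter.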
IP Routing Functions
This topic describes the way IP uses a routing table to make forwarding decisions.
Understanding Routing with IP
To minimize unnecessary traffic load and to provide efficient movement of frames from one location to another, the interconnected hosts are grouped into separate networks. As a result of this grouping (which is determined by the network designer and administrator) it is possible for an interconnect device to determine the best path between two networks.
This interconnect device, by definition, is called a router. A router, operating at Layer 3, the Network Layer, forms the boundary between one network and another network. When a frame crosses a router it is in a different network. A frame that travels from source to destination without crossing a router has remained in the same network. A network is a group of communicating machines bounded by routers.
The router will use some of the bits in the IP address to identify the network location to which the frame is destined. The remaining bits in the address will uniquely identify the host on that network that will ultimately receive the frame.
It is necessary to differentiate between the bits used to identify the network and those used to identify the host. The sender of a frame must make this differentiation because it must decide whether it is on the same network as the destination or on a different network. If the sender is on the same network as the destination, it will determine the data link address of the destination machine. Then it will send the frame directly to the destination machine.
On the other hand, if the destination is on a different network, then the originator must send the frame to a router and let the router forward the frame on to the ultimate destination network. At the ultimate destination network the last router must determine the data link address of the host and forward the frame directly to that host on that ultimate destination network.
When a router receives an incoming data frame it masks the destination address to create a lookup key that is compared to the entries in its routing table. The routing table indicates how the frame should be processed.
The frame might be delivered directly on a particular port on the router. The frame might have to be sent on to the next router in line for ultimate delivery to some remote network. The routing table contains this information. The routing table is built through direct configuration by the administrator, dynamically through the exchange of router update frames, or by a combination of the two. Routers running protocols like RIP (Routing Information Protocol), OSPF (Open Shortest Path First), and Cisco’s IGRP (Interior Gateway Routing Protocol) send these updates to one another. As a result, all routers become aware of how to reach all other networks.
Consider this example:
A network consists of three routers with four segments
The routers use the mask value 255.255.0.0. The routing table in the router between 126.96.36.199 and 188.8.131.52 says:
|Result of Mask||Do this with the frame|
|184.108.40.206||ARP for destination on Port #1 and deliver directly|
|220.127.116.11||ARP for destination on Port #2 and deliver directly|
|18.104.22.168||Forward frame to next router: 22.214.171.124|
|126.96.36.199||Forward frame to next router: 188.8.131.52|
Study this example to understand the relationship between the routing table and the physical network. Assume you are an end-node attached to Network 184.108.40.206. If you want to send a frame to a destination on Network 220.127.116.11 you will send the frame to the router. The router knows to deliver the frame directly on the appropriate port. If the frame were destined to network 18.104.22.168 then the router would know to forward the frame to the next router in line; in this example it is identified as 22.214.171.124. 126.96.36.199 would then deliver the frame directly on its own port to network 188.8.131.52.
Because of the masking process the router knows that some station, perhaps 184.108.40.206, is on the same network as 220.127.116.11. The masking of both these addresses produces 18.104.22.168 which is looked up in the table. All frames destined to 22.214.171.124 are sent correctly. The router doesn’t have to keep track of all end-nodes individually. Because the end-nodes are grouped together into networks the router can process the frames without the need for a massive list of all stations in the known universe.
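The mask-then-lookup behavior can be sketched as follows. The addresses, port numbers, and table entries here are hypothetical values chosen for illustration, not the ones in the example above, and a real router table is considerably richer than a dictionary.

```python
import ipaddress

# Mask used by the router in this sketch (an assumption, matching 255.255.0.0).
MASK = int(ipaddress.IPv4Address("255.255.0.0"))

# Hypothetical routing table: result of mask -> what to do with the frame.
ROUTING_TABLE = {
    "172.16.0.0": "ARP for destination on port 1 and deliver directly",
    "172.17.0.0": "ARP for destination on port 2 and deliver directly",
    "172.18.0.0": "forward frame to next router 172.17.0.2",
    "172.19.0.0": "forward frame to next router 172.17.0.3",
}


def route(destination: str) -> str:
    """Mask the destination address and use the result as the lookup key."""
    masked = int(ipaddress.IPv4Address(destination)) & MASK
    key = str(ipaddress.IPv4Address(masked))
    return ROUTING_TABLE.get(key, "no matching entry")


print(route("172.16.5.5"))
print(route("172.18.44.9"))
```

Note that the table holds one entry per network, not one per station: every destination whose masked value is 172.18.0.0 takes the same path.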
Additional Information about IP Routing
RFC 1812: The specific behavior that is expected from an IP router is discussed in RFC 1812. This is a somewhat lengthy document but it does provide a complete discussion of routing in the IP Version 4 network environment.
This topic explains the way IP addresses use bit fields to represent logical divisions in the network, called SubNetworks. Bits are assigned to identify the network portion of the address, the subnet portion, and the remainder are the host portion.
…Also Called “Address Masking”
The IP Address Mask is a configuration parameter used by a TCP/IP end-node and IP router to differentiate between that part of the IP address that represents the network and the part that represents the host.
A router uses the mask value to create a key value that is looked up in the routing table to determine where to forward a frame. An end-node uses the mask value to create the same key value, but it uses the value to compare the destination address with its own address to determine whether the destination is directly reachable (on the same network) or remote (in which case the frame must be sent to a router and cannot be sent directly to the destination).
The mask value can be assigned by default or it can be specified by the installer of the end-node or router software. The destination IP address and the mask value are combined with a Boolean AND operation to produce the resultant key value. Consider the following example:
An end-node is assigned the IP address 140.6.15.3 and a mask value of 255.255.0.0. This end-node wants to send a frame to 140.7.9.2. If 140.7.9.2 is on the same network as 140.6.15.3 then the end-node will broadcast an ARP (Address Resolution Protocol) frame to determine the data link address of the destination and it will then send the frame directly to the destination. If 140.7.9.2 is on a different network then the workstation must send the frame to a router for forwarding to the ultimate destination network.
All the dotted-decimal notation must be converted to the underlying 32-bit binary numbers to understand what is taking place.
|End-Node 140.6.15.3||= 10001100||00000110||00001111||00000011|
|Mask 255.255.0.0||= 11111111||11111111||00000000||00000000|
|Destination 140.7.9.2||= 10001100||00000111||00001001||00000010|
When the End-Node IP address is AND’ed with the mask the result is:
|End-Node 140.6.15.3||= 10001100||00000110||00001111||00000011|
|Mask 255.255.0.0||= 11111111||11111111||00000000||00000000|
|RESULT OF “AND”||= 10001100||00000110||00000000||00000000|
|(In dotted-decimal)||= 140.||6.||0.||0|
When the destination IP address is AND’ed with the mask the result is:
|Destination 140.7.9.2||= 10001100||00000111||00001001||00000010|
|Mask 255.255.0.0||= 11111111||11111111||00000000||00000000|
|RESULT OF “AND”||= 10001100||00000111||00000000||00000000|
|(In dotted-decimal)||= 140.||7.||0.||0|
Since the results (140.6.0.0 and 140.7.0.0) are not equal the end-node concludes that the destination must be on a different network and the frame is sent to a router. This is the way an end-node uses the mask value. A router, on the other hand, masks the destination address in an incoming frame and the result is used as a lookup key in the routing table.
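The end-node's decision can be sketched in Python. The binary rows in the tables above correspond to 140.6.15.3 for the end-node and 140.7.9.2 for the destination; the function names are hypothetical.

```python
import ipaddress


def and_mask(address: str, mask: str) -> str:
    """Boolean AND of an address with a mask, returned in dotted decimal."""
    result = int(ipaddress.IPv4Address(address)) & int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(result))


def is_directly_reachable(source: str, destination: str, mask: str) -> bool:
    """True if both addresses produce the same key under the mask."""
    return and_mask(source, mask) == and_mask(destination, mask)


mask = "255.255.0.0"
print(and_mask("140.6.15.3", mask))
print(and_mask("140.7.9.2", mask))
if is_directly_reachable("140.6.15.3", "140.7.9.2", mask):
    print("same network: ARP for the destination and deliver directly")
else:
    print("different network: send the frame to a router")
```

With this mask the two keys differ, so the sketch takes the "send the frame to a router" branch, in agreement with the worked example.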
This topic describes the fundamental “classes” of IP addresses; major address divisions defined by the standards. Three standard address masks (subnet masks) are used to differentiate the address classes.
Assignment of Address Classes in the IP Version Four (IPV4) Specifications
Origins of Internet Protocol Version Four Addresses
The origins of the current implementation of the Internet Protocol (IP Version 4 or IPV4) and its associated classes of IP addressing can be traced to RFC 791: Internet Protocol (September 1981). As originally envisioned, these IP addresses were to be of fixed, 32-bit (4 octets) length, consisting of a Network Number and a Local Address or Host Number. The resulting range of addresses was then divided into three broad groupings or “Classes”, each based upon the bit values within the first octet:
Class A – high order bit is “0”, the remaining 7 bits are the network, and the last 24 bits are the host
Class B – high order two bits are “10”, the remaining 14 bits are the network, and the last 16 bits are the host
Class C – high order three bits are “110”, the remaining 21 bits are the network, and the last 8 bits are the host
*Note: There are two additional classes of IPV4 addressing, known as Class D and Class E, that were specified in subsequent RFCs. These additional classes were intended for highly specialized functions and are identified as follows:
Class D – high order four bits are “1110”, the remaining 28 bits identify the Multicast group
Class E – high order four bits are “1111”, the remaining 28 bits are reserved for experimental use
These two specialized classes of address are outside the scope of this article and will be addressed in a later Technical Compendium article.
The Role of Masking in Determining Address Class
Implied within RFC 791 was the concept of “Masking”, to be used by Routers and Hosts. The masks were defined as follows:
- Class A mask = 255.0.0.0
- Class B mask = 255.255.0.0
- Class C mask = 255.255.255.0
*Note: These three original masks are often referred to as “Class A, B, C” or “Default” subnet masks. Details regarding the roles of subnet masking are covered in the Technical Compendium articles “IP Address Construction: Dotted Decimal Notation, Subnet Masking, Creating Subnets, Special Subnet Masks and VLSM – Variable Length Subnet Masking”.
These masks were applied by default based on the value of the leading bits in the IP address. If an address started with a binary 0, then stations assumed Class A masking. The starting bits 10 indicated Class B, and 110 indicated Class C. Consequently, the class of address masking being used could be determined by looking at the first octet in the address as shown below:
- Class A starts with 0 and ends with 0111 1111, hence the smallest value in the first octet is decimal 0 and the largest value is 127 yielding a potential range of 0-127.
- Class B starts with 10 and ends with 1011 1111, hence the smallest value in the first octet is decimal 128 and the largest value is 191 yielding a potential range of 128-191.
- Class C starts with 110 and ends with 1101 1111, hence the smallest value in the first octet is decimal 192 and the largest value is 223 yielding a potential range of 192-223.
- Class D (Reserved Multicast) starts with 1110 and ends with 1110 1111, hence the smallest value in the first octet is decimal 224 and the largest value is 239 yielding a potential range of 224-239.
- Class E (reserved Experimental) starts with 1111 and ends with 1111 1111, hence the smallest value in the first octet is decimal 240 and the largest value is 255 yielding a potential range of 240-255.
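The leading-bit tests in the list above can be sketched in Python. The function and dictionary names are hypothetical; the sketch simply checks the high-order bits of the first octet in the order described.

```python
# Default masks for the three classes that have one.
DEFAULT_MASKS = {"A": "255.0.0.0", "B": "255.255.0.0", "C": "255.255.255.0"}


def address_class(address: str) -> str:
    """Return the class letter implied by the leading bits of the first octet."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError("first octet out of range")
    if (first_octet & 0b10000000) == 0b00000000:
        return "A"  # leading bit 0, first octet 0-127
    if (first_octet & 0b11000000) == 0b10000000:
        return "B"  # leading bits 10, first octet 128-191
    if (first_octet & 0b11100000) == 0b11000000:
        return "C"  # leading bits 110, first octet 192-223
    if (first_octet & 0b11110000) == 0b11100000:
        return "D"  # leading bits 1110, first octet 224-239
    return "E"      # leading bits 1111, first octet 240-255


for example in ("10.2.3.4", "128.1.1.1", "192.0.2.1", "224.0.0.1", "240.0.0.1"):
    cls = address_class(example)
    print(example, "is Class", cls, "default mask:", DEFAULT_MASKS.get(cls, "none"))
```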
Some practical Examples of IP addressing –
- A station with address 10.2.3.4 would be an example of a Class A address. The default mask for this station would be 255.0.0.0. The network identifier would be 10.0.0.0 (after applying the mask to the address) and the Host identifier would therefore be 2.3.4.
- A station with address 22.214.171.124 would be an example of a Class B address. The default mask for this station would be 255.255.0.0. The network identifier would be 126.96.36.199 (after applying the mask to the address) and the Host identifier would therefore be 15.3.
- A station with address 188.8.131.52 would be an example of a Class C address. The default mask for this station would be 255.255.255.0. The network identifier would be 184.108.40.206 (after applying the mask to the address) and the Host identifier would therefore be 8.
Determining The Number of Networks and Hosts in Each Class
Notice that this division of addressing into these three classes allows for the following potential number of addresses:
- Class A
  - 8 bits in the Network part, 24 bits in the Host part
  - 2^8 or 256 possible values in the Network part and 2^24 or 16777216 possible values in the Host part
- Class B
  - 16 bits in the Network part, 16 bits in the Host part
  - 2^16 or 65536 possible values in the Network part and 2^16 or 65536 possible values in the Host part
- Class C
  - 24 bits in the Network part, 8 bits in the Host part
  - 2^24 or 16777216 possible values in the Network part and 2^8 or 256 possible values in the Host part
(Now, before you do the math on your own, let’s work through the actual number of networks and hosts or nodes in each address class. First of all, realize that the number of values in a field is determined by raising the number 2 to the exponential power determined by the number of bits in the field. Consequently we find:
2, raised to the 24th power: 2^24 = 16777216
2, raised to the 16th power: 2^16 = 65536
2, raised to the 8th power: 2^8 = 256)
*Note: The above values represent a maximum theoretical number of Network and Host addresses. As we are about to examine, there are a number of factors that directly affect the true number of available addresses.
Factors Affecting Available Address Values
As previously mentioned, there are a number of factors that have an impact upon the number of Network and Host addresses actually available within each address class:
1. Overlapping Bit Values: Foremost among these is the simple fact that bit values used for one class may NOT be used for the subsequent class(es):
- Recall that for a Class A Network address, the Network identifier is 8 bits, however the first bit must always be zero, so that leaves only 7 bits to differentiate, so there are 2^7=128 possible class A Networks.
- For a Class B Network address, the Network identifier is 16 bits, but the first two must be 10, so that leaves only 14 bits to differentiate, so there are 2^14=16384 possible Class B Networks.
- For a Class C Network address, the Network Identifier is 24 bits, however, the first three must be 110, leaving only 21 bits to differentiate, so there are 2^21= 2,097,152 possible Class C Network addresses.
2. Reserved Broadcast Numbers: Additionally, there are two values in each set that will not be available. When the network or node portion of an IP address is set to all “1”s (i.e., decimal 255 or hex FF) it indicates that this is a broadcast destination. There is also an older form of the broadcast address, from the UNIX environment, that used all “0”s. Although this form is considered obsolete it still may exist in some sites.
*Note: If you encounter a station that is still using zeros as a broadcast address you should reconfigure it (after you confirm that it really isn’t being used by some alien system!).
Also, within the UNIX environment the configuration of a station includes specification of the correct broadcast address to use. A UNIX administrator might type “ifconfig broadcast 220.127.116.11” to specify the broadcast address for station 18.104.22.168. Other environments may allow only the use of the default values or may offer other ways of configuring the broadcast address.
Therefore, for each range of values, you must subtract 2 to arrive at the number of potential values that are available.
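The arithmetic in the two factors above can be sketched as follows. The sketch applies the text's rule literally (take the bits left after the fixed leading bits, then subtract 2 from each range); the names used are hypothetical.

```python
from typing import Tuple

# class letter -> (network bits left after the fixed leading bits, host bits)
CLASS_LAYOUT = {
    "A": (7, 24),   # 8 network bits, leading bit fixed at 0
    "B": (14, 16),  # 16 network bits, leading bits fixed at 10
    "C": (21, 8),   # 24 network bits, leading bits fixed at 110
}


def usable_counts(free_network_bits: int, host_bits: int) -> Tuple[int, int]:
    """Return (networks, hosts per network) after subtracting the 2 reserved values."""
    networks = 2 ** free_network_bits - 2
    hosts = 2 ** host_bits - 2
    return networks, hosts


for cls, (net_bits, host_bits) in CLASS_LAYOUT.items():
    networks, hosts = usable_counts(net_bits, host_bits)
    print(f"Class {cls}: {networks:,} networks, {hosts:,} hosts per network")
```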
Summary of Actual Network and Host Values
All of this math can get more than a bit confusing, so the following chart summarizes the results of all of the mathematical manipulations and lists the available Network and Host addresses for each of the IPV4 Classes of Addressing:
|Address Class||First Octet Range||First Octet in Binary||Implied Mask||Networks Possible||Hosts Possible per Network|
|Class A||0 – 127||0000 0000 – 0111 1111||255.XXX.XXX.XXX||126 (0 and 127 reserved)||16,777,214|
|Class B||128 – 191||1000 0000 – 1011 1111||255.255.XXX.XXX||16,382||65,534|
|Class C||192 – 223||1100 0000 – 1101 1111||255.255.255.XXX||2,097,150||254 (0 and 255 reserved)|
|Class D||224 – 239||1110 0000 – 1110 1111||Reserved for Network Multicast|
|Class E||240 – 255||1111 0000 – 1111 1111||Reserved for Experimental Use|
Additional Related Topics:
There are a number of topics that are directly related to the material contained within this article. The following is a partial listing of some related topics:
1. Technical Compendium articles:
- IP Address Construction: Dotted Decimal Notation
- Subnet Masking
- Creating Subnets
- Special Subnet Masks
- VLSM – Variable Length Subnet Masking
- Reserved Address List
2. Requests For Comments (RFCs):
- RFC 791: Internet Protocol (September 1981)
- RFC 917: Internet Subnets (October 1984)
- RFC 940: Toward an Internet Standard Scheme for Subnetting (April 1985)
- RFC 950: Internet Standard Subnetting Procedure (Aug 1985)
- RFC 1219: On the Assignment of Subnet Numbers (April 1991)
Creating Subnets
This topic details the mechanism whereby IP addresses represent network, subnetwork, and host by using bit fields. The process of configuring these bit fields (the ‘subnet mask’) is described.
Extending the Use of Address Masking
The original conception for three default masks defined by the leading bits in the address field was extended to allow installers and administrators to specify any mask value they wanted to use. In this way the restrictions placed on the addresses by the original masking were removed. The requirement remained, however, that the original masking be honored. It could be extended but not shortened. So, suppose that a site was assigned the address 22.214.171.124. This would imply that all stations at the site were on the same network. The default masking forced the use of the first two octets to identify the network and the remaining octets identify the hosts. If this site wanted to subdivide into separate networks, however, there was no facility in the original scheme to allow the address to reflect anything other than one level of hierarchical division; the world is divided into networks – end of story.
Because the mask was now a configurable parameter, the Class B network 126.96.36.199 could use a mask of 255.255.255.0, like a Class C network. The rest of the universe would see all stations at this site as being part of network 188.8.131.52 but within the site the routers would see the world as being divided on the basis of the first three octets. These are referred to as subnetworks. The site would be divided into subnetworks 184.108.40.206, 220.127.116.11, up to 18.104.22.168.
The extension of the original default address masks to allow any desired mask parameter assumes compliance with the original class masks. That is, a Class A network must have a mask of at least 255.0.0.0, a Class B network must have a mask of at least 255.255.0.0, and a Class C network must have at least 255.255.255.0. A Class B network could, for example, have a mask of 255.255.255.0 or any other mask as long as the first two octets are included in the mask; to remain compliant with the original mask definitions.
The extension of the address mask allows networks to be further subdivided into sub networks. For example, the Class B network 22.214.171.124 could be subnetted with 255.255.255.0 as a mask. This would create subnets like 126.96.36.199, 188.8.131.52,… up to 184.108.40.206.
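Python's standard `ipaddress` module can enumerate the subnetworks produced by extending a mask. The Class B network 172.16.0.0 used here is a hypothetical example chosen for illustration, and the helper name is an assumption. The sketch lists every value of the subnet field, including the all-zeros and all-ones values that older rules reserve.

```python
import ipaddress
from typing import List


def list_subnets(network: str, default_mask: str, extended_mask: str) -> List[str]:
    """List the subnetworks created when default_mask is extended to extended_mask."""
    net = ipaddress.ip_network(f"{network}/{default_mask}")
    # Convert the extended dotted-decimal mask to a count of leading '1' bits.
    new_prefix = ipaddress.ip_network(f"0.0.0.0/{extended_mask}").prefixlen
    if new_prefix < net.prefixlen:
        raise ValueError("the mask may be extended but not shortened")
    return [str(subnet.network_address) for subnet in net.subnets(new_prefix=new_prefix)]


subnets = list_subnets("172.16.0.0", "255.255.0.0", "255.255.255.0")
print(len(subnets), "subnetworks")
print(subnets[0], subnets[1], "...", subnets[-1])
```

The check on the prefix length mirrors the rule stated above: the original class mask can be extended but not shortened.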
Special Address Masks
This topic talks about extending the idea of address classes to create ‘non-standard’ masks. The masks are ‘non-standard’ because they go beyond the original specification for address class. These types of masks are in common use today; hence, the ‘non-standard’, special address masks are, in fact, very ‘standard’ and typical.
Address Masks That Don’t End on an Eight-Bit Boundary
If we recall that a mask of 255.255.255.0 actually represents a binary number then we can easily understand how the mask value need not end on an eight-bit boundary. For example, consider a Class B network 220.127.116.11. In this network I want to have up to 1024 hosts per subnetwork. This means I will need 10 bits to represent the host portion of the address (since 2 raised to the 10th power = 1024). I need 10 “0” bits in the mask to represent the host portion, the remainder will be split between the network and subnetwork identifiers. Since Class B uses the first 16 bits to identify the network this leaves 6 bits to represent the subnetwork. I can have up to 64 different values in the subnet field and up to 1024 values in the host field. The address mask looks like this:
11111111 11111111 11111100 00000000
There are three fields that are defined in this mask. Because we are applying this to a Class B network the first 16 bits represent the network. The next 6 bits represent the subnetwork, and the last 10 bits represent the host.
When each octet is converted to decimal the mask value becomes:
255.255.252.0
Any number of bits can be used to identify the subnetwork. Some other examples (in Class B) include:
- 255.255.255.128 – Nine bits for the subnet, seven bits for the host
- 255.255.255.192 – Ten bits for the subnet, six bits for the host
The exact same scheme applies to Class A but only the first eight bits are included in the network part. This allows many more combinations for subnetting. Some examples in Class A include:
- 255.255.0.0 – (Typical) Eight bits for the subnet, sixteen bits for the host.
- 255.252.0.0 – Six bits for the subnet, eighteen bits for the host.
- 255.255.252.0 – Fourteen bits for the subnet, ten bits for the host.
When considering the subnetting in a network it is necessary to convert the dotted-decimal octet strings for the addresses and the mask back into binary to compare the fields and evaluate the results.
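The bit counting in this topic can be sketched in Python. The function names are hypothetical: one builds a mask from a desired number of host bits, the other reports how many subnet bits a mask leaves once the class's network bits are set aside.

```python
import ipaddress


def mask_for_host_bits(host_bits: int) -> str:
    """Build the dotted-decimal mask that leaves host_bits '0' bits at the end."""
    if not 0 <= host_bits <= 32:
        raise ValueError("host_bits must be between 0 and 32")
    mask = (0xFFFFFFFF << host_bits) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))


def subnet_bits(mask: str, class_network_bits: int) -> int:
    """Count the '1' bits in the mask beyond the class's own network bits."""
    ones = bin(int(ipaddress.IPv4Address(mask))).count("1")
    if ones < class_network_bits:
        raise ValueError("mask is shorter than the class default")
    return ones - class_network_bits


# Class B (16 network bits) with 10 host bits, as in the example above.
mask = mask_for_host_bits(10)
print(mask, "->", subnet_bits(mask, 16), "subnet bits in Class B")
# The same mask applied to a Class A network (8 network bits).
print(mask, "->", subnet_bits(mask, 8), "subnet bits in Class A")
```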
Reserved Address List
This topic describes and lists some of the IP address constructions that have been assigned specific meanings and, therefore, are not available for use as unique end-station addresses.
Special Purpose and Reserved IP Addresses
When the subnet mask is applied to an IP address the result is the definition of some number of bits for use in identifying the subnetwork on which the device resides. For any given masking, the subnet bit value of all zeros and the value of all ones is not used to specify a device. RFC 950 “Internet Standard Subnetting Procedure” discusses these issues in detail.
To calculate the number of available subnets you use the formula: 2^n – 2, where n is the number of bits used for the subnet portion of the mask. In the case where three bits were used as the subnet portion of an address you would have: 2^3 – 2 = 6 combinations available. In reality, there are 8 combinations for a 3-bit pattern. Because of the restriction on using all zeros or all ones, the resulting number of legal subnets that can be uniquely identified with three bits is 6.
It should be noted that BSD 4.3 UNIX lets the administrator select zeros or ones to indicate a broadcast destination. In current practice zeros are never used. RFC 1118 and RFC 1009 both discuss these issues in more detail.
Here is an outline of the requirements for special-purpose subnet numbers. The reference to NETWORK means the network portion of the IP address (as defined by the Class A, B, or C address class range). SUBNET refers to that portion (those bits) in the address that are defined as the subnet portion (by the subnet/address mask). HOST means the remaining bits in the address (not included in the network or subnetwork portion).
|If You See This:||It Means This:|
|Entire IP address is all zeros.||As a source address: “This host”. As a destination address: “I don’t know the correct IP address to use so I’ll use all zeros”. This was also the early form of the IP destination address which meant “BROADCAST”, but this usage is now considered obsolete.|
|Entire IP address is all ones.||This is called a LIMITED BROADCAST. This is a broadcast to all hosts on the current subnet. A router will NOT forward this type of broadcast to other networks.|
|NETWORK = A valid network number. SUBNET = A valid subnet number. HOST = All ones.||This is called a DIRECTED BROADCAST. This is a broadcast to all hosts on the specified network and subnetwork. A router WILL forward this frame for broadcast on the specified destination network and subnetwork.|
|127.x.x.x||This is a local loopback address. Frames sent to this address will be looped back and returned to the sending application without actually being sent onto the network.|
|0.x.x.x, 128.0.x.x, 191.255.x.x, 192.0.0.x, 223.255.255.x||These address constructions are reserved by the Network Information Center.|
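A few of the rows in the table above can be expressed as a small classification routine. This is a simplified sketch with hypothetical names and example addresses; it does not cover the reserved network ranges in the last row.

```python
import ipaddress


def classify_special(address: str, mask: str) -> str:
    """Classify an address against some of the special cases in the table."""
    addr = int(ipaddress.IPv4Address(address))
    host_mask = ~int(ipaddress.IPv4Address(mask)) & 0xFFFFFFFF
    if addr == 0:
        return "all zeros"
    if addr == 0xFFFFFFFF:
        return "limited broadcast"
    if addr >> 24 == 127:
        return "loopback"
    # HOST bits all ones with a valid network and subnet: directed broadcast.
    if host_mask and (addr & host_mask) == host_mask:
        return "directed broadcast"
    return "ordinary address"


for example in ("0.0.0.0", "255.255.255.255", "127.0.0.1", "172.16.3.255", "172.16.3.7"):
    print(example, "->", classify_special(example, "255.255.255.0"))
```

Whether an address is a directed broadcast depends on the mask: 172.16.3.255 has all-ones HOST bits under 255.255.255.0 but not under 255.255.0.0.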
VLSM – Variable Length Subnet Masking
This topic explains VLSM, the configuration of different subnet masks at different levels of the network tree topology.
Variable Length Subnet Masks
Consider the XYZ Corporation which has been assigned the network number 18.104.22.168 from the InterNIC. The world sees this company as 22.214.171.124. Within the XYZ Corporation, however, the division of the network is very different. They could use Variable Length Subnet Masks to divide their world into a multi-level hierarchy.
The term Variable Length Subnet Mask (VLSM) refers to a design practice of creating sub-subnets in a tree-structured network. XYZ Corp has offices in many states, 23 field offices in all. The designers of the XYZ Corp network decide to divide their network into up to 64 subnets using a mask of 255.255.252.0. In binary, the mask bits are:
11111111 11111111 11111100 00000000
The six bits of “1” in the third octet are the subnet bits (since the first 16 bits represent the network). These six bits can differentiate between up to 64 different subnetworks. This is the same logic as would be applied to any subnet mask.
Now, however, it is realized that at each site there is a sales division, an accounting department, a marketing group, and a technical support group. The designers want to further subdivide each site with a router. This would require further division of the address field. No problem. The main, central routers are subnetted 255.255.252.0 and they differentiate between field offices. The field office routers, however, are subnetted with 255.255.255.128. Think about this in the binary representation.
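As a minimal sketch of that binary view (Python is used here purely for illustration, and the helper name `mask_bits` is hypothetical), the two masks can be printed side by side and the number of additional mask bits counted:

```python
import ipaddress

def mask_bits(mask: str) -> str:
    """Return a dotted-decimal mask as four space-separated binary octets."""
    value = int(ipaddress.IPv4Address(mask))
    return " ".join(f"{(value >> shift) & 0xFF:08b}" for shift in (24, 16, 8, 0))

central = mask_bits("255.255.252.0")    # mask used by the central routers
field = mask_bits("255.255.255.128")    # mask used by the field office routers

print("Central:", central)
print("Field:  ", field)

# Number of extra "1" bits the field office mask adds to the central mask.
extra_bits = field.count("1") - central.count("1")
print("Additional mask bits:", extra_bits)
```

The first line printed matches the binary mask shown earlier, and the difference between the two masks is the set of bits the field office routers use to tell departments apart.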
Notice that the field router defines an additional three bits in the mask. These three bits can take 8 different values (since 2 raised to the 3rd power = 8). Of these eight possible values, the 000 and 111 values are not available for use in identifying a specific subnet, which leaves six usable subnet numbers. Refer to the Reserved IP Address List for more information on these restrictions. These are going to be used to route between the sales, accounting, marketing, and tech support groups at each field site. Perhaps the assignment is like this:
|Sales =||001|
|Accounting =||010|
|Marketing =||011|
|Tech Support =||100|
So, at a particular location, we discover that the bit sequence “000011” has been used to represent the site, say the network at the field office in Palo Alto, California. Here are the four divisions:
|Sales =||000011||001||XXXXXXX|
|Accounting =||000011||010||XXXXXXX|
|Marketing =||000011||011||XXXXXXX|
|Tech Support =||000011||100||XXXXXXX|
The “X”s represent the bits that are available to differentiate between individual stations (hosts) in each department. When viewed in the binary sense this scheme identifies FOUR fields: the NETWORK PORTION (in each case this is 160.6.0.0), the SUBNET (which is 000011), a “sub”-subnet (001, 010, 011, and 100) and the node portion (the “X”s). The routers understand how to divide the address based on the subnet mask. The world, in our example, sees 255.255.0.0. The company sees 255.255.252.0. Each field office sees 255.255.255.128. The router masks the address and looks up the result in its table to determine how to forward the frame. Since the router “thinks” in binary there is no confusion, no problem. We, however, don’t think in binary. Consider these three stations shown with their dotted decimal and binary representations:
|160.6.12.129||10100000 . 00000110 . 00001100 . 10000001|
|160.6.14.1||10100000 . 00000110 . 00001110 . 00000001|
|160.6.14.129||10100000 . 00000110 . 00001110 . 10000001|
When looking at the dotted-decimal notation nothing appears immediately wrong. In fact, when looking at the binary you don’t necessarily see the conflict immediately. To understand any subnet masking it is necessary to break the 32-bit address into the fields defined by the variable length masks.
First, mask the addresses with the 255.255.0.0 used by the world at large:
|Mask =||11111111 . 11111111 . 00000000 . 00000000|
|160.6.12.129 =||10100000 . 00000110 . 00001100 . 10000001|
|Result =||10100000 . 00000110 . 00000000 . 00000000|
You can see that all three addresses mask back to 160.6.0.0; they are all on the same network as far as the world is concerned. Now let’s just consider the last 16 bits of each address (since we know the first 16 are the same in all three cases).
The next router uses the mask 255.255.252.0; we are considering the 252.0 part. The masking now continues as follows:
|Mask =||11111100 . 00000000|
|(160.6).12.129||00001100 . 10000001|
|(160.6).14.1||00001110 . 00000001|
|(160.6).14.129||00001110 . 10000001|
Do you see that all three stations are identified with 000011 as the bit pattern included in the masked portion? This means that the next router in line (the one masked as 255.255.252.0) will direct frames for all three of these stations to the same destination router according to its routing table.
The last router in this hierarchy is using the mask 255.255.255.128. Here is the masking:
|Result Of Masking|
|Mask =||11111111 . 10000000|
|(160.6).12.129||00001100 . 10000001||00001100 . 10000000|
|(160.6).14.1||00001110 . 00000001||00001110 . 00000000|
|(160.6).14.129||00001110 . 10000001||00001110 . 10000000|
It is critical that you understand this last step. Do you see that the bits included by the mask have been included in the result? Do you see the three additional bits used as the mask went from 255.255.252.0 to 255.255.255.128? Now we can assess the validity of the addresses. We know that the design intent called for four “sub”-subnetworks (001, 010, 011, and 100). Don’t be confused because these bits “span” the dot in the dotted-decimal notation. This is the confusing aspect of using anything other than “255”s in a subnet mask; the actual fields don’t break at the dots. The fields break as defined by the mask bits.
In this example, we know that all three stations are on the subnet defined with the leading bits “000011”. This leaves the “other” three bits to further differentiate between sub-subnets. (By the way, the term “sub”-subnet is being used only in the context of this document. The real world simply calls all of them “subnets” without regard for their level of hierarchical differentiation.) The remaining three bits may be broken out as follows (this is the table above simply repeated and clarified):
|Result Of Masking|
|Mask =||11111111 . 10000000|
|(160.6).12.129||00001100 . 10000001||000011 [ 00 . 1 ] 0000000|
|(160.6).14.1||00001110 . 00000001||000011 [ 10 . 0 ] 0000000|
|(160.6).14.129||00001110 . 10000001||000011 [ 10 . 1 ] 0000000|
Compare this to the design document which defined:
|Sales =||000011 001 XXXXXXX|
|Accounting =||000011 010 XXXXXXX|
|Marketing =||000011 011 XXXXXXX|
|Tech Support =||000011 100 XXXXXXX|
What we now see is that the address 160.6.14.129 has the subnet bits 101 at the mask point 255.255.255.128. This is an address which is not defined by the design of the network. Herein lies the danger with variable length masking: it’s awfully confusing to our decimal brains.
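The field breakdown above can be checked mechanically. The following is a minimal sketch, assuming the three masks and the department assignments from this example; the function name `split_fields` and the `DEPARTMENTS` table are illustrative names, not part of any router software:

```python
import ipaddress

NETWORK_MASK = int(ipaddress.IPv4Address("255.255.0.0"))     # what the world sees
SITE_MASK = int(ipaddress.IPv4Address("255.255.252.0"))      # central routers
DEPT_MASK = int(ipaddress.IPv4Address("255.255.255.128"))    # field office routers

# Department assignments from the design document in this example.
DEPARTMENTS = {0b001: "Sales", 0b010: "Accounting",
               0b011: "Marketing", 0b100: "Tech Support"}

def split_fields(address: str):
    """Return (site_bits, department_bits, host_bits) for an address on 160.6.0.0."""
    value = int(ipaddress.IPv4Address(address))
    site = (value & SITE_MASK & ~NETWORK_MASK) >> 10   # six subnet bits
    dept = (value & DEPT_MASK & ~SITE_MASK) >> 7       # three "sub"-subnet bits
    host = value & ~DEPT_MASK & 0xFFFFFFFF             # seven host bits
    return site, dept, host

for station in ("160.6.12.129", "160.6.14.1", "160.6.14.129"):
    site, dept, host = split_fields(station)
    name = DEPARTMENTS.get(dept, "NOT DEFINED IN THE DESIGN")
    print(f"{station}: site={site:06b} dept={dept:03b} host={host:07b} -> {name}")
```

All three stations report the site bits 000011, but only the first two fall into a defined department; the third carries the department bits 101, which the design never assigned.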
Troubleshooting TCP/IP Networks
This topic talks about troubleshooting methodology for IP-related problems. This is a section of the Compendium that we are working to constantly expand. We welcome your FEEDBACK.
Thoughts About Fixing Problems…
The actual troubleshooting maxim is quite simple: Follow the frame from source to destination. Each station should be forwarding the frame to a correct destination; router to router; until the final destination is reached. If someone doesn’t forward the frame correctly, and if the destination address is valid, then that station is misconfigured.
To know what the expected forwarding will be from router to router it is necessary to understand the underlying subnet masking being used by the routers and by the nodes. The meaning of the dotted-decimal IP address can only be ascertained by applying the mask using binary arithmetic to determine which bits are used to represent the network, the subnet (or subnets), and the host.
IP Routing Tables
This topic explains how a host (any communicating device) makes a forwarding decision by evaluating the contents of its routing table.
The Logic Used To Make Forwarding Decisions
Every device on the network maintains a routing table in memory. This table may be very simplistic, as would be the case with a low-end PC workstation, or very complex, as would be the case with a high-end router.
The table consists of pairs of IP addresses. Each pair couples “Where do you want to send to?” with “Where do you really send to get there?”. For example, a routing table may indicate that all frames destined for network 188.8.131.52 are to be sent to IP address 184.108.40.206. The immediate destination is always directly reachable; in this case we must be on network 220.127.116.11 and we are trying to reach an ultimate destination on 18.104.22.168. The routing table provides a lookup for a device to tell it how to send frames.
There are two interesting exception cases. In the one case, the originating station concludes that the ultimate destination is, itself, directly reachable. In that case the frame will be forwarded directly to the destination IP address in the frame. It is not necessary to forward it to a router; the sender realizes that it is on the same subnetwork as the destination. In the other case the routing table doesn’t have any matching lookup value when the masked destination address is compared against the available entries in the table. In this case a special entry in the routing table, called the Default Gateway Address, is used. The pair of values in the table has “0.0.0.0” paired with the assigned IP address of a router. This is the router to which frames will be sent when there is no other reference to the frames’ destination in the table.
When making a routing decision the following steps are taken:
- Mask the destination IP address, mask your own IP address. If the results are the same then you are on the same subnet as the destination. Send the frame directly to the data link (physical) address of the destination.
- If the destination is not on the same subnet then check the routing table to see if the exact, complete, 32-bit destination address is specified. This is referred to as a Host Specific Route. If such a route is specified then forward the frame to the IP destination indicated in the table. The implication is that this destination is the next router in line on the way to the destination.
- If a Host Specific Route doesn’t appear in the routing table then use the masked address that you computed as the lookup key against the routing table. That is, see if the network/subnetwork is specified in the table. If it is, then forward the frame to the IP address specified in the table. The implication, again, is that this is the IP address of the next router in line.
- If there isn’t a Host Specific Route and there also isn’t a reference to the network/subnetwork then forward the frame to the address specified as the target for the Default Gateway.
- If there is no default gateway specified then assume that all unspecified destinations are directly reachable. Resolve the physical address of the destination IP station and forward the frame directly to the destination. This is sometimes referred to as activating “Proxy ARP”, which will be discussed in more detail later.
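The steps above can be sketched in a few lines. This is an illustration under stated assumptions rather than how any particular IP stack is written: the function name `next_hop`, the dictionary-based tables, and the example addresses are all hypothetical, and the lookup uses the station's own mask exactly as the steps describe.

```python
import ipaddress

def next_hop(dest, my_addr, mask, host_routes, net_routes, default_gateway=None):
    """Return the IP address whose data link address must be resolved next."""
    d = int(ipaddress.IPv4Address(dest))
    me = int(ipaddress.IPv4Address(my_addr))
    m = int(ipaddress.IPv4Address(mask))

    # Step 1: same subnet as the destination, so deliver directly.
    if (d & m) == (me & m):
        return dest
    # Step 2: Host Specific Route (exact 32-bit match).
    if dest in host_routes:
        return host_routes[dest]
    # Step 3: network/subnetwork route, keyed by the masked destination.
    key = str(ipaddress.IPv4Address(d & m))
    if key in net_routes:
        return net_routes[key]
    # Step 4: fall back to the Default Gateway if one is configured.
    if default_gateway is not None:
        return default_gateway
    # Step 5: no default gateway, so assume the destination is directly reachable.
    return dest

# Illustrative station: 160.6.12.129 with mask 255.255.255.128.
print(next_hop("160.6.14.1", "160.6.12.129", "255.255.255.128",
               host_routes={}, net_routes={"160.6.14.0": "160.6.12.131"},
               default_gateway="160.6.12.254"))
```

Whatever address the function returns is the one that must then be resolved to a data link address before the frame can be sent.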
This topic introduces the ARP (Address Resolution Protocol) and explains how it is used by IP to resolve a physical network interface card address from an IP address. A separate section of the compendium talks in detail about the ARP Protocol.
Resolution Of Physical Addresses
So, a device uses its routing table to determine the destination for a frame. The “destination” from the routing table is either an IP address or the awareness that the frame should be delivered directly. Furthermore, the destination may be perceived to be on the same subnet to begin with and the routing table isn’t even consulted. In every case, when a station (be it an end-node or a router) wants to send a frame to some particular IP destination it must first figure out what data link destination address will be used in the frame. This is referred to as the process of Address Resolution.
There are two methods used for address resolution in IP networks. The first method, used in IP Version 4, is ARP (Address Resolution Protocol); this method has been in use since the early 1980s and is considered the standard. The new method, in IP Version 6 (IPng, “IP Next Generation”; the final drafts of which came out in January, 1995), uses ICMP (Internet Control Message Protocol) for address resolution.
ICMP has, as a protocol, been in use since the 1970s but its use in address resolution is brand new with IP Version 6. In 1996, the deployment of IPng (Version 6) is limited and it will surely be a number of months before the implications of ICMP resolution become significant to most networks. Consequently, in this document only the ARP method of address resolution will be examined in detail.
An ARP frame has four significant fields relative to our discussion. These are:
- Sender’s Hardware Address – The data link address of the sender
- Sender’s Protocol Address – The IP address of the sender
- Target Hardware Address – The data link address of the target being sought
- Target Protocol Address – The IP address of the target being sought
Some slight reflection will lead you to correctly conclude that not all four of these fields are filled in by the sender of an ARP frame. If I want to know the data link address being used by 22.214.171.124 then I will put 126.96.36.199 in the Target Protocol Address field and, typically, I will set the Target Hardware Address field to zeros; I don’t know the hardware address of 188.8.131.52 – that’s what I’m looking for.
I send this ARP Command frame to the Ethernet or Token-Ring broadcast destination – everyone on the cable hears the frame. If 184.108.40.206 hears the frame then it responds with an ARP Reply. In the ARP Reply, 220.127.116.11 supplies its data link (hardware) address. In this way I know the association between the IP address and the data link address; I have resolved the IP address.
The association between IP address and data link address is recorded in a temporary memory table referred to as the ARP Cache. The ARP Cache tells a station what data link destination address to use for every IP destination address.
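The exchange can be sketched as follows. This is a simplified model, not the ARP wire format: the class `ArpFrame`, the helper names, and the example IP and hardware addresses are all hypothetical, and cache aging is left out.

```python
from dataclasses import dataclass
from typing import Optional

UNKNOWN_HW = "00:00:00:00:00:00"   # target hardware address not yet known

@dataclass
class ArpFrame:
    sender_hw: str   # Sender's Hardware Address
    sender_ip: str   # Sender's Protocol Address
    target_hw: str   # Target Hardware Address
    target_ip: str   # Target Protocol Address

def make_request(my_hw: str, my_ip: str, target_ip: str) -> ArpFrame:
    """Build the broadcast request; the target hardware field is zeros."""
    return ArpFrame(my_hw, my_ip, UNKNOWN_HW, target_ip)

def make_reply(request: ArpFrame, my_hw: str, my_ip: str) -> Optional[ArpFrame]:
    """Answer only if the request is looking for this station's IP address."""
    if request.target_ip != my_ip:
        return None
    return ArpFrame(my_hw, my_ip, request.sender_hw, request.sender_ip)

arp_cache = {}   # IP address -> data link address

def learn(reply: ArpFrame) -> None:
    """Record the association carried in an ARP Reply."""
    arp_cache[reply.sender_ip] = reply.sender_hw

request = make_request("02:00:00:00:00:01", "160.6.12.129", "160.6.12.130")
reply = make_reply(request, "02:00:00:00:00:02", "160.6.12.130")
if reply is not None:
    learn(reply)
print(arp_cache)
```

A real ARP Cache would also time out unused entries, which this sketch does not model.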
When a station first boots up it must ARP for the IP address of the Default Gateway. Additionally, prior to sending to any destination IP address, there must be an entry in the ARP Cache associated with that destination IP address. The ARP Cache is dynamic. Entries age out after just a few seconds if they are unused. Consequently, during a typical conversation a station ARPs for the destination once at the beginning and then the entry remains in the ARP Cache for the life of the connection (assuming the connection is active).
The two tables (the Routing Table and the ARP Cache) form the basis for frame transmission and forwarding. The address mask serves as the guide for interpreting the IP address. The Default Gateway is the destination specified in the routing table to use when no other destination can be ascertained. The idea of a default destination has no corresponding behavior in the ARP Cache.
IP Type Of Service
This topic details the meaning of the bits in the IP Type Of Service field in the IP header.
The ‘Type Of Service’ Byte In The IP Header
RFC 791 (Internet Protocol – DARPA Internet Program Protocol Specification, September 1981) defined a field within the IP header called the Type Of Service (TOS) byte. This byte is used to specify the quality of service desired for the datagram and is an amalgamation of several fields: Precedence, Delay, Throughput and Reliability, as identified below. In normal conversations you would not use any special alternatives, so the Type of Service byte typically would be set to zero. However, with the advent of Internet multimedia transmission and the emergence of protocols such as Session Initiation Protocol (SIP), this field is coming into use.
(A general note regarding the use of the IP TOS Byte is that in the course of normal network operations, not including Internet Multimedia, if you ever see the Type Of Service byte set to anything other than zero you should find out who’s doing it, and why.)
The IP Type of Service Byte:
Bits 0-2: Precedence.
Bit 3: Delay (0 = Normal Delay, 1 = Low Delay)
Bit 4: Throughput (0 = Normal Throughput, 1 = High Throughput)
Bit 5: Reliability (0 = Normal Reliability, 1 = High Reliability)
Bits 6-7: Reserved for Future Use.
The three bit Precedence field is further defined as follows:
111 – Network Control
110 – Internetwork Control
101 – CRITIC/ECP
100 – Flash Override
011 – Flash
010 – Immediate
001 – Priority
000 – Routine
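The layout above can be decoded with a few shifts and masks. This is a minimal sketch, assuming the RFC 791 convention that bit 0 is the most significant bit of the byte; the function name `decode_tos` and the dictionary keys are illustrative:

```python
PRECEDENCE_NAMES = {
    0b111: "Network Control",
    0b110: "Internetwork Control",
    0b101: "CRITIC/ECP",
    0b100: "Flash Override",
    0b011: "Flash",
    0b010: "Immediate",
    0b001: "Priority",
    0b000: "Routine",
}

def decode_tos(tos: int) -> dict:
    """Split a Type Of Service byte into its RFC 791 fields."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS must fit in one byte")
    return {
        "precedence": PRECEDENCE_NAMES[(tos >> 5) & 0b111],   # bits 0-2
        "low_delay": bool(tos & 0b00010000),                  # bit 3
        "high_throughput": bool(tos & 0b00001000),            # bit 4
        "high_reliability": bool(tos & 0b00000100),           # bit 5
        "reserved": tos & 0b00000011,                         # bits 6-7
    }

print(decode_tos(0x00))   # the normal case: Routine, nothing requested
print(decode_tos(0xB0))   # 101 1 0 0 00
```

A value of zero decodes to Routine precedence with no special handling requested, which is the case the text describes as normal.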
So what exactly is the Precedence field, and what is the difference between these various classifications such as Priority and Immediate? Recall that ARPAnet originated under the authority of the DOD, and that a number of the early performance parameters were patterned after existing DOD communications models. It is from these already established models of communications that the concept of the Precedence Field emerged. The answer for the priorities (Routine – CRITIC/ECP) is defined in Department of Defense (DOD) communications message handling directives and in RFC 791. The remaining two classifications (Internetwork Control and Network Control) are defined in RFC 791.
The following is a synopsis of these classifications as specified by DOD Communications Directives and RFC 791:
A. DOD DD173 Precedence/Priority Field Explanations (Lowest-Highest):
- Routine: (R) “…is used for all messages that justify transmission by electrical means unless the message delivery is of sufficient urgency to require higher precedence.”
- Priority: (P) “…is used for all messages that require expeditious action by the addressee(s) and/or furnish essential information for the conduct of ongoing operations.”
- Immediate (O) “…is reserved for messages relating to situations that gravely affect the security of National/Allied forces or populace.”
- Flash (Z) “…is reserved for initial enemy contact messages or operational combat messages of extreme urgency.”
- Flash Override (X) “… is reserved for messages relating to the outbreak of hostilities and/or detonation of nuclear devices.”
- CRITIC/ECP “…stands for “Critical and Emergency Call Processing” and should only be used for authorized emergency communications, for example in the United States Government Emergency Telecommunications Service (GETS), the United Kingdom Government Telephone Preference Scheme (GTPS) and similar government emergency preparedness or reactionary implementations elsewhere.”
B. RFC 791 Specific Classifications:
According to RFC 791, the functions of the classifications: Network Control and Internetwork Control are defined as follows:
1. Network Control “…is intended to be used within a network only. The actual use and control of that designation is up to each network.”
2. Internetwork Control “…is intended for use by gateway control originators only.”
RFC 791 further addresses the use of these two classifications by noting that if used, “… it is the responsibility of that network to control the access to, and use of, those precedence designations.”
It should now be apparent why the Precedence field is no longer used in traditional networking applications. Therefore, if you ever see these bits in use, you should find out who’s doing it, and why. The implication of using the priority bits is completely vendor dependent. Consider the following example of IP TOS in a contemporary network environment:
Let’s assume that you have a router that shows three different routes between Honolulu (on the island of Oahu) and Hilo (on the big island of Hawaii). You have a leased copper T1 line (which runs in an undersea cable), a fiber optic link (also in the cable), and a T1 satellite link. You decide that there is the least data loss in the fiber optic cable and you select it to be your link of greatest reliability. You know that most traffic is going across this link (because you have load-balancing routers) and that the satellite link and the leased T1 are almost unused. You decide that the leased T1 will be your link of least delay and that the satellite link will be your link of maximum throughput. These decisions are totally arbitrary; let’s hope you made good decisions.
Let’s further assume that your routers support the speed, throughput, and reliability types of service. You configure the Router and identify the links according to your decisions. Let’s now assume that your workstation has the need, ability, and application software that will try to utilize different types of service for different activities. Perhaps your file transfer utility will select the link of greatest bandwidth while your accounting application selects the link of greatest reliability. Meanwhile, your terminal access software selects the link of highest speed.
Now consider the following criteria: Does your router support alternative types of service? Do your workstations support these options? A failure of any device in the datagram path will result in the traffic not being properly routed to its destination. This is why you need to determine who is responsible if you ever see the Type Of Service byte set to anything other than zero.
If a misconfiguration exists or an error occurs that relates to the IP Type Of Service, a protocol called ICMP (Internet Control Message Protocol) will report the error back to the station sending the failed frame. RFC 1349 discusses the relationship between IP Type Of Service and ICMP messages. For more information on ICMP or to read the text of RFC 1349, please refer to the ICMP Section.
Future Employment of the TOS/Precedence Byte:
Until fairly recently, it was a safe assumption that the IP TOS Byte was essentially obsolete and would be ignored in day-to-day networking situations. However, with the emergence of Internet multimedia transmission and the emergence of protocols such as Session Initiation Protocol (SIP), this field is returning to use.
RFC 2543 (SIP: Session Initiation Protocol, March 1999) specifies a protocol known as Session Initiation Protocol (SIP). This protocol is an application-layer control protocol used for creating, modifying and terminating sessions with one or more participants. Some examples of such activities include Internet multimedia conferences, Internet telephone calls and multimedia distribution. SIP is intended to support communications using Multicast, a mesh of Unicast relations, or a combination of both.
Contained within this protocol specification is the reliance upon the traditional RFC 791 Precedence classifications as identified by “Priority” in the following extract:
“…The resource value is formatted as “namespace” “.” “priority value”. The namespace and priority value are assigned by IANA (see IANA Considerations). An initial namespace, “dsn” (Defense Switched Network), contains the priority values, “critic-ecp”, “flash-override”, “flash”, “immediate”, “priority”, “routine”, where “flash-override” is the highest priority and “routine” is the lowest.
As a response header, the value indicates the actual priority selected by the recipient. This priority value may be lower or higher than the request header value. If the header field is missing, the SIP request is treated as if it had the Resource-Priority value of “routine”…” For further information regarding the specifications of SIP, consult RFC 2543 (SIP: Session Initiation Protocol, March 1999)
This topic details the IP fragmentation and reassembly process which is defined in IP.
Internet Protocol (IP) version 4.0 Fragmentation and Reassembly
The following is an explanation of the IP Fragmentation and Reassembly process used by IP version 4.0. It will examine the purpose of IP Fragmentation, the relevant fields contained within the IP Header and the role of Maximum Transmission Unit (MTU) in determining when IP Fragmentation will be used.
As specified in RFC 791 (Internet Protocol – DARPA Internet Program Protocol Specification, Sept. 1981), the IP Fragmentation and Reassembly process occurs at the IP layer and is transparent to the Upper Layer Protocols (ULP). As a block of data is prepared for transmission, the sending or forwarding device examines the MTU for the network the data is to be sent or forwarded across. If the size of the block of data does not exceed the MTU for that network, the data is transmitted in accordance with the rules for that particular network. But what happens when the amount of data is greater than the MTU for the network? It is at this point that one of the functions of the IP Layer, commonly referred to as Fragmentation and Reassembly, will come into play.
Maximum Transmission Unit (MTU)
There are a number of differing network transmission architectures, each having a physical limit on the number of data bytes that may be contained within a given frame. This physical limit is described in numerous specifications and is referred to as the Maximum Transmission Unit or MTU of the network. An example of such an MTU would be IEEE 802.3 Ethernet; according to the specifications, the maximum number of data bytes that can be contained within a frame is 1500. The following table lists the MTU of several common network types (from RFC 1191 – Path MTU Discovery, Nov 1990):
|Network Architecture||MTU in Bytes|
|802.3 Ethernet||1500 B|
|4 Mb Token Ring||4464 B|
|16 Mb Token Ring||17914 B|
There are two principal situations in which MTU becomes important:
- The first is when the size of the block of data being transmitted is greater than the MTU. An example of this would be when data is being read using Sun’s Network File System (NFS) which reads data in 8-kilobyte blocks.
- The second situation would arise when data must traverse across multiple network architectures, each with a different MTU. Just such an example would be if the data originated on a 16Mb Token Ring network (MTU = 17914 B) that was connected to another 16Mb Token Ring network (MTU = 17914 B) via an Ethernet network (MTU = 1500B).
Regardless of which situation occurs, the rules that IP Fragmentation follows remain the same and will be discussed later. (See “An example of IP Fragmentation” below)
IP version 4 Fragmentation Fields
Contained within the IP Header, there are three fields that are of concern when discussing IP Fragmentation. (See the sample IP Header diagram below):
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Sample Internet Protocol Header (RFC 791)
*Note – each tick mark represents one bit position.
The three fields concerned with IP Fragmentation are:
|RFC 791 Field Name||Offset Location||Size||Other Reference Names|
|Fragmentation Offset||20-21||13b||Fragment Offset|
(1) Identification – This 16-bit field contains a unique number used to identify the frame and any associated fragments for reassembly.
Given the increasing complexity of networks, it is theoretically possible that fragments from multiple blocks of data might travel along different paths to the destination, possibly arriving out of sequence in relation to one another. That is, it is possible a fragment from block number one might arrive intermixed with the data stream for block number two or vice versa. While the function of the Fragment Offset Field is to identify the relative position of each fragment, it is the Identification Field that allows the receiving device to sort out which fragments comprise which block of data. Each fragment from a particular block of data will have the same Identification Field, thus uniquely identifying which block it belongs to. If one or more fragments are lost, the buffer of the device performing the reassembly process will time out and discard all of the fragments. In the event of such a time out, the data will then have to be retransmitted by the sending device.
(2) Flags – This 3-bit field contains the flags that specify the function of the frame in terms of whether fragmentation has been employed, additional fragments are coming, or this is the final fragment.
|Bit Indicator||RFC 791 Definition|
|0xx||Reserved (must be zero)|
|x0x||May Fragment|
|x1x||Do Not Fragment|
|xx0||Last Fragment|
|xx1||More Fragments|
When a receiving station processes each frame, one of the operations it performs is to review the Flags field. Depending on the value indicated by this field, several possible actions are then initiated, including:
(xx1) More Fragments – Indicates that there are additional IP Fragments that comprise the data associated with that specific Identification Field. The receiving device will allocate buffer resources for reassembly and pass all frames containing that unique Identification Field to the buffer.
(xx0) Last Fragment – Indicates that this fragment is the final frame for the data block identified by the Identification Field. The receiving device will now attempt to reassemble the fragments in the order specified by the Fragment Offset field.
(3) Fragment Offset – This 13-bit field indicates the position of a particular fragment’s data in relation to the first byte of data (offset 0), measured in units of 8 bytes.
Because it is entirely possible that the fragments that comprise a block of data might travel along different paths to the destination, it is possible they might arrive out of sequence. While the Identification Field serves to mark which IP fragments belong to which block of data, it is the Fragment Offset Field, sometimes referred to as the Fragmentation Offset Field, that tells the receiving device which order to reassemble them in.
During the IP Fragmentation Reassembly process, if a particular fragment is found to be missing, as indicated by the Fragmentation Offset count, the buffer will enter a wait state until either the missing piece(s) are received or a time out occurs. In the event of such a time out, the buffer simply discards the fragments.
IP Fragmentation and Reassembly by Forwarding Devices
So far, the procedure for IP Fragmentation and Reassembly within a specific network has been discussed. However, there is another situation in which these processes may be utilized to pass frames between dissimilar network physical architectures. Return to the previously mentioned example of a block of data originating on a 16Mb Token Ring network (MTU = 17914B) that is connected to another 16Mb Token Ring network (MTU = 17914B) via an Ethernet network (MTU = 1500B).
|Token Ring||Ethernet||Token Ring|
|Router 1||Router 2|
At the time of transmission, the data block met the MTU restriction for a 16Mb Token Ring network. However, the router connecting the Token Ring to the Ethernet network is faced with having to forward this large block onto a network with a smaller MTU. So how does Router #1 handle this situation of Path MTU? The answer is both simple and elegant: it follows the rules for IP Fragmentation as if it were transmitting the frame itself. The only deviation from the process previously discussed is that the Identification Field will be that of the original frame.
Router #2 does not reassemble the fragments. Under RFC 791, reassembly is performed only by the final destination host, so Router #2 forwards each fragment onto the second Token Ring network as an independent packet, and the destination host reassembles them exactly as previously described.
Sequence examples of IP Fragmentation and IP Fragmentation Reassembly
Regardless of what situation occurs that requires IP Fragmentation, the procedure followed by the device performing the fragmentation must be as follows:
- The device attempting to transmit the block of data will first examine the Flag field to see if the field is set to the value of (x0x or x1x). If the value is equal to (x1x) this indicates that the data may not be fragmented, forcing the transmitting device to discard that data. Depending on the specific configuration of the device, an Internet Control Message Protocol (ICMP) Destination Unreachable -> Fragmentation required and Do Not Fragment Bit Set message may be generated.
- Assuming the flag field is set to (x0x), the device computes the number of fragments required by dividing the amount of data by the amount of data that fits in one frame on that network (the MTU less the IP header, rounded down to a multiple of 8 bytes because the Fragment Offset counts 8-byte units). This will result in “X” number of frames, with all but the final frame filling the MTU for that network.
- It will then create the required number of IP packets and copy the IP header into each of these packets so that each packet will have the same identifying information, including the Identification Field.
- The Flag field in the first packet, and all subsequent packets except the final packet, will be set to “More Fragments.” The final packet’s Flag Field will instead be set to “Last Fragment.”
- The Fragment Offset will be set for each packet to record the relative position of the data contained within that packet.
- The packets will then be transmitted according to the rules for that network architecture.
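The fragmentation steps above can be sketched as follows. This is an illustrative model only, not a real IP stack: the `Fragment` class and `fragment` function are hypothetical names, the IP header is assumed to be 20 bytes with no options, and the sketch handles only a datagram that has not already been fragmented.

```python
from dataclasses import dataclass

IP_HEADER_LEN = 20  # assumption: IPv4 header without options


@dataclass
class Fragment:
    ident: int    # Identification, copied from the original datagram
    offset: int   # Fragment Offset, in 8-byte units
    more: bool    # True = "More Fragments", False = "Last Fragment"
    data: bytes


def fragment(ident, data, mtu, dont_fragment=False):
    """Split the data of one datagram into fragments that fit the MTU."""
    if len(data) + IP_HEADER_LEN <= mtu:
        return [Fragment(ident, 0, False, data)]  # fits as-is
    if dont_fragment:
        # A real device discards the datagram here and may send the ICMP
        # "Fragmentation required and Do Not Fragment bit set" message.
        raise ValueError("fragmentation needed but Do Not Fragment is set")
    # Data per fragment: MTU less the header, rounded down to 8 bytes.
    step = (mtu - IP_HEADER_LEN) // 8 * 8
    if step <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(data), step):
        chunk = data[start:start + step]
        more = start + step < len(data)  # every fragment but the last
        fragments.append(Fragment(ident, start // 8, more, chunk))
    return fragments


if __name__ == "__main__":
    for f in fragment(0x1234, bytes(4000), 1500):
        print(f.ident, f.offset, f.more, len(f.data))
```

Every fragment carries the same Identification value, and only the final one has the “More Fragments” indication cleared, which is what the receiver relies on during reassembly.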
IP Fragment Reassembly
If a receiving device detects that IP Fragmentation has been employed, the procedure followed by the device performing the Reassembly must be as follows:
- The device receiving the data detects the Flag Field set to “More Fragments.”
- It will then examine all incoming packets for the same Identification number contained in the packet.
- It will store all of these identified fragments in a buffer in the sequence specified by the Fragment Offset Field.
- Once the final fragment has arrived, as indicated by the Flag Field being set to “Last Fragment,” the device will attempt to reassemble the data in offset order.
- If reassembly is successful, the packet is then sent to the ULP in accordance with the rules for that device.
- If reassembly is unsuccessful, perhaps due to one or more lost fragments, the device will eventually time out and all of the fragments will be discarded.
- The transmitting device will then have to attempt to retransmit the data in accordance with its own procedures.
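The reassembly procedure can be sketched in the same spirit. The `Reassembler` class below is a hypothetical illustration: it keys buffers on the Identification value alone (a real implementation also uses the source address, destination address, and protocol), it does not handle overlapping fragments, and it leaves out the time-out described in the list.

```python
class Reassembler:
    """Buffers fragments per Identification and rebuilds the data."""

    def __init__(self):
        self.buffers = {}  # Identification -> {byte offset: data}
        self.totals = {}   # Identification -> total length, once known

    def add(self, ident, offset_units, more, data):
        """Store one fragment; return the full data once it is complete."""
        parts = self.buffers.setdefault(ident, {})
        start = offset_units * 8  # Fragment Offset counts 8-byte units
        parts[start] = data
        if not more:
            # The "Last Fragment" tells us how long the whole block is.
            self.totals[ident] = start + len(data)
        total = self.totals.get(ident)
        if total is None:
            return None  # last fragment not seen yet
        # Walk the buffer in offset order and stop at the first gap.
        assembled = bytearray()
        for pos in sorted(parts):
            if pos != len(assembled):
                return None  # a fragment is still missing
            assembled += parts[pos]
        if len(assembled) != total:
            return None
        del self.buffers[ident]
        del self.totals[ident]
        return bytes(assembled)


if __name__ == "__main__":
    r = Reassembler()
    block = bytes(range(256)) * 10
    # Fragments may arrive out of order; the offsets restore the sequence.
    print(r.add(7, 185, False, block[1480:]) is None)
    print(r.add(7, 0, True, block[:1480]) == block)
```

Storing fragments by byte offset means arrival order does not matter, which mirrors the role the Fragment Offset Field plays in the steps above.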
Security and IP Fragments
The IP version 4 Fragmentation and Reassembly process suffers from a particular weakness that can be utilized to trigger a Denial of Service Attack (DOS). The receiving device will attempt reassembly following receipt of a frame containing a Flag field set to (xx1), indicating more fragments to follow. Recall that receipt of such a frame causes the receiving device to allocate buffer resources for reassembly.
So what happens if a device is flooded with separate frames, each with the Flag field set to (xx1), but each has the Identification Field set to a different value? According to the rules for IP version 4 Fragmentation and Reassembly, the device would attempt to allocate resources to each separate fragment in preparation for reassembly. However, given a flood of such fragments, the receiving device would quickly exhaust its available resources while waiting for buffer time-outs to occur. The result, of course, would be that possible valid fragments would be lost or encounter insufficient resources to support reassembly. The common term for this type of artificially induced shortage of resources is “Denial of Service Attack”.
To defend against such DOS attempts, many network security products now include specific rules, implemented at the firewall, that change the time-out value governing how long incoming fragments are held before being discarded.
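A minimal sketch of this kind of defense is shown below. The `FragmentTable` class, its default values, and the cap on the number of tracked datagrams are all assumptions made for illustration; the text only describes adjusting the time-out, and real firewalls expose this through their own configuration rather than code like this.

```python
import time


class FragmentTable:
    """Tracks partially received datagrams with a time-out and a size cap."""

    def __init__(self, timeout=30.0, max_entries=1024, clock=time.monotonic):
        self.timeout = timeout          # seconds to hold fragments (assumed)
        self.max_entries = max_entries  # cap on tracked datagrams (assumed)
        self.clock = clock
        self.first_seen = {}            # Identification -> first arrival time

    def expire(self):
        """Discard entries older than the time-out; return how many."""
        now = self.clock()
        stale = [i for i, t in self.first_seen.items()
                 if now - t > self.timeout]
        for ident in stale:
            del self.first_seen[ident]
        return len(stale)

    def admit(self, ident):
        """Return True if a fragment with this Identification may be buffered."""
        self.expire()
        if ident in self.first_seen:
            return True
        if len(self.first_seen) >= self.max_entries:
            return False  # table full: drop instead of exhausting memory
        self.first_seen[ident] = self.clock()
        return True


if __name__ == "__main__":
    table = FragmentTable(timeout=5.0, max_entries=3)
    print([table.admit(i) for i in range(5)])
```

A shorter time-out frees buffers sooner during a flood, at the cost of discarding legitimate fragments that arrive slowly, so the value is a trade-off rather than a fixed rule.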