IPv6 Part 7: No more NAT

One of the biggest culture shocks for me as a security professional is the assumption that IPv6 addressing is end-to-end; in other words, no more network address translation (NAT). Untranslated addresses expose both the host identity and the topology of the local network to the outside world. If an auto-configured IPv6 address is based on the interface MAC address (see IPv6 Part 3: Address auto-configuration) then the hardware vendor of the interface is exposed too (there are alternatives to this as we shall see). However, what I find even more shocking is the level of hostility to NAT within the IPv6 world: it seems to be an article of IPv6 faith that NAT is bad, often without giving any real case against it. I want here to take a good look at NAT and the arguments for and against, without prejudice.

One of the sources of confusion and ignorance there is about NAT comes from confusion of terminology, so I’ll start off by trying to cut through that in this post. It’s important to understand that there are two main different types of NAT. The first is usually referred to as one-to-one NAT (bi-directional NAT according to RFC 2663, Static NAT in the Check Point world). As the name suggests, there is a one-to-one mapping between addresses in the public domain and the private domain. As datagrams pass through the NAT gateway that lies between the two, the source or destination address is rewritten accordingly, and for TCP, UDP and ICMP (and IPv4 datagrams) the header checksum is recalculated. IPv6 will simplify this slightly because there is no longer a datagram checksum to recalculate. One-to-one NAT was first used in the early days of the Internet to handle cases where end-users had changed providers and hadn’t completed the process of readdressing all their hosts using the new (provider-assigned) prefix. An experimental form of prefix translation has now been defined for IPv6 (RFC 6296).

If the NAT gateway uses static rules to map between private and global addresses then the process is stateless: the gateway simply translates addresses packet by packet. However RFC 3022 defines a method called Basic NAT, where bindings between private and global addresses are set up dynamically, from a pool of global addresses; this means that the NAT gateway has to manage the state of these bindings.

The other type of NAT is what RFC 3022 refers to as Network Address Port Translation (NAPT; known as overloading or Port Address Translation in the Cisco world, Hide NAT in the Check Point world). This allows multiple hosts on a private network to connect to the Internet using one global address: very attractive in the IPv4 world with the increasing shortage of globally routable addresses. A private network using NAPT will typically use RFC 1918 addresses internally, which are not globally routable.When a host on the private network initiates a connection to the Internet, the NAPT gateway will typically translate the source address of the initial datagram to the global address of the gateway. It then dynamically translates the source TCP or UDP port of the initial datagram to an available port on the gateway itself. TCP and UDP ports are 16-bit numbers, and outbound source ports are generally allocated in the range above 1023, so this will scale up to a a maximum of about 64,000 simultaneous connections. ICMP datagrams have to be modified in an analogous way.

To take an example, say the NAPT gateway has a global address of 192.0.2.1, and there are two hosts with RFC 1918 addresses, 10.0.0.1 and 10.0.0.2, on the private side. Host 10.0.0.1 initiates a TCP connection to port 80 on 198.51.100.1 with a randomly selected source port of 7680. As the first datagram of the connection passes through the NAT gateway, the gateway translates the source address to 192.0.2.1, and the TCP port to a spare port on itself, let’s say 20231. Then host 10.0.0.2 makes a TCP connection to another destination, port 80 on 203.0.113.1, with the source port set to 1818. The gateway will translate port 1818 to another spare port, in this case 10434. The gateway needs to maintain the state of these mappings, so that when a datagram comes in on the global interface with destination of 192.0.2.1 and TCP port 20231, it knows that this needs to be translated to 10.0.0.1 and port 7680 in order to reach its destination.

NAPT gateways have even more work to do than one-to-one NAT gateways. Not only must their checksum recalculations include the modified port numbers, but they must also now maintain the state of every connection that passes through, and clean this up when the connection is closed. If there are multiple NAPT gateways for redundancy, then this state will need to be replicated between them to keep connections up after a gateway failure.

NAPT has the magical property of allowing a private network to be much larger than it appears from the Internet: a bit like Doctor Who’s TARDIS, which is much larger than it appears to be from the outside. It has come to be almost synonymous with NAT, as it’s by far the most prevalent form of NAT today. However it’s important to remember that the different types of NAT have different properties, and different goals. In the following posts I’ll be keeping this in mind as I go through the objections to NAT and assess their validity.

Yes it’s a lot of bits but…

Let’s take a look at the structure of an IPv6 Global Unicast address (GUA), a globally-routeable address:

IPv6 addresses, like IPv4 addresses, are Big Endian, that is, the most significant bits come first. That’s similar to decimal numbers, where thousands come before hundreds and so on. The most significant bits (the “high-order” bits) at the beginning of a GUA form the global routeing prefix. This is used by Internet Service Providers (ISPs) to route traffic through the global Internet.

The standard IPv6 global routeing prefix assigned to a single-site enterprise is a /48. RFC 3177 specified /48 as the standard prefix for all sites, even home users, but subsequently RFC 6177 recommended longer prefixes for small enterprises and home users. With IPv4 an enterprise may only be assigned a /24, if that. That means that ISPs have (very) roughly 16 million times the address space to work with (248 versus 224); in fact 48 bits provides about 280 trillion prefixes, or roughly 38,000 prefixes for every human on the planet.

It’s the structure at the local end of an IPv6 address that’s surprising for someone like me who has worked with IPv4 all through their career. The network prefix (what in IPv4 is called the subnet mask) is a fixed /64 in length, as opposed to the variable-length subnet masking (VLSM) of IPv4. That means that the interface ID (broadly the IPv6 equivalent of the host ID) is also 64 bits long. In other words there’s enough address space on each subnet for 264 hosts, i.e. 18446744073709551616, or enough M&Ms to fill the Great Lakes.

Now that’s a huge number. With current networking technology it’s difficult to imagine it being practical to have more than, say, 10000 hosts on a single subnet. That only requires 14 bits, so the other (most significant) 50 bits are effectively redundant, in other words 18446744073709535232 addresses or 99.99999999999991%. In practice there is likely to be parallel running of IPv4 and IPv6 on the same subnets for some time, and IPv4 subnets are typically much smaller, say, 256 hosts, so the redundancy will be greater still.

The consequence of this fixed /64 network prefix is that the subnet ID, which is the local part of the network prefix, is typically only sixteen bits. For IPv4, most enterprises use private RFC 1918 addressing combined with Network Address Translation. They often utilise the 10/8 private prefix subnetted down to a /24 for individual subnets, which also provides a 16-bit subnet addressing space. So in practice IPv6 doesn’t provide any additional subnet address space for many enterprises compared to IPv4.

Sixteen bits is still a large number, equivalent to 65536 addresses, and it would be very generous if enterprises used those sixteen bits as a completely flat address space. However they usually structure the local subnet space hierarchically, in order to aggregate routes efficiently and minimise internal routeing tables (just as the global routeing address space is structured). IPv6 does have the advantage for network administrators that addresses are represented in hex rather than IPv4’s decimal; each hex character represents four bits (a “nibble”) as opposed to the 8-bit components of an IPv4 address (an “octet”). This means that the subnet space can be divided conveniently into 4-bit chunks in a way that’s easy to read, and so less prone to error. It is generally recommended (by Coffeen 2014 for example) to follow this practice when designing your IPv6 subnetting structure.

Nevertheless creating a hierarchical structure in this way means that there will be inevitably some inefficiency in the use of the address space. For example, take a site that has 20 buildings with 20 subnets per building. Each nibble can encompass only 16 objects, so you would need to assign two nibbles to the site level of the hierarchy and two to the building level of the hierarchy, thus using up the available subnet space, with quite a lot of redundancy. This is almost a worse-case scenario, but it shows how the address space for local subnetting is not that generous, and looks positively stingy compared to the address space for interfaces.

In summary, IPv6 uses a roughly 48-bit address space for global routeing, That’s a huge amount of prefixes, and to be fair it does deliver the original goal of IPv6, which was to increase the global address space. However, the way that the local part of IPv6 addresses is structured means that there is no useable expansion of address space at the local end. That stuff about galaxies and light-years is (mostly) hype.

In the next post I’ll take a look at why IPv6 addresses are structured in this way.