blog banner

January 4, 2022

The Internet, Explained (1/6)

This post is part of a series.

Some time ago, it occurred to me that I didn't really understand what made the Internet work. Sure, I am all too familiar with what the user experience is like, and my adventures in web development have forced me to acquire a fair bit of networking knowledge, but between the bits of light there existed vast chasms of confusion. So I decided to do some reading, some soul-searching, and publish my findings in the form of a blog post.

Please note that this post is by no means a comprehensive overview of all the technologies involved in the Internet; that would require far more lifetimes than the one which I have to waste. I probably won't cover the Internet's history in very much depth, either, though that is a fascinating topic that I strongly encourage you to look into.

The word "internet" itself offers some hints about its architecture. "Internet" is short for internetworking, the act of unifying numerous computer networks under one protocol. That's essentially what the Internet is, a bunch of independently maintained networks that are joined together such that any two computers on the Internet can reach each other.

A Word on Layers §

The Internet is powered by an ever-expanding family of technologies and protocols. These systems are segregated into layers, and each layer is responsible for providing an interface that allows the next layer to do its job. This layered model serves as a great way to understand the Internet starting from the ground up. In reality, the divisions between these layers are not as concrete as one would hope, and as a result there is quite a bit of disagreement over how to best model the Internet stack (primarily between supporters of the OSI model and the TCP/IP model). Since this debate is incredibly contentious, I will do my best to avoid it.

Physical Layer §

Let's start from the very bottom. No Internet, no networks even. Consider two computers: how can digital data be transferred from one to another? The answer to this question is the physical layer, which describes how to transmit ones and zeroes over a physical transmission medium. This involves everything from the actual physical connectors or radio bands used for communication, to the line codes which specify how to extract a bitstream from the measured values of a signal.

The physical layer is closely tied to the link layer, which leverages the physical layer's capabilities to transfer full frames of data over a network that may be more complicated than a single point-to-point link. It may also take on responsibilities such as error correction and automatic retransmission.

The physical layer and link layer are generally tightly coupled (out of necessity), which is the source of endless frustration. For example, the term "Ethernet" might refer to the physical cable standard, but it could also refer to the Ethernet protocol used to transfer frames over those cables. Which is it?

The full answer is that Ethernet spans several network layers, because it supports a variety of physical transmission media. There's of course the classical twisted pair cable, but Ethernet also works over coaxial cable and optical fiber. Each of these physical layers are managed by different standards, which are collectively maintained by [https://en.wikipedia.org/wiki/IEEE_802.3].

Ethernet itself belongs to a bigger group of IEEE 802 technologies, which all support a link-layer interface called medium access control (MAC). MAC abstracts away some of the details of the physical layer, and allows devices on the same local area network (LAN) to communicate. Each device (specifically, each interface) has a unique 48-bit identifier, known as a MAC address. These addresses are allocated by IEEE to manufacturers of network equipment, though some mobile devices simply generate a random MAC address for privacy reasons. We'll see how MAC addresses are used in routing later.

Beyond Ethernet, there are no shortage of other physical and link-layer technologies. Here are some prolific examples:

There are also plenty of obscure and dead LAN technologies. One example is Token Ring, IBM's Ethernet competitor, which was famously vanquished around the turn of the century.

Switches §

ethernet switch
An Ethernet switch. Image by Simon A. Eugster / CC-BY

Devices called network switches allow communication between nodes on the same LAN. A switch is basically an embedded device with a large number of Ethernet ports. Because all devices on the network communicate with each other by talking through the switch, a star topology is formed.

star topology diagram
In case you can't tell, I'm not exactly the best at making diagrams.

Ethernet frames are fairly simple in structure. As expected, each frame contains the source MAC address and destination MAC address (along with other fields such as a checksum) before the payload, so that the frame can be routed to the correct recipient.

When computer A wants to send a message to computer B, it sends a frame to the switch with computer B's MAC address. However, the switch doesn't know which physical port the connection to computer B is located on, so it relays the frame to all connected computers, hoping that one of them is the intended recipient. When computer B receives the frame, it might reply with another frame. That frame would contain computer B's MAC address as the source, allowing the switch to internally associate the connection it received the frame on with computer B. In the future, when the switch receives a frame with computer B as its destination, it can simply relay the frame on the correct link instead of flooding.

Switches are one of the most ubiquitous building blocks of computer networks, so it's no surprise that they are all over the place. If your router has more than one Ethernet port, chances are it has a built-in switch.

WiFi §

Ethernet is the most common standard for wired communications, but most people access their network wirelessly when at home. So let's talk about WiFi, which is standardized by IEEE 802.11. The biggest difference in wireless networking is that for the purposes of a residential LAN, point-to-point links don't exist. Instead, devices communicate with the LAN by broadcasting on a predetermined frequency. All devices on the LAN receive and decode incoming frames, and simply filter out the ones which aren't intended for them.

However, this comes with the pitfall that only one device can transmit at a given time per channel. If you've ever heard a faint broadcast from a different station while listening to the radio, you've experienced this effect; however, unlike humans, computers cannot distinguish two simultaneous transmissions, so algorithms like Carrier-Sense Multiple Access with Collision Avoidance are necessary. WiFi can't use CSMA/CD because CSMA/CD relies on transmitters being able to listen for a colliding signal while transmitting; that's easy for wired network devices, which transmit and receive on physically isolated wires, but WiFi transceivers have to stop receiving while transmitting, or they'd only be able to hear their own signal. Thus, collision avoidance is used, where transceivers wait until the network has been quiet for some time before starting to transmit. Optionally, channel access can be explicitly mediated through messages that indicate when a node intends to begin transmitting, a scheme called RTS/CTS.

We established that in the case of wired networks, your router is directly responsible for routing frames between physical links, but that's unnecessary in the case of WiFi. In fact, any device with a WiFi transceiver can broadcast its own network; Windows even natively supports this feature. Instead, your router plays the role of a wireless access point, aptly abbreviated to WAP. (In light of "WAP" becoming rather vulgar in recent years, I will use the more succinct abbreviation of just AP for the rest of this post.) The job of an AP is simple; it just serves as an interface between the wireless network and the gateway, which has access to the public Internet. Speaking of which...

Internet Layer §

We've finally reached the Internet layer of the Internet. This is where all the action happens! Buckle up, and let's explore how it works.

Relatively few protocols live in the Internet layer itself. Its primary inhabitants are the Internet Protocol (IP) and Internet Control Message Protocol (ICMP). There are two currently deployed versions of IP, IPv4 and IPv6, which are generally very similar but vary in subtle yet important ways.

In IP, every network interface (usually just one per computer) is associated with an IP address. In IPv4, this address is 32-bits long, and usually written as a series of four numbers (each corresponding to a byte or octet of the IP address) and separated by periods. For example, the IP of this blog at the time of writing is 142.93.26.121. IP addresses are managed by the Internet Assigned Numbers Authority, which assigns blocks of IP addresses to the five Regional Internet Registries. The RIRs, in turn, deal with requests from individuals and businesses for IP allocations.

IPv4 Exhaustion §

Because IPv4 addresses are only 32 bits long, there can only be 232 = ~4 billion unique IPv4 addresses. That seems like a lot, but as early as the 90s the threat of running out of IPv4 addresses has continually loomed over the Internet, made worse by the fact that many parts of IPv4 space are reserved for various purposes. To fix this issue, IPv6 was created. IPv6 addresses are 128 bits long, which is more than enough to serve humanity's needs at the moment. (If all IPv6 addresses were distributed equally among all living humans, every individual could have 47 octillion addresses. It's safe to say that we won't be running out any time soon, especially because sadly a lot of the Internet is still stuck on IPv4.)

xkcd internet map
xkcd 195: a map of IPv4 space circa 2006. Things have only gotten more crowded since then.

Packets §

One of the key innovations that made the Internet possible was packet switching, the idea of transferring digital data throughout a computer network by splitting it into chunks called packets and allowing intermediate nodes (routers) to decide where to send the packet next. Thanks to its ubiquity, packet switching probably seems incredibly mundane and obvious to most programmers today, but at the time (when the predominant paradigm for telecommunications was circuit switching), packet switching was revolutionary. Compared to circuit switching, packet switching allows for much more diverse network topologies, and does not require all the packets in a single data stream to travel to the recipient via the same route, making Internet routing incredibly dynamic. All of these attributes have helped the Internet scale to billions of connected devices.

An IPv4 packet consists of two parts: a header, which provides information about the packet itself, and the data contained within the packet. Some of the info contained within the header includes:

So how exactly do IP packets find their way to their final destination? That's a big question. Let's figure it out.

Obtaining an IP address §

In order to send or receive IP traffic, you first need an IP address. Your computer could obtain one through several methods:

Let's talk about DHCP. DHCP is built on top of UDP, which in turn resides above IP. It allows newly-connected clients to contact the local DHCP server and obtain an IP address. Right off the bat, there are two challenges:

To solve the first problem, clients use a source IP of 0.0.0.0 in their initial DHCP requests. The DHCP server distinguishes incoming requests not by source IP but by MAC address. To discover the DHCP server, all DHCP requests are directed to a special IPv4 address, 255.255.255.255, which broadcasts the packets to all nodes on the network. By seeing which IP replies, a device can figure out which IP the DHCP server is running on.

Okay, so we've obtained an IP address. Now what?

diagram of an IP LAN

Suppose computer 10.0.0.2 wants to send a message to 10.0.0.3. In order to actually deliver a message to 10.0.0.3, our sender needs to know which MAC address to send packets to. It can obtain this information via the Address Resolution Protocol (ARP).

Address Resolution Protocol §

ARP is a protocol that enables the resolution of IP addresses to MAC addresses on a network. It operates on the link-layer. When a computer needs to determine the MAC address given an IP address, it broadcasts an ARP request to the local network. This is done by sending frames addressed to a MAC address of FF:FF:FF:FF:FF:FF, which signals to the switch that the message should be relayed to all connected devices. The device which has the corresponding IP responds to the request with its MAC and IP address. Both of these devices may cache each others' IP and MAC addresses to avoid making another ARP request in the future.

There also exists a second method for the discovery of IP-to-MAC mappings. Under certain circumstances, such as when a device joins a network or obtains a new IP address, it may publish an ARP announcement that prompts all other devices to update their ARP caches.

ARP is ubiquitous among IEEE 802 networks. If your computer is on a WiFi or Ethernet-based network, you can install Wireshark and observe ARP requests happening right before your eyes.

arp requests on my LAN, as seen by wireshark

Some ARP requests seen on my local network.

Your computer doesn't make an ARP request for every single outgoing IPv4 connection. If configured correctly, your computer should be able to tell which addresses belong to the local network (i.e. they can be reached by MAC address) and which reside on the public internet.

Introduction to CIDR §

During the Internet's infancy, the format of IP addresses was much simpler. As seen in RFC 760, published in January 1980, the first 8 bits of each IP address identified which network it originated from, and the remaining 24 bits were unique to each host. This limited the Internet to just 256 networks (technically 254 since addresses beginning with 0 are reserved for local use and addresses beginning with 255 are used for broadcast). It soon become apparent that the Internet would grow to encompass much more than just 256 networks, so the classful network scheme was adopted.

Under classful networking, each network belongs to one of three classes:

(The actual scheme had some more details which are omitted here.)

Eventually, engineers realized that the classful networking scheme was still rather inefficient; the size difference between classes was far too granular, and many IP addresses were going to waste. So classful networks were done away with, and a new system, Classless Inter-Domain Routing (CIDR), was adopted.

Under CIDR, the number of network/host bits is variable, instead of being fixed at 8, 16, or 24. This allowed IPs to be allocated with much less overhead. In CIDR notation, the length of the network prefix is written after the IP and separated with a slash. For example, the IP of this server at the time of writing is 142.93.26.121. It belongs to a bigger subnet, 142.93.16.0/20, which encompasses all IP addresses whose first 20 bits match those of 142.93.16.0. The length of this network's prefix is 20 bits, leaving 12 bits for hosts on the subnet, which gives it a maximum capacity of 212 = 4096 hosts.

Complementary to CIDR is the idea of a netmask. For a given classless network, its netmask is a special 32-bit value where all the prefix bits are set to 1 while all the host bits are cleared. The routing prefix of an address can be obtained by finding the bitwise AND of the address and the netmask. If the routing prefix doesn't match that of the local network, your computer won't perform an ARP lookup, since only machines on the same LAN can be contacted via MAC address.

Going Public §

So what happens when your computer encounters an IP that isn't on the LAN? Thankfully for your poor computer, handling the routing of this packet across the public Internet is mostly outside of its responsibilities. Your computer has a default route, which specifies who to contact to relay packets outside the local network. That device is called the default gateway, since it serves as a gateway to the rest of the world.

Your router itself has a default gateway, too. In fact, your router's view of the internet isn't really different from that of your computer's: it too is just part of a bigger network, owned by your ISP instead of you. Other than that, pretty much all the other details stay the same; your router has a subnet mask, it performs ARP requests, and it relays packets which aren't on its network to the default gateway. In fact, your router probably obtains its address from the ISP through DHCP as well. (That's why rebooting your router may result in your home network being assigned a new IP address. Realistically, however, most routers remember their previous IP address and simply request the same one when they boot up again.)

NAT: A Perpetual Annoyance §

Hey, remember IPv4 exhaustion? Yup. It's me again. Here to make your life slightly worse for the umpteenth time. When will you learn?

As I write this blogpost, IPv6 will have been available for an entire decade in just a couple months, yet IPv6 adoption still languishes at around 35%. As it stands, the majority of the Internet is only available over IPv4, yet IANA allocated all of its IP addresses to the RIRs in 2011; RIPE NCC, one of the five RIRs, was the last to run out of addresses in 2019. Nowadays, if you want to get your hands onto some IPv4 addresses, you'll have to turn to the IP aftermarket, where you'd be paying a price of anywhere from 40 to 120 bucks a pop.

This ever-worsening drought of IP addresses has forced the Internet's biggest players, the ISPs, to get creative. One solution that has been used to combat IPv4 exhaustion is Network Address Translation (NAT). NAT precludes the need to assign every device on a household network a unique IP address. Here's how it works.

nat diagram
A typical NAT setup.

Each device on the LAN has a private IP address. In this case, the addresses are from the 10.0.0.0/8 block, one of three blocks which are reserved for use within private networks. These are not directly usable for accessing the internet; most networks are configured to reject packets originating from outside the network with a private IP. Instead, when the NAT router receives a packet from a device that is destined for the public Internet, it overwrites the original address with the router's public address before relaying it. This way, all the devices on the LAN can access the Internet through just one address.

But wait, you might say. When the router receives an incoming packet, how does it know which device to forward it to? Ah. You've stumbled across one of NAT's biggest flaws. The protocol that is being NATed must provide some way to keep track of which connection is which. For TCP, arguably the most common protocol on the Internet, this is done through the port number. For instance, if 10.0.0.1 makes a request to public IP 100.2.3.4, the router will overwrite both the source address in the IP header and the source port in the TCP header with an arbitrary, unique port. It will remember this port information and, when 100.2.3.4 sends a TCP response, the router can forward the packets to the appropriate device on the local network by reading which port they are addressed to.

The fact that the router must remember which port is associated with which local connection makes NAT stateful, which stands directly at odds with IP's stateless nature. Furthermore, NAT also violates the end-to-end principle, which states that protocol logic should only be handled at each end of the network route; intermediate nodes shouldn't need to know anything about the data they are relaying. Even worse, there are some protocols that simply don't work with NAT.

Finally, while NAT works reasonably well for consumers, it is for the most part fundamentally incompatible with server hosting, which requires listening on a specific well-defined port that often can't be changed. If you've ever tried to host a Minecraft server out of your home network, you're probably familiar with this struggle. This hasn't stopped Google from offering cloud NAT for serving multiple unrelated services on different ports.

NAT doesn't have to stop at the residential level, though. Under carrier-grade NAT (CGNAT), entire WANs are encapsulated under address translation, further reducing IPv4 address usage. IANA allocated the 100.64.0.0/10 block for use with CGNAT in 2012.

Where Does Internet Actually Come From? §

We've established that your router doesn't really do anything special, it just serves as a gateway between your LAN and your ISP's larger network. So how does your ISP get on the Internet? This is where things really start getting interesting.

So far, our trusty little packet has faced some pretty tough trials. He was disassembled and reassembled when he crossed the NAT firewall, and now he's being bounced around the ISP's internal network. How does he find his way to the Promised Land, the destination IP?

Routing within the ISP network may be driven by any one of several protocols. Intermediate System to Intermediate System (IS-IS) and Open Shortest Path First (OSPF) are often used for internal routing, from what I gather, but information on the topology of ISP networks and how they manage their routes is scarce as expected. Both IS-IS and OSPF are link state routing protocols, meaning that routers broadcast their links to the internal network, allowing other routers to build a map of the network and calculate the optimal route to each node using an algorithm such as Dijkstra's algorithm.

But what if the destination isn't in the ISP's network?

Autonomous Systems §

At the very top level, the Internet is organized into autonomous systems. Generally, all the IPs within an AS are controlled by one organization (usually a corporation) and can reach each other without leaving the AS. The AS is defined by the list of routing prefixes that belong to the AS, and each AS is assigned a number (an ASN) by our good friend IANA.

When you look at the Internet at the AS level, the illusion is truly stripped bare, and you can see the Internet for what it is: a bunch of servers, organized into ASes, with connections running between them.

Routing between ASes is universally done via the Border Gateway Protocol, which is a path vector routing protocol as opposed to a link state routing protocol. Instead of broadcasting the existence of links throughout the network, BGP routers advertise which ASes they can reach and the path to take to their peers. This has the benefit of preventing the possibility of a loop, which would pose a serious problem at a global scale. In general, BGP is designed to reduce volatility in Internet routing. To reduce network usage and routing table size, BGP routers may use heuristics to selectively reject routes, a process known as route filtering. (The side-effect of route filtering is that no router on the Internet has a complete view of all routes, so collecting statistics on the routing table requires careful observation from numerous viewpoints.)

interaction between internal and external networks
Diagram of the various routing protocols found on the Internet. Internal links are highighted in red; border routers are highlighted in green.

The BGP routing itself looks just like the routing table you'd find on your router or computer, just much bigger (around 900,000 unique prefixes at the time of writing). Its functionality remains unchanged, though. When a border router receives a packet:

Not every AS is connected with each other; often, Internet packets will travel through networks belonging to several ASes until they reach their destination. You can see this for yourself using the traceroute utility. Many Internet infrastructure operators also run public Looking Glass servers, which allow users to run traceroutes or retrieve BGP route information from their backbone servers. In particular, SdV operates a graphical traceroute tool which lets you visualize all the routes to a certain IP available in their border gateways' routing tables. Here's an example output for 8.8.8.8, which is Google's public DNS project.

bgp map to google dns

The red arrows signify the best path, the one that packets would actually follow in the network. As you can see, there is more than one way to reach the destination; Google is part of France-IX, which SdV is also a member of. The purple rectangles represent border routers within SdV's own network; they aren't shown for other networks since they are normally obscured from public view.

The Politics of Peering §

In the words of Kenneth Finnegan, regarding BGP peering:

This is obviously where human networking becomes exceedingly important in computer networking.

Broadly speaking, inter-AS connections can be categorized into two types:

Note that peering and transit are the same thing from a technical standpoint. Also, BGP peering simply refers to the existence of a connection between two ASes. It doesn't necessarily mean that no settlement is involved.

Another staple of backbone routing is the Internet Exchange Point (IXP). IXPs are essentially a series of interconnected switches, usually managed by a non-profit organization, that allow many ASes to peer with each other without a huge number of cross-connections.

internet exchange cables
One of many switches which make up AMS-IX, one fo the world's largest Internet exchanges. Photo by Fabienne Serriere / CC BY-SA.

IXPs are a pretty great thing for a number of reasons. They provide a way to peer with numerous other ASes, reducing latency for everyone, while avoiding the large number of physical connections that would be necessary to reach the same level of connectedness that would be necessary otherwise.

peering diagram

If you are interested in internet exchanges, I encourage you to checkout the Fremont Cabal Internet Exchange, a burgeoning IXP based out of Fremont, California (which is also my hometown!) The blogpost about its creation is as informative as it is humorous.

Not All ASes Are Made Equal §

Some ASes are more connected than others. You may hear talk about so-called "tier 1" networks; generally speaking, tier 1 networks are ASes which don't need to pay for transit to reach any other AS on the Internet. They are few and far between, mostly because the Internet has grown to the point where it's very hard to reach every network without at least paying somewhere along the way.

Historically, many tier 1 networks were primarily located in the US, resulting in a lot of traffic flowing through US networks. This trend has slowly eroded, however, as Internet infrastructure becomes more advanced in other countries. Conversely, the NSA diverts American traffic offshore so that it can legally perform surveillance.

Tier 1 networks are followed by tier 2 (pay for transit to reach some ASes) and tier 3 (exclusively pay for transit).

BGP Mishaps §

BGP has a history of high-profile screwups with cascading consequences, often taking down large segments of the Internet. Just last year, Facebook suffered a total outage due to what later turned out to be a BGP misconfiguration. BGP is in fact rather vulnerable to hijacking attacks, since any AS could announce a high-priority bogus route that could make affected network prefixes unroutable. Resource Public Key Infrastructure (RPKI) was created to resolve this problem, but adoption still remains a challenge. Here are some notable BGP incidents:

BGP has faced other issues than hijacking, though. Though not a problem with the protocol itself, historically many border routers had a limit of 512K routes, so when the BGP routing table suddenly grew to above that threshold on August 12, 2014, numerous Internet companies faced outage as BGP routing slowed to a crawl.

Multicast and Anycast §

So far, we've only discussed unicast traffic, where packets are routed from one host to the next. However, IP also supports alternative routing schemes.

Anycast routing allows numerous geographically separate hosts to operate under the same IP. When you send a packet to an anycast address, it is delivered to the "closest" host as seen by the core routers along the path. This allows traffic to be automatically distributed between several servers, potentially reducing latency and increasing throughput. However, because IP is stateless, packets addressed to anycast packets are not guaranteed to arrive at the same host, which could potentially disrupt stateful protocols like TCP. As a result, anycast is most popular for scaling protocols where transactions are short-lived (and thus the odds of a connection being disrupted are low). In particular, many DNS services (such as all of the root name servers and many popular DNS resolvers) are scaled using anycast.

Multicast routing allows packets to be delivered to any hosts which have indicated that they are willing to participate in a given multicast group, managed by the Internet Group Management Protocol (IGMP). All multicast packets have destination addresses in the 224.0.0.0/4 range as reserved by IANA. Senders of multicast periodically use IGMP to query the network for participants; hosts interested in receiving multicast packets on a specific address join that group by responding to the query.

One of the most common protocols that uses multicast routing is multicast DNS (mDNS), which is essentially a variant of DNS that works over multicast. It is often used to discover devices on the LAN such as printers.

mdns traffic on my LAN

mDNS traffic on my LAN. I'm pretty sure it's from my printer.

Unlike anycast, multicast generally doesn't work outside of local networks; most providers do not route multicast traffic through their networks because it can potentially be used to amplify a denial of service attack, flooding hosts' networks with packets and preventing users from accessing their services.

IPv6 §

IPv6 is the successor of IPv4, created to solve various problems with the original Internet protocol (beyond just IPv4 exhaustion). However, it still fulfills the same role as IPv4, and as a result most protocols built on top of IP work fine with both protocols (though some modifications may be necessary to work with the longer addresses).

One of IPv6's most attractive features is its longer address length (128 bits), meaning that every device can have its own IP address instead of resorting to ugly solutions such as NAT. This restores the end-to-end principle on the Internet. In addition, IPv6 packet headers have been considerably simplified, allowing for faster processing by routers. One way this was accomplished was by removing the concept of fragmentation from IPv6; instead, the sender is responsible for discovering the maximum MTU that a given route can accept (which is guaranteed to be at least 1280 bytes).

IPv6 also supports multicast traffic; in fact, broadcast is entirely supplanted by multicast. There are special addresses used for routing packets to all nodes or routers on a local network, but these are all implemented using Multicast Listener Discover, IPv6's replacement for IGMP.

Epilogue §

This post is very much still a work in progress. As you may have noticed, the IPv6 section is rather sparse. When I can find time, I plan on writing more and revising various parts of the article (which was written in quite a hurry, as you may have noticed). Until then, check out the other posts in the series!

Further Reading / References §

Comments