The mental model we’ve built so far is two endpoints with packets bouncing between them. That’s a fine simplification for understanding addressing, routing, and transport. The reality is busier: between any two real-world hosts there’s usually a small fleet of middleboxes — firewalls, load balancers, proxies, IDS sensors, VPN gateways — each doing one specific job. And one rung up from that, the internet itself isn’t a single fabric; it’s a hierarchy of operators (IANA, regional registries, ISP tiers, IXPs, CDNs) with rules about who connects to whom.
This is lesson 10 of Networking from Scratch. The previous lessons gave you the protocol stack and the addressing scheme. This one populates that stack with the actual boxes you’ll see on a network diagram — what each one does, where it sits, and how to recognise its job from the symptoms when it misbehaves.
Hosts vs middleboxes
Strictly speaking, a host is anything with an IP address that’s the actual source or destination of traffic: your laptop, a server, a phone, an IoT thermostat. A middlebox sits between hosts and processes traffic that isn’t addressed to it. Routers and switches also sit in the path (we covered them in lesson 5), but they only forward based on addresses. In everyday usage, “middlebox” means the boxes that do more than forward: they inspect, filter, rewrite, or terminate the traffic passing through.
Five categories cover the vast majority of middleboxes you’ll encounter:
- Firewalls (filter traffic).
- Load balancers (distribute traffic).
- Proxies (relay traffic on someone else’s behalf).
- IDS / IPS / WAF (inspect for bad traffic).
- VPN gateways (terminate encrypted tunnels).
Firewalls
A firewall’s job is to allow or deny traffic based on rules. The rules can be as simple as “allow TCP port 443 inbound to this IP” or as complex as “inspect the SNI in the TLS handshake and block if the destination domain matches a category list.”
| Type | What it inspects | Example use case |
|---|---|---|
| Stateless (packet filter) | Each packet in isolation: src IP, dst IP, port, protocol | Simple ACLs on a router. Rare as a primary control today. |
| Stateful | The packet plus the connection state it belongs to | The default for any firewall in the last 25 years. Knows that a return packet of an established TCP connection is allowed automatically. |
| Next-generation (NGFW) | Connection state + L7 protocol parsing + identity + threat intel | Decrypts TLS, identifies applications regardless of port, applies user-aware policy. The standard for enterprise perimeter firewalls. |
| Host-based | Same as stateful, but on each host | Windows Defender Firewall, iptables/nftables, macOS firewall. |
The single most important thing to know about a stateful firewall: it tracks every active connection in a connection table. That table has a finite size (sometimes millions of entries, sometimes far fewer on small appliances), and entries time out if traffic stops flowing. That’s why a long-idle SSH session sometimes mysteriously dies after an hour: the firewall’s state entry expired, so your next packet matches no known connection, and because it isn’t a SYN opening a new one, the firewall drops it.
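The idle-timeout behaviour can be sketched in a few lines. This is a minimal model, not how any particular firewall is implemented: the class name, the one-hour timeout, and the table size are assumptions for illustration, and it assumes the rule set already permits the new connection.

```python
from dataclasses import dataclass, field

# Hypothetical flow key: (src_ip, src_port, dst_ip, dst_port).
FlowKey = tuple[str, int, str, int]


@dataclass
class ConnectionTable:
    idle_timeout: float = 3600.0   # seconds; assumed value, real defaults vary
    max_entries: int = 1_000_000   # the table is finite
    _last_seen: dict = field(default_factory=dict, init=False)

    def _expire(self, now: float) -> None:
        # Drop every entry that has been idle longer than the timeout.
        stale = [k for k, t in self._last_seen.items() if now - t > self.idle_timeout]
        for k in stale:
            del self._last_seen[k]

    def handle(self, key: FlowKey, is_syn: bool, now: float) -> str:
        """Return 'allow' or 'drop' for one TCP packet seen at time `now`."""
        self._expire(now)
        reverse = (key[2], key[3], key[0], key[1])
        for k in (key, reverse):
            if k in self._last_seen:
                self._last_seen[k] = now   # traffic refreshes the idle timer
                return "allow"
        # No state: only a SYN may open a new entry, and only if there is room.
        if is_syn and len(self._last_seen) < self.max_entries:
            self._last_seen[key] = now
            return "allow"
        return "drop"
```

In this model, a packet that arrives after more than `idle_timeout` seconds of silence finds no entry and is dropped unless it is a SYN, which is the dead-SSH-session symptom described above.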
Where firewalls sit
- At the perimeter — between the internet and your office or data centre. Catches inbound attacks and unauthorised outbound.
- Between segments inside the network — the “internal firewall.” Limits east-west traffic so a compromised laptop can’t pivot freely.
- On each host — last line of defence; useful for laptops on untrusted networks.
- In front of cloud workloads — security groups, network ACLs, cloud-native firewall services.
Load balancers
A load balancer takes incoming connections and spreads them across a pool of backend servers. Two reasons we use them:
- Capacity. One server can’t handle the traffic; ten can.
- Availability. If one backend dies, the load balancer routes around it. Users don’t see the outage.
The two kinds you’ll see:
| Layer | What it inspects | Pros | Cons |
|---|---|---|---|
| L4 (transport) | TCP / UDP headers | Fast (no payload parsing). Works for any protocol. | Doesn’t see HTTP paths, headers, or cookies, so can’t make path-based routing decisions. |
| L7 (application) | HTTP requests, often TLS-terminated | Path-based routing, header rewriting, sticky sessions, request-level health checks. | Slower (parses every request), and your traffic must be a protocol the LB understands. |
Concrete examples: AWS NLB and HAProxy in TCP mode are L4. AWS ALB, Nginx, HAProxy in HTTP mode, Cloudflare, and Traefik are L7. Many modern deployments stack them: an L4 load balancer accepts the raw TCP connection and forwards to a fleet of L7 reverse proxies that do the smart routing. That gives you both performance and feature richness.
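To make the L4 versus L7 difference concrete, here is a small sketch of what each layer has available when choosing a pool. The function names, pool names, and route tables are hypothetical; real load balancers have much richer matching rules.

```python
from typing import Optional


def route_l4(dst_port: int, listeners: dict[int, str]) -> Optional[str]:
    """L4 view: only addresses and ports are visible, so the port decides."""
    return listeners.get(dst_port)


def route_l7(path: str, routes: dict[str, str], default_pool: str) -> str:
    """L7 view: the parsed HTTP path is visible; the longest matching prefix wins."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return default_pool
    return routes[max(matches, key=len)]


# Hypothetical configuration for illustration.
listeners = {443: "https-pool", 5432: "postgres-pool"}
routes = {"/api/": "api-pool", "/api/v2/": "api-v2-pool", "/static/": "static-pool"}
```

With these tables, `route_l4` can only distinguish traffic by port, while `route_l7` can send `/api/v2/users` and `/static/logo.png` to different pools even though both arrive on port 443.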
Common load-balancing algorithms
- Round-robin — pick the next backend in sequence. Simple, no awareness of load.
- Least connections — pick whichever backend has the fewest open connections. Adapts to slow backends.
- IP hash / consistent hashing — the same client always goes to the same backend. Useful when sessions are stored on the backend.
- Weighted — backends get a weight; bigger ones get more traffic. Useful for mixed-capacity fleets.
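The four algorithms above can be sketched as follows. All names are hypothetical, and two simplifications are worth flagging: `ip_hash` uses plain modulo hashing (not true consistent hashing, so adding a backend remaps most clients), and the weighted variant simply repeats each backend according to its weight.

```python
import hashlib
import itertools


class RoundRobin:
    def __init__(self, backends: list[str]):
        self._cycle = itertools.cycle(backends)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    def __init__(self, backends: list[str]):
        self.open = {b: 0 for b in backends}   # open connections per backend

    def pick(self) -> str:
        backend = min(self.open, key=self.open.get)   # ties go to the first listed
        self.open[backend] += 1
        return backend

    def release(self, backend: str) -> None:
        self.open[backend] -= 1


def ip_hash(client_ip: str, backends: list[str]) -> str:
    # Plain modulo hashing: stable only while the backend list is unchanged.
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def weighted_round_robin(weights: dict[str, int]):
    # A backend with weight 3 appears three times in each cycle.
    expanded = [b for b, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)
```

The least-connections picker needs `release` to be called when a connection closes; that bookkeeping is what lets it steer new traffic away from slow backends.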
Whatever the algorithm, every load balancer also runs health checks: periodic probes against each backend. A backend that fails too many checks gets removed from rotation; a backend that recovers is re-added. The health check definition matters: a check that is too shallow (just a TCP connect) can leave you sending traffic to a process that’s up but not actually serving requests.
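A rough sketch of the health-check logic, under assumed thresholds: three consecutive failures remove a backend and two consecutive successes re-add it. The class and function names are hypothetical. The two probes illustrate the shallow-versus-deep difference: the TCP probe only proves that something accepts connections, while the HTTP probe requires an actual 200 response.

```python
import http.client
import socket
import urllib.request


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Shallow check: passes as soon as the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Deeper check: passes only if the application answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        return False


class HealthTracker:
    def __init__(self, fall: int = 3, rise: int = 2):
        self.fall, self.rise = fall, rise   # assumed thresholds
        self.healthy = True
        self._fails = 0
        self._passes = 0

    def record(self, ok: bool) -> bool:
        """Feed in one probe result; return whether the backend is in rotation."""
        if ok:
            self._fails = 0
            self._passes += 1
            if not self.healthy and self._passes >= self.rise:
                self.healthy = True
        else:
            self._passes = 0
            self._fails += 1
            if self.healthy and self._fails >= self.fall:
                self.healthy = False
        return self.healthy
```

Requiring several consecutive results in each direction keeps a single dropped probe from flapping a backend in and out of rotation.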
Forward proxy vs reverse proxy
A proxy is a middlebox that relays connections on someone else’s behalf. Two flavours, distinguished by who they’re working for:
| | Forward proxy | Reverse proxy |
|---|---|---|
| Works for | The client | The server |
| Sits between | Clients and the wider internet | The internet and your servers |
| Typical use | Outbound web filtering, content caching, anonymity, content scanning | TLS termination, caching, load balancing, WAF integration, hiding internal topology |
| Real examples | Squid in a corporate egress, anonymizing services, ad-blocking gateways | Nginx, Apache, Cloudflare, AWS CloudFront, fronting any modern web app |
Same technology, opposite direction of trust. A reverse proxy is the most common middlebox you’ll set up if you run a web service: it terminates TLS, caches static assets, applies security rules, and forwards what’s left to your application servers. A forward proxy is what you set up if you want to control or observe what’s leaving your network.
IDS, IPS, and WAF
Three security middleboxes that look similar from a distance and have important practical differences.
| Type | Position | Action | Operates on |
|---|---|---|---|
| IDS (Intrusion Detection System) | Out of band — receives mirrored traffic | Alerts only | Any protocol |
| IPS (Intrusion Prevention System) | Inline — traffic flows through it | Alerts and drops malicious packets | Any protocol |
| WAF (Web Application Firewall) | In front of a web app, often inline with the reverse proxy | Allows / blocks / challenges per HTTP request | HTTP / HTTPS only |
The trade-off between IDS and IPS is false-positive risk versus the ability to stop attacks. IDS is safer (it can’t accidentally block legitimate traffic), but it can’t actually stop an ongoing attack. IPS can stop attacks, but a bad rule can take production down. A common pattern is to run an IPS at the perimeter and an IDS deeper inside, where wrongly blocking legitimate traffic would be catastrophic.
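The difference comes down to what happens after a rule matches. The sketch below uses the same detection logic in both modes and changes only the action; matching signatures as raw byte substrings is a deliberate simplification, and the function name and signature list are hypothetical.

```python
def inspect(packet: bytes, signatures: list[bytes], inline: bool) -> tuple[bool, list[str]]:
    """Return (forwarded, alerts) for one packet.

    inline=False models an IDS: it sees a mirrored copy and can only alert.
    inline=True models an IPS: traffic flows through it, so a match is dropped.
    """
    alerts = [f"matched {sig!r}" for sig in signatures if sig in packet]
    forwarded = not (inline and alerts)
    return forwarded, alerts


# Hypothetical signature for illustration.
SIGNATURES = [b"/etc/passwd"]
```

A bad signature in IDS mode produces noisy alerts; the same bad signature in IPS mode silently drops legitimate traffic, which is the risk described above.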
WAFs are the modern web answer to “there’s a known SQL injection in the app, the dev team can’t patch until next sprint.” You write a WAF rule that recognises the attack pattern, and the WAF blocks the request before it reaches the application. Cloud WAFs (Cloudflare, AWS WAF, Azure Front Door) have made this almost commodity.
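As a rough illustration of such a rule, the sketch below decodes a query string and blocks it if it matches a crude SQL-injection pattern. The pattern and function name are assumptions made up for this example; production WAF rule sets are far more thorough, and a regex this simple will produce both false positives and misses.

```python
import re
from urllib.parse import unquote_plus

# Crude, illustrative pattern: "UNION ... SELECT" or a quoted "OR 1=1"-style tautology.
SQLI_PATTERN = re.compile(
    r"\bunion\b.+\bselect\b|'\s*or\s+'?\d+'?\s*=\s*'?\d+",
    re.IGNORECASE,
)


def waf_decision(query_string: str) -> str:
    """Return 'block' or 'allow' for one request's raw query string."""
    decoded = unquote_plus(query_string)   # inspect what the app will actually see
    return "block" if SQLI_PATTERN.search(decoded) else "allow"
```

Decoding before matching matters: attackers routinely URL-encode payloads, so a rule that inspects only the raw bytes is easy to bypass.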
VPN gateways
A VPN gateway terminates encrypted tunnels from somewhere else — remote workers, branch offices, or partner networks — into your network. The gateway authenticates the remote end, decrypts the tunnel, and routes the unwrapped traffic onto your internal network as if the remote endpoint had been there all along.
The two flavours:
- Remote-access VPN — for individual users with a VPN client (OpenVPN, WireGuard, IPsec, Tailscale, etc.). Each user gets an internal IP from a pool when they connect.
- Site-to-site VPN — for connecting two networks. The two gateways negotiate a tunnel, and traffic from one network destined for the other is silently encrypted, sent through the tunnel, and decrypted on the far side. The hosts on each side don’t know there’s a VPN involved.
From a network-design point of view, the VPN gateway is just another routed interface; from a security point of view, it’s a critical perimeter device that’s a high-value target for attackers.
Where these middleboxes sit on a real network
A typical request to a web service hits these in order:
laptop - your client
|
| (internet)
v
edge firewall - drops anything that isn't a legitimate inbound flow
|
v
load balancer - picks one of N reverse proxies
|
v
reverse proxy - terminates TLS, applies WAF rules, caches static content
|
v
app server - the actual code, finally
|
v
database - on a different internal segment, behind another firewall
That’s a deliberately middle-of-the-road shape. Real networks combine boxes (an L7 load balancer that includes WAF, a reverse proxy that does TLS termination and load balancing, etc.) and add detail (caching tier, message queue, identity proxy). The principle holds: each box has one job, packets pass through them in a defined order, and the diagram is always finite.
The internet hierarchy: who owns what
Zoom out from any single network and the internet itself has structure. Five kinds of entity matter for understanding why your packets reach Google in 25 ms but Australia in 200 ms.
IANA — the registry
The Internet Assigned Numbers Authority maintains the global registries: IP address blocks, AS numbers, port numbers, protocol numbers, DNS root zone. IANA itself doesn’t hand out individual IP addresses to end users; it allocates large blocks to the regional registries.
The five RIRs
Regional Internet Registries hand out IP blocks within their geographic area:
| RIR | Region |
|---|---|
| ARIN | US, Canada, parts of the Caribbean |
| RIPE NCC | Europe, Middle East, parts of Central Asia |
| APNIC | Asia-Pacific |
| LACNIC | Latin America and the Caribbean |
| AFRINIC | Africa |
If you ever need to look up who owns an IP, the answer comes from one of these (via whois). Each RIR allocates blocks to ISPs, who allocate down to enterprises, who use them on hosts.
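A whois lookup is simple enough to sketch by hand: the protocol is a TCP connection to port 43, one query line terminated by CRLF, and a plain-text reply. The helper names below are hypothetical, and `find_referral` assumes the reply contains a `refer:` line naming the next server to ask; treat that field name as an assumption about the response format rather than a guarantee.

```python
import socket
from typing import Optional


def whois_query(server: str, query: str, timeout: float = 10.0) -> str:
    """Send one WHOIS query (TCP port 43, query terminated by CRLF) and return the reply."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while data := sock.recv(4096):   # the server closes the connection when done
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def find_referral(response: str) -> Optional[str]:
    """Return the value of the first 'refer:' line, if the reply has one (assumed format)."""
    for line in response.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "refer":
            return value.strip()
    return None
```

Under that assumption, `find_referral(whois_query("whois.iana.org", some_ip))` would name the registry's whois server, which you would then query for the detailed record.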
ISP tiers
Internet service providers don’t all have the same role. The hierarchy:
| Tier | What they do | Examples |
|---|---|---|
| Tier 1 | The backbone. They peer with each other for free; they don’t pay anyone for transit. Reach the entire internet through their own network and peering relationships. | AT&T, NTT, Lumen (CenturyLink), Arelion (formerly Telia Carrier), GTT, Tata, Telefónica |
| Tier 2 | Peer locally for free with other regional networks; buy transit from Tier 1 to reach the rest. The big regional ISPs. | Comcast, BT, Vodafone, Cox, Bell Canada, KPN |
| Tier 3 | Buy transit from Tier 1 or Tier 2 for everything. They don’t typically peer; they pay. Last-mile to homes and small businesses. | Most local cable companies, residential ISPs, regional fiber providers |
When you trace a route from your home to a server far away, you’re seeing this hierarchy. In the textbook case, your packet goes from your Tier-3 ISP up to the Tier-2 it buys transit from, then to a Tier-1 backbone, across that backbone, and back down through another Tier-2 and Tier-3 to the destination.
Internet exchange points (IXPs)
An IXP is a physical facility where lots of networks meet to exchange traffic directly with each other instead of paying upstream providers to carry it. Major ones (DE-CIX in Frankfurt, AMS-IX in Amsterdam, LINX in London, Equinix Ashburn, Equinix San Jose) carry terabits of traffic per second. The economics are simple: if Comcast and Netflix both have a presence at the same IXP, they can swap traffic for the cost of a cross-connect cable instead of paying a Tier-1 to carry it, and the latency drops because the physical path is shorter.
Content delivery networks (CDNs)
CDNs (Cloudflare, Akamai, Fastly, AWS CloudFront, Google’s edge) cache content close to users so responses are fast even if the origin is on the other side of the planet. From a client’s perspective, when you request a file from a CDN-fronted site, you get it from a nearby cache, often within the same metro area, not from the origin server. Much of the modern web rides on CDNs; if you’ve ever wondered how the same video feels “local” whether you watch from Toronto or Tokyo, this is the answer.
How to recognise what’s in the path
You usually can’t see middleboxes from the outside — that’s often the point. But there are tells.
| Symptom | Likely middlebox involved |
|---|---|
| Connections work for a while then drop after long idle | Stateful firewall’s connection table aged out the entry |
| Some HTTP paths are blocked but others work | L7 firewall, WAF, or reverse proxy doing path-based routing |
| You get a different server response from different geographic locations | CDN serving from local PoPs |
| TLS certificate is for a hostname different from the one in the URL | Reverse proxy doing SNI-based routing or TLS interception |
| Latency suddenly jumps in the middle of a traceroute | You crossed a transit boundary — e.g., your ISP handed off to a Tier-1 |
| You can reach the destination by IP but not by name | DNS / split-horizon / hosts file (lesson 8) |
| You can resolve but not connect | Firewall denying the port, or routing missing |
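The last two rows of the table can be told apart mechanically by separating the name lookup from the connection attempt. The sketch below is one way to do that; the function name and result labels are made up for this example. As a rule of thumb, a timeout suggests something is silently dropping packets, while a refusal means an answer came back saying nothing is listening (or a firewall actively rejected the connection).

```python
import socket


def diagnose(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP reachability problem as a resolve failure or a connect failure."""
    try:
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "resolve_failed"        # look at DNS, split-horizon, hosts file

    result = "connect_failed"
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return "ok"
        except socket.timeout:
            result = "connect_timeout"   # often a firewall dropping silently
        except ConnectionRefusedError:
            result = "connect_refused"   # reachable, but port closed or rejected
        except OSError:
            result = "connect_failed"    # e.g. no route to host
    return result
```

Calling `diagnose("example.com", 443)` from a machine with a problem gives a first hint about whether to look at name resolution or at firewalls and routing.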
What you can now answer
- What’s the difference between a stateful and stateless firewall? — Stateful tracks connections; stateless looks at each packet in isolation.
- What does a load balancer at L4 vs L7 actually see? — L4 sees TCP/UDP only; L7 parses application protocol (HTTP) and can route on path or header.
- Forward proxy vs reverse proxy? — Forward works for clients; reverse works for servers. Same protocol, opposite trust direction.
- IDS vs IPS? — IDS alerts only; IPS blocks. Trade-off is false-positive risk vs ability to stop attacks.
- What are ISP tiers? — Tier 1 (backbone, no upstream), Tier 2 (regional, buys some transit), Tier 3 (last-mile, buys most/all transit).
- What’s an IXP? — A physical place where networks peer directly to swap traffic without paying upstream providers.
What’s next
You now have the “what kinds of boxes are out there” map. The next lesson takes that map and spreads it onto the cloud: cloud networking 101 for on-prem admins — what a VPC actually is, the cloud’s twist on subnetting, peering and transit gateways, public/private trade-offs, and the “pets vs cattle” mindset shift that changes how you think about IP addresses when servers are disposable.