What Is Actually Inside a Network: Firewalls, Load Balancers, and the Internet Hierarchy

The mental model we’ve built so far is two endpoints with packets bouncing between them. That’s a fine simplification for understanding addressing, routing, and transport. The reality is busier: between any two real-world hosts there’s usually a small fleet of middleboxes — firewalls, load balancers, proxies, IDS sensors, VPN gateways — each doing one specific job. And one rung up from that, the internet itself isn’t a single fabric; it’s a hierarchy of operators (IANA, regional registries, ISP tiers, IXPs, CDNs) with rules about who connects to whom.

This is lesson 10 of Networking from Scratch. The previous lessons gave you the protocol stack and the addressing scheme. This one populates that stack with the actual boxes you’ll see on a network diagram — what each one does, where it sits, and how to recognise its job from the symptoms when it misbehaves.

Hosts vs middleboxes

Strictly speaking, a host is anything with an IP address that’s the actual source or destination of traffic — your laptop, a server, a phone, an IoT thermostat. A middlebox sits between hosts and processes traffic that isn’t meant for it. Routers and switches are technically middleboxes too (we covered them in lesson 5), but in everyday usage we mean the boxes that look at the contents of packets, not just the addresses.

Five categories cover the vast majority of middleboxes you’ll encounter:

Firewalls (filter traffic).
Load balancers (distribute traffic).
Proxies (relay traffic on someone else’s behalf).
IDS / IPS / WAF (inspect for bad traffic).
VPN gateways (terminate encrypted tunnels).

Firewalls

A firewall’s job is to allow or deny traffic based on rules. The rules can be as simple as “allow TCP port 443 inbound to this IP” or as complex as “inspect the SNI in the TLS handshake and block if the destination domain matches a category list.”

Type	What it inspects	Example use case
Stateless (packet filter)	Each packet in isolation: src IP, dst IP, port, protocol	Simple ACLs on a router. Rare as a primary control today.
Stateful	The packet plus the connection state it belongs to	The default for any firewall in the last 25 years. Knows that a return packet of an established TCP connection is allowed automatically.
Next-generation (NGFW)	Connection state + L7 protocol parsing + identity + threat intel	Decrypts TLS, identifies applications regardless of port, applies user-aware policy. The standard for enterprise perimeter firewalls.
Host-based	Same as stateful, but on each host	Windows Defender Firewall, `iptables`/`nftables`, macOS firewall.
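
To make the simplest case concrete, here is a minimal sketch of a stateless packet filter: an ordered rule list checked against each packet in isolation. The `Rule` fields, the first-match behaviour, the default-deny fallback, and the example address are assumptions of this sketch, not the behaviour of any particular product.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str               # "allow" or "deny"
    protocol: str             # "tcp", "udp", or "any"
    dst_net: str              # destination network in CIDR form
    dst_port: Optional[int]   # None matches any port


def evaluate(rules, protocol, dst_ip, dst_port, default="deny"):
    """Return the action of the first rule that matches, else the default."""
    for rule in rules:
        if rule.protocol not in ("any", protocol):
            continue
        if ip_address(dst_ip) not in ip_network(rule.dst_net):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return default


# Hypothetical policy: allow HTTPS to one server, deny everything else.
RULES = [
    Rule("allow", "tcp", "203.0.113.10/32", 443),
    Rule("deny", "any", "0.0.0.0/0", None),
]
```

With this policy, `evaluate(RULES, "tcp", "203.0.113.10", 443)` matches the first rule, while the same destination on port 22 falls through to the deny rule. Note that nothing here remembers earlier packets, which is exactly what separates it from a stateful firewall.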

The single most important thing to know about a stateful firewall: it tracks every active connection in a connection table. That table has a finite size (sometimes millions of entries, sometimes far fewer on small appliances), and entries time out if traffic stops flowing. That’s why a long-idle SSH session sometimes mysteriously dies after an hour: the firewall’s state entry expired, so your next packet matches no tracked connection, and because it isn’t a SYN opening a new one, the firewall drops it.
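
The idle-timeout behaviour can be sketched in a few lines. This is a simplified model, not how any real firewall is implemented: the class name, the one-hour default, and the table size are assumptions, and a flow is treated as a single key without tracking direction or TCP state.

```python
import time


class ConnectionTable:
    """Toy stateful-firewall table: flows expire after an idle timeout."""

    def __init__(self, idle_timeout=3600.0, max_entries=100_000,
                 clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.max_entries = max_entries
        self.clock = clock
        self._last_seen = {}  # flow tuple -> time of the last packet seen

    def _expire(self):
        """Remove flows that have been idle longer than the timeout."""
        now = self.clock()
        stale = [flow for flow, seen in self._last_seen.items()
                 if now - seen > self.idle_timeout]
        for flow in stale:
            del self._last_seen[flow]

    def handle(self, flow, is_syn):
        """Return "allow" or "drop" for one packet belonging to `flow`."""
        self._expire()
        now = self.clock()
        if flow in self._last_seen:
            self._last_seen[flow] = now   # known flow: refresh its timer
            return "allow"
        if is_syn and len(self._last_seen) < self.max_entries:
            self._last_seen[flow] = now   # new connection: create an entry
            return "allow"
        return "drop"                     # unknown flow and not a SYN, or table full
```

A packet for a flow that has aged out is dropped unless it is a SYN, which is the mechanism behind the dying SSH session. Keepalives work around it by refreshing the entry before the timer runs out.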

Where firewalls sit

At the perimeter — between the internet and your office or data centre. Catches inbound attacks and unauthorised outbound.
Between segments inside the network — the “internal firewall.” Limits east-west traffic so a compromised laptop can’t pivot freely.
On each host — last line of defence; useful for laptops on untrusted networks.
In front of cloud workloads — security groups, network ACLs, cloud-native firewall services.

Load balancers

A load balancer takes incoming connections and spreads them across a pool of backend servers. Two reasons we use them:

Capacity. One server can’t handle the traffic; ten can.
Availability. If one backend dies, the load balancer routes around it. Users don’t see the outage.

The two kinds you’ll see:

Layer	What it inspects	Pros	Cons
L4 (transport)	TCP / UDP headers	Fast (no payload parsing). Works for any protocol.	Doesn’t see HTTP paths, headers, or cookies, so can’t make path-based routing decisions.
L7 (application)	HTTP requests, often TLS-terminated	Path-based routing, header rewriting, sticky sessions, request-level health checks.	Slower (parses every request), and your traffic must be a protocol the LB understands.

Concrete examples: AWS NLB and HAProxy in TCP mode are L4. AWS ALB, Nginx, HAProxy in HTTP mode, Cloudflare, and Traefik are L7. Many modern deployments stack them: an L4 load balancer accepts the raw TCP connection and forwards to a fleet of L7 reverse proxies that do the smart routing. That gives you both performance and feature richness.
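
The difference in visibility can be illustrated with two small routing functions. This is a sketch under assumptions: the pool names are hypothetical, the L7 side only parses an HTTP/1.x request line, and a malformed request line would raise `ValueError` rather than being handled.

```python
def route_l4(dst_port, pools_by_port, default_pool):
    """An L4 balancer can only decide on header fields such as the port."""
    return pools_by_port.get(dst_port, default_pool)


def route_l7(request_line, routes, default_pool):
    """An L7 balancer can read the request and decide on the path."""
    _method, path, _version = request_line.split(" ", 2)
    # Try longer prefixes first so a more specific route wins.
    for prefix in sorted(routes, key=len, reverse=True):
        if path.startswith(prefix):
            return routes[prefix]
    return default_pool


# Hypothetical routing table for the L7 case.
ROUTES = {"/api/": "api-pool", "/static/": "static-pool"}
```

Here `route_l7("GET /api/users HTTP/1.1", ROUTES, "web-pool")` selects `api-pool`, a decision `route_l4` cannot make because both requests arrive on the same port.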

Common load-balancing algorithms

Round-robin — pick the next backend in sequence. Simple, no awareness of load.
Least connections — pick whichever backend has the fewest open connections. Adapts to slow backends.
IP hash / consistent hashing — the same client always goes to the same backend. Useful when sessions are stored on the backend.
Weighted — backends get a weight; bigger ones get more traffic. Useful for mixed-capacity fleets.
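
The four algorithms above can each be sketched in a few lines. These are illustrative simplifications with hypothetical function names: the IP hash uses a plain modulo, so unlike true consistent hashing most clients move when the pool size changes, and the weighted variant simply repeats each backend by its weight.

```python
import hashlib
import itertools


def round_robin(backends):
    """Return an iterator that yields backends in a repeating sequence."""
    return itertools.cycle(backends)


def least_connections(open_conns):
    """Pick the backend with the fewest open connections."""
    return min(open_conns, key=open_conns.get)


def ip_hash(client_ip, backends):
    """Map a client IP to the same backend every time (for a fixed pool)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def weighted_round_robin(weights):
    """Cycle through backends, repeating each one `weight` times."""
    expanded = [backend for backend, weight in weights.items()
                for _ in range(weight)]
    return itertools.cycle(expanded)
```

For example, `least_connections({"a": 5, "b": 2, "c": 9})` picks `b`, and `weighted_round_robin({"big": 3, "small": 1})` sends three of every four connections to `big`.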

Whatever the algorithm, every load balancer also runs health checks: periodic probes against each backend. A backend that fails too many checks gets removed from rotation; a backend that recovers is re-added. The health check definition matters: a poorly written one (just a TCP connect) can leave you sending traffic to a process that’s up but not actually serving requests.
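
The remove-and-re-add logic is usually a small state machine with thresholds, so that one flaky probe does not flap a backend in and out of rotation. A minimal sketch, assuming hypothetical `fall` and `rise` thresholds of consecutive results; the probe itself (ideally a request-level check rather than a bare TCP connect) is left out.

```python
class BackendHealth:
    """Track one backend's status from a stream of probe results."""

    def __init__(self, fall=3, rise=2):
        self.fall = fall      # consecutive failures before removal
        self.rise = rise      # consecutive successes before re-adding
        self.healthy = True
        self._streak = 0      # consecutive results contradicting current state

    def record(self, check_passed):
        """Feed in one probe result; return whether the backend is in rotation."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with current state
        else:
            self._streak += 1
            threshold = self.rise if check_passed else self.fall
            if self._streak >= threshold:
                self.healthy = check_passed
                self._streak = 0
        return self.healthy
```

With the defaults, three failed probes in a row take the backend out, and two passes in a row bring it back. An isolated failure between successes resets the streak and changes nothing.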

Forward proxy vs reverse proxy

A proxy is a middlebox that relays connections on someone else’s behalf. Two flavours, distinguished by who they’re working for:

	Forward proxy	Reverse proxy
Works for	The client	The server
Sits between	Clients and the wider internet	The internet and your servers
Typical use	Outbound web filtering, content caching, anonymity, content scanning	TLS termination, caching, load balancing, WAF integration, hiding internal topology
Real examples	Squid in a corporate egress, anonymizing services, ad-blocking gateways	Nginx, Apache, Cloudflare, AWS CloudFront, fronting any modern web app

Same technology, opposite direction of trust. A reverse proxy is the most common middlebox you’ll set up if you run a web service: it terminates TLS, caches static assets, applies security rules, and forwards what’s left to your application servers. A forward proxy is what you set up if you want to control or observe what’s leaving your network.

IDS, IPS, and WAF

Three security middleboxes that look similar from a distance and have important practical differences.

Type	Position	Action	Operates on
IDS (Intrusion Detection System)	Out of band — receives mirrored traffic	Alerts only	Any protocol
IPS (Intrusion Prevention System)	Inline — traffic flows through it	Alerts and drops malicious packets	Any protocol
WAF (Web Application Firewall)	In front of a web app, often inline with the reverse proxy	Allows / blocks / challenges per HTTP request	HTTP / HTTPS only

The trade-off between IDS and IPS is safety versus the ability to intervene. An IDS is safer (it can’t accidentally block legitimate traffic), but it can’t stop an ongoing attack. An IPS can stop attacks, but a bad rule can take production down. Most enterprises run an IPS at the perimeter and an IDS deeper inside, where blocking a false positive would be catastrophic.
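
The difference comes down to one flag: the detection logic can be identical, and only the inline device gets to drop. A toy sketch, where `is_malicious` stands in for whatever hypothetical detection rule is in use:

```python
def inspect(packets, is_malicious, inline):
    """Return (forwarded, alerts) for a stream of packets.

    inline=False models an IDS: it alerts, but every packet still arrives.
    inline=True models an IPS: flagged packets are also dropped.
    """
    forwarded, alerts = [], []
    for packet in packets:
        flagged = is_malicious(packet)
        if flagged:
            alerts.append(packet)
        if flagged and inline:
            continue  # only an inline device can withhold the packet
        forwarded.append(packet)
    return forwarded, alerts
```

A false positive in the detection rule costs an IDS one spurious alert, while the same false positive on an IPS removes a legitimate packet from the stream.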

WAFs are the modern web answer to “there’s a known SQL injection in the app, the dev team can’t patch until next sprint.” You write a WAF rule that recognises the attack pattern, and the WAF blocks the request before it reaches the application. Cloud WAFs (Cloudflare, AWS WAF, Azure Front Door) have made this almost commodity.
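
A WAF rule of that kind is, at its core, a pattern match on the decoded request. The sketch below is deliberately naive and uses a made-up pattern covering two textbook SQL injection shapes; real WAF rule sets are far more thorough, and a pattern this simple would produce both false positives and misses.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical rule: flag "UNION SELECT" and "OR 1=1" style payloads.
SQLI_PATTERN = re.compile(
    r"\bunion\b\s+\bselect\b|\bor\b\s+1\s*=\s*1",
    re.IGNORECASE,
)


def waf_decision(query_string):
    """Return "block" if the URL-decoded query string matches the rule."""
    decoded = unquote_plus(query_string)  # decode %XX escapes and '+'
    return "block" if SQLI_PATTERN.search(decoded) else "allow"
```

Decoding before matching matters: `id=1%27%20OR%201%3D1` only looks like an attack once the percent-escapes are expanded. Attackers rely on exactly that kind of encoding trick to slip past simple patterns.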

VPN gateways

A VPN gateway terminates encrypted tunnels from somewhere else — remote workers, branch offices, or partner networks — into your network. The gateway authenticates the remote end, decrypts the tunnel, and routes the unwrapped traffic onto your internal network as if the remote endpoint had been there all along.

The two flavours:

Remote-access VPN — for individual users with a VPN client (OpenVPN, WireGuard, IPsec, Tailscale, etc.). Each user gets an internal IP from a pool when they connect.
Site-to-site VPN — for connecting two networks. The two gateways negotiate a tunnel, and traffic from one network destined for the other is silently encrypted, sent through the tunnel, and decrypted on the far side. The hosts on each side don’t know there’s a VPN involved.
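
For the site-to-site case, the gateway’s decision is essentially a route lookup: if the destination falls inside the remote network, the packet goes into the tunnel. A minimal sketch, assuming a hypothetical branch network of `10.20.0.0/16` and made-up next-hop names; encryption and tunnel negotiation are out of scope.

```python
from ipaddress import ip_address, ip_network

# Hypothetical mapping of remote networks to tunnel names.
TUNNELS = {ip_network("10.20.0.0/16"): "tunnel-to-branch"}


def next_hop(dst_ip, tunnels, default="internet uplink"):
    """Choose a tunnel by longest-prefix match, else the default route."""
    addr = ip_address(dst_ip)
    matches = [(net, name) for net, name in tunnels.items() if addr in net]
    if not matches:
        return default
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

The sending host just emits a packet for `10.20.3.4` as usual; the gateway is the only device that knows the route points at a tunnel, which is why hosts on either side never notice the VPN.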

From a network-design point of view, the VPN gateway is just another routed interface; from a security point of view, it’s a critical perimeter device that’s a high-value target for attackers.

Where these middleboxes sit on a real network

A typical request to a web service hits these in order:

laptop          - your client
  |
  | (internet)
  v
edge firewall   - drops anything that isn't a legitimate inbound flow
  |
  v
load balancer   - picks one of N reverse proxies
  |
  v
reverse proxy   - terminates TLS, applies WAF rules, caches static content
  |
  v
app server      - the actual code, finally
  |
  v
database        - on a different internal segment, behind another firewall

That’s a deliberately middle-of-the-road shape. Real networks combine boxes (an L7 load balancer that includes WAF, a reverse proxy that does TLS termination and load balancing, etc.) and add detail (caching tier, message queue, identity proxy). The principle holds: each box has one job, packets pass through them in a defined order, and the diagram is always finite.

The internet hierarchy: who owns what

Zoom out from any single network and the internet itself has structure. There are five or six entities that matter for understanding why your packets reach Google in 25 ms but a server in Australia in 200 ms.

IANA — the registry

The Internet Assigned Numbers Authority maintains the global registries: IP address blocks, AS numbers, port numbers, protocol numbers, DNS root zone. IANA itself doesn’t hand out individual IP addresses to end users; it allocates large blocks to the regional registries.

The five RIRs

Regional Internet Registries hand out IP blocks within their geographic area:

RIR	Region
ARIN	US, Canada, parts of the Caribbean
RIPE NCC	Europe, Middle East, parts of Central Asia
APNIC	Asia-Pacific
LACNIC	Latin America and the Caribbean
AFRINIC	Africa

If you ever need to look up who owns an IP, the answer comes from one of these (via whois). Each RIR allocates blocks to ISPs, who allocate down to enterprises, who use them on hosts.
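
The whois protocol itself is simple enough to speak directly: open a TCP connection to port 43, send the query followed by CRLF, and read until the server closes the connection. The sketch below assumes you start at IANA’s server (`whois.iana.org`) and that its reply contains a `refer:` line naming the responsible RIR’s server; the helper names are this example’s own.

```python
import socket


def whois_query(server, query, port=43, timeout=10.0):
    """Send one whois query and return the full text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # the server closes when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def find_referral(response):
    """Return the server named on a 'refer:' line, or None if absent."""
    for line in response.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "refer":
            return value.strip() or None
    return None
```

A lookup would then be two steps: `find_referral(whois_query("whois.iana.org", ip))` to find the RIR’s server, followed by a second `whois_query` against that server for the actual allocation record. Response formats differ between registries, so treat the parsing as a starting point.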

ISP tiers

Internet service providers don’t all have the same role. The hierarchy:

Tier	What they do	Examples
Tier 1	The backbone. They peer with each other for free; they don’t pay anyone for transit. They reach the entire internet through their own network and peering relationships.	AT&T, NTT, Lumen (CenturyLink), Arelion (formerly Telia Carrier), GTT, Tata, Telefónica
Tier 2	Peer locally for free with other regional networks; buy transit from Tier 1 to reach the rest. The big regional ISPs.	Comcast, BT, Vodafone, Cox, Bell Canada, KPN
Tier 3	Buy transit from Tier 1 or Tier 2 for everything. They don’t typically peer; they pay. Last-mile to homes and small businesses.	Most local cable companies, residential ISPs, regional fiber providers

When you trace a route from your home to a server far away, you’re seeing this hierarchy: your packet goes from your Tier-3 ISP to a Tier-2 they buy from to a Tier-1 backbone, across that backbone, then back down through another Tier-2 and Tier-3 to the destination.

Internet exchange points (IXPs)

An IXP is a physical facility where lots of networks meet to exchange traffic directly with each other instead of paying upstream providers to carry it. Major ones (DE-CIX in Frankfurt, AMS-IX in Amsterdam, LINX in London, Equinix Ashburn, Equinix San Jose) carry terabits of traffic per second at peak. The economics are simple: if Comcast and Netflix both have a presence at the same IXP, they can swap traffic for the cost of a cross-connect cable instead of paying a Tier-1 to carry it, and the latency drops because the physical path is shorter.

Content delivery networks (CDNs)

CDNs (Cloudflare, Akamai, Fastly, AWS CloudFront, Google’s edge) cache content close to users so responses are fast even if the origin is on the other side of the planet. From a client’s perspective, when you request a file from a CDN-fronted site, you get it from the nearest cache, often within the same metro area, rather than from the origin server. Most of the modern web rides on CDNs; if you’ve ever wondered how the same video feels “local” whether you watch from Toronto or Tokyo, this is the answer.
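
A little arithmetic shows why proximity matters so much. Light in optical fibre travels at roughly 200,000 km/s (about two-thirds of its speed in a vacuum), which puts a hard floor under round-trip time regardless of bandwidth. The sketch below uses that approximation; the path lengths in the note are illustrative assumptions, and real paths add routing detours and queuing on top.

```python
FIBRE_SPEED_KM_PER_S = 200_000  # approximate speed of light in fibre


def min_rtt_ms(path_km):
    """Lower bound on round-trip time over a fibre path of this length."""
    one_way_seconds = path_km / FIBRE_SPEED_KM_PER_S
    return 2 * one_way_seconds * 1000
```

For a 15,000 km path to a distant origin the floor is about 150 ms, while a cache 50 km away has a floor of about 0.5 ms. No amount of server tuning at the origin can close that gap, which is why CDNs move the content instead.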

How to recognise what’s in the path

You usually can’t see middleboxes from the outside — that’s often the point. But there are tells.

Symptom	Likely middlebox involved
Connections work for a while then drop after long idle	Stateful firewall’s connection table aged out the entry
Some HTTP paths are blocked but others work	L7 firewall, WAF, or reverse proxy doing path-based routing
You get a different server response from different geographic locations	CDN serving from local PoPs
TLS certificate is for a hostname different from the one in the URL	Reverse proxy doing SNI-based routing or TLS interception
Latency suddenly jumps in the middle of a traceroute	You crossed a transit boundary — e.g., your ISP handed off to a Tier-1
You can reach the destination by IP but not by name	DNS / split-horizon / hosts file (lesson 8)
You can resolve but not connect	Firewall denying the port, or routing missing
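
The traceroute symptom in the table can be checked mechanically: given one round-trip time per hop, find where the largest increase occurs. A small sketch with a hypothetical function name, assuming every hop answered (real traceroute output often has unanswered hops that would need filtering first).

```python
def largest_latency_jump(hop_rtts_ms):
    """Return (hop_number, jump_ms) for the biggest rise between hops.

    Hop numbers are 1-based, matching traceroute output.
    Returns None if there are fewer than two hops.
    """
    best = None
    for i in range(1, len(hop_rtts_ms)):
        jump = hop_rtts_ms[i] - hop_rtts_ms[i - 1]
        if best is None or jump > best[1]:
            best = (i + 1, jump)
    return best
```

Given per-hop times of `[1.2, 8.5, 9.1, 82.0, 84.3]` milliseconds, the function points at hop 4, where the round-trip time climbs by roughly 73 ms. A jump that persists through all later hops suggests a long-haul link; a spike on a single hop that disappears afterwards is more likely a router that is slow to answer probes.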

What you can now answer

What’s the difference between a stateful and stateless firewall? — Stateful tracks connections; stateless looks at each packet in isolation.
What does a load balancer at L4 vs L7 actually see? — L4 sees TCP/UDP only; L7 parses application protocol (HTTP) and can route on path or header.
Forward proxy vs reverse proxy? — Forward works for clients; reverse works for servers. Same protocol, opposite trust direction.
IDS vs IPS? — IDS alerts only; IPS blocks. Trade-off is false-positive risk vs ability to stop attacks.
What are ISP tiers? — Tier 1 (backbone, no upstream), Tier 2 (regional, buys some transit), Tier 3 (last-mile, buys most/all transit).
What’s an IXP? — A physical place where networks peer directly to swap traffic without paying upstream providers.

What’s next

You now have the “what kinds of boxes are out there” map. The next lesson takes that map into the cloud: cloud networking 101 for on-prem admins, covering what a VPC actually is, the cloud’s twist on subnetting, peering and transit gateways, public/private trade-offs, and the “pets vs cattle” mindset shift that changes how you think about IP addresses when servers are disposable.
