Under the Hood

Posted in software -

Imagine that you work in a large and somewhat old-fashioned office building. If you want to send a message to your buddy Joe over at XYZ Corp, this is how it goes. You write out your letter on a piece of paper and put a sticky note on it saying, “Please send to Joe Smith at XYZ Corp,” and hand it to your secretary. She (I said this was an old-fashioned place) puts the letter in an envelope and puts Joe Smith’s name on it. Then she looks up the address for XYZ Corp and writes that on the envelope, along with your return address. Then she hands it off to the guys in the mail room.

What they do is interesting. They look at the address for XYZ Corp and say, “Hmm… that’s out of town. It needs to go to Central.” So they put your envelope inside another envelope and write “Central Post Office” on it.

When it gets to the central post office, they open the outer envelope and read the address on your letter. They say, “Oh, this is going to Chicago,” or wherever. So they put your envelope inside another envelope and write “Central Post Office, Chicago” on it.

Then it gets to the Central Post Office in Chicago. They open up the envelope addressed to them and see the address for XYZ Corp. So they put it in another envelope that just has the 9-digit zip code for the XYZ Corp building on it.

It shows up at XYZ Corp, and the guys in their mail room open up that envelope and see that it’s addressed to Joe Smith. Somebody runs it upstairs to Joe’s secretary, and she opens the envelope and hands Joe your letter. When Joe sends a reply back, it works the same way.

This is how the Internet works.

It’s actually more complicated - there are more middlemen - but that’s fundamentally how it all works. It’s all these little letters (called “packets”) flying around a very, very fast postal system. This is a pretty clear match for email, but it’s also how everything from web pages to streaming video to Voice Over IP works.

When you “go to” a web site, you’re really mailing out a request for a web page. It’s like writing off to a mail-order catalog company. There’s a standard form that defines how you ask for web pages. You fill it out and send it in. You’re sending this little form that says, “I want to see http://www.bluegraybox.com/index.html". That request goes out through this metaphorical postal system to the bluegraybox.com server, and some little toiling minion there xeroxes off another copy of the index.html document and mails it back to you.

Like I said, it’s more complicated than that. How do you keep email and web pages and FTP sites all running on the same machine without tripping over each other? Imagine your office building has a bunch of different departments in it, but they all share the same mail room. Instead of a billing department and a sales department, you’ve got an email department and a web department and an FTP department. The way it works is that they each have different post office box numbers. Whenever a letter comes into your building, the mailroom guys just have to drop it in the right box. One of the rules of this postal system is that the addresses on letters have to have a box number. Furthermore, these box numbers are standardized, so that box 80 is the normal box number for the web department, box 25 is for email, etc. So when you send off your web page request, you know that it’s a web page request, and wherever it’s going, it should be going to box 80.

This is also how you keep your responses straight. Even if you’re the only guy at your company, you could be downloading a couple of mp3s in the background while you’re popping up new browser windows right and left. If all that stuff is landing in the same inbox, you’ll never sort it out. So each time you ask for a web page, you set up a new post office box just for its responses. Your downloads go to boxes 5001 and 5002, and your web pages end up in 5003, 5004, and so on.

Here’s the next wrinkle. Say you send off a request for some big, fat mp3 file that won’t all fit in one envelope. So it gets broken up into a whole bunch of separate letters. Now on top of that, like the real post office, stuff can get lost: Mail trucks get stolen; Some yutz in New Jersey cuts through a long-distance fiber-optic cable with a backhoe. Even if nothing gets lost, there’s no guarantee that everything is going to show up in the order you sent it.

So what do you do? First, you send a letter that effectively says, “Hey, I want to send a whole bunch of letters back and forth with you. Here’s the address you should send all the replies to.” This is where your web browser says, “Connecting to …” in the status bar. Once you’ve got an OK back, you say, “Send me that mp3 file.” You get a whole flood of letters back, numbered “1 of 23”, “2 of 23” and so on. You count through them and realize that you’re missing number 17, so you send another message saying to re-send it. When you finally have all the letters, you can put them in the right order, open them up, and glom the mp3 file together.

This little procedure we’re going through here is called a Protocol. It’s not part of the postal system itself, but it’s a set of rules that people have agreed on for how to use the postal system. In this case, it’s a way of communicating reliably through a system that isn’t reliable. It’s a layer of communications on top of a layer of communications. It’s very meta.

The internet is built up of layers of these protocols. Like the envelopes inside envelopes, at each stage, you’re only concerned with the outermost layer. You slip on or peel off your envelope, and everything else is just the stuff inside it, be it one layer or many. IP, the Internet Protocol, is the postal system - simple, but not entirely reliable. TCP, the Transmission Control Protocol, provides the reliable, ordered delivery on top of IP. Email (SMTP), the web (HTTP), and others are actually another layer on top of TCP. Essentially, they all define standard forms for different mail-order requests.

Again, it’s even more complicated than that. But for now, lots and lots of little letters zipping back and forth across the world. That’s the way to think of it.

Newer article
Programming Mindset
Older article
Software Pin Factory