How Websites Work · Sam Claus

Disclaimer

The content of this website and the samcla.us domain are property of Sam Claus (me). I control every aspect of every page of the website. This includes subdomains like api.samcla.us.

I could change the "Published" and "Last Edited" dates of any article to whatever I want.
If you copy the URL of one of the pages of my website and send it to someone, I could change the content of the page between the time you viewed it and the time they view it. I could attempt to make them think you are sending them some sort of neo-Nazi manifesto when really you were trying to send them a recipe for squash.

The content of my website is hosted on various servers which are managed by companies. Technically, they could change the content. It would be difficult for your internet service provider (ISP) or some other middle-man to meddle with the content, thanks to the power of HTTPS. Feasibly, someone could figure out my credentials for logging in to the servers that host my website's content; then they could change it.

All of the stories and opinions expressed on this website are my own. (Unless my website gets tampered with by someone else like I described above.) Either way, I encourage you to question everything you read, no matter where you are reading it.

In this guide, I will assume that you understand little about computers but have been using them regularly for years, perhaps even for your whole life.

You likely use a web browser frequently. Popular web browsers include Google Chrome, Mozilla Firefox, Safari, and Microsoft Edge. All of those web browsers have similar interfaces: once you know how to use one, you basically know how to use all of them. Most of their differences are “under the hood”. For example, when Chrome launched in 2008, it took the world by storm not because it looked so much different from Internet Explorer (the de facto web browser back then), but because it was so much faster at loading websites.

By the end of this guide, you should have a solid understanding of what happens after you type a website name (like youtube.com or samcla.us) into your browser to make the website appear and function.

Files

You are likely familiar with the concept of files: standalone pieces of information that you can move around, view and modify using applications (“apps”) on your computer. When I say “computer”, I don’t just mean your desktop or laptop; modern smartphones work very similarly, even if you don’t usually see the hierarchy of files and directories/folders.

Files generally have extensions at the end of their name to say what kind of information they contain. Some common examples:

.png (Portable Network Graphics) files contain 2D visual images, such as pictures you have taken with a camera. PNG files include information about transparency, so the picture can be partially see-through when shown on top of other content.
.jpeg (Joint Photographic Experts Group) files are similar to PNG files, but do not contain information about transparency. They are typically smaller than similar PNG files because they use lossy compression, which discards some fine detail.
.txt (text) files are very simple and contain text, which is typically human-readable, but could just be a bunch of random characters in any order—gibberish.
.docx files are produced/used by Microsoft Word. They contain text just like .txt files, but also all sorts of formatting information, potentially some embedded PNG images, etc.

Changing a file’s extension (by renaming it) does not change the content/validity of the file in any way. Some versions of Windows and Mac OS X will warn you when you try to do so, but only because many applications will not let you open a file unless it is “marked” with the correct extension. Changing the extension back should fix any problems.
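To see why renaming a file does not change what it is, here is a minimal Python sketch that guesses a file's type from its first few bytes instead of its name. The function name sniff_file_type and the small SIGNATURES table are made up for this illustration; the byte sequences themselves are the standard starting bytes of PNG, JPEG, and PDF files.

```python
# Well-known byte sequences that appear at the very start of some file types.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"%PDF-": "PDF document",
}


def sniff_file_type(data: bytes) -> str:
    """Guess a file type from its leading bytes, ignoring its name entirely."""
    for signature, description in SIGNATURES.items():
        if data.startswith(signature):
            return description
    return "unknown"


# Pretend these bytes came from a PNG that someone renamed to "picture.txt".
renamed_png = b"\x89PNG\r\n\x1a\n" + b"...rest of the image data..."
print(sniff_file_type(renamed_png))
print(sniff_file_type(b"Just some plain text."))
```

The extension is only a label for humans and applications; the bytes inside the file stay the same no matter what the file is called.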

Websites ARE files

You have likely created and/or viewed .pdf (Portable Document Format) files before, especially if you have printed anything on paper.

Websites are like PDF files; when you visit https://samcla.us in your web browser, the browser will load and display the “website PDF” so to speak, just like double-clicking a PDF file on your computer displays the information using your default PDF viewer application.

That is a good way to reason about websites at a high level, but they are slightly more complicated in reality. Websites are not PDF files, but rather .html (HyperText Markup Language) files. Additionally, while a website could consist of a single HTML file, most consist of many HTML files, and HTML files can also embed images and other types of files.

HTML

HTML files describe the structure of a webpage. Here is some HTML:

<aside>
    <img
        src="/lighthouse-point-washington.webp"
        width="504"
        height="378"
        alt="One of the sub-peninsulas of Lighthouse Point, a state park in Washington, on a clear Summer day.">
    <p>
        <strong>NOTE:</strong> this is only a very simple example. Some HTML files
        can be massive.
    </p>
    <button>Click me, click me!</button>
</aside>

When displayed by your browser, with the styling (more on that later) used by this website, that creates an image followed by a short note and a clickable button.

As you may have noticed, HTML files are technically plain old text (as in, .txt) files. They can be viewed and edited using any program that can view/edit .txt files. However, it does not work the other way around: not all .txt files are valid .html files. HTML files are expected to follow a very rigid structure that is essentially a tree of elements.

Generally, each element will be drawn (by your browser) on your screen as a rectangular box and consists of:

an opening tag (such as <img> or <strong>),
0 or more attributes (such as src="/lighthouse-point-washington.webp") on the opening tag to customize that particular element (be it an image, video, paragraph of text, whatever),
0 or more child elements which are typically drawn inside of the parent element,
and a closing tag (such as </strong>) which is just the opening tag with a slash at the beginning. A few elements, such as <img>, never have children and therefore have no closing tag.

In our example, we used the following types of elements:

<aside>, which indicates that we are briefly branching off from the main topic of an article, and is therefore very useful for blogs and online versions of textbooks
<img>, which tells the browser to load a visual image (PNG, JPEG, etc.) from a URL provided via the src attribute and display it on the page
<p>, which indicates a paragraph of text
<strong>, which indicates that its text is especially important and is generally displayed as bold text
<button>, which creates a clickable button
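To make the "tree of elements" idea concrete, here is a rough Python sketch that reads a shortened copy of the example HTML and prints each element indented under its parent. It uses Python's built-in html.parser module; the class name TreeOutliner, the function outline, and the small VOID_ELEMENTS set are names chosen for this illustration, and the sample text is trimmed from the example above.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag (only a few are listed).
VOID_ELEMENTS = {"img", "br", "hr", "input", "link", "meta"}


class TreeOutliner(HTMLParser):
    """Records one indented line per element, showing the nesting."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        names = ", ".join(name for name, _value in attrs)
        label = f"<{tag}>" + (f" [{names}]" if names else "")
        self.lines.append("    " * self.depth + label)
        if tag not in VOID_ELEMENTS:
            self.depth += 1  # children of this element are one level deeper

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:
            self.depth -= 1  # closing tag: step back out to the parent


SAMPLE = """<aside>
    <img src="/lighthouse-point-washington.webp" width="504" height="378" alt="Lighthouse Point">
    <p><strong>NOTE:</strong> this is only a very simple example.</p>
    <button>Click me, click me!</button>
</aside>"""


def outline(html: str) -> list:
    parser = TreeOutliner()
    parser.feed(html)
    parser.close()
    return parser.lines


print("\n".join(outline(SAMPLE)))
```

The printed outline shows <aside> at the top, with <img>, <p>, and <button> nested one level inside it and <strong> nested inside <p>.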

CSS and JavaScript

.css (Cascading Style Sheets) files customize the appearance/style of a website. CSS is exceptionally powerful and lets you do anything from changing the font of an element’s text, to adjusting the spacing between elements, to applying powerful 3D transformations and animations to elements. Without CSS, this page you are reading would just be a bunch of Times New Roman headings/paragraphs/lists straight out of the 90s.

.js (JavaScript) files give programmers full control over a webpage. Using JavaScript, you can modify the HTML and CSS as desired, access a user’s webcam (if they grant permission), save information on their computer, and so much more. JavaScript is a full-fledged programming language and I will not describe it in detail here because it would take too long.

Here is some example CSS which draws a 1-pixel-thick red outline around every element:

* {
    outline: 1px solid red;
}

Here, try those styles out on this webpage:

See? This webpage really is just a tree of elements, each of which is a rectangular box, potentially with more rectangular boxes inside of it, and so forth. Notice that even the italicized or bolded sections of text within paragraphs have their own dedicated elements to change the font style from the surrounding defaults. Boxes in boxes in boxes in boxes. Whenever you click the Toggle Styles button, it triggers some JavaScript I wrote to change the CSS styles on this page.

In order to use CSS and JavaScript, HTML files need to tell the browser about them, similarly to loading images:

<link rel="stylesheet" href="/some/url/to/my/stylesheet.css">
<script type="text/javascript" src="/another/url/to/my/code.js"></script>

It’s really all just files and code all the way down

Now, I say things like “CSS files customize the appearance of the webpage” but I should clarify: CSS files, like any other files, are just static information that can be stored in your computer, transferred over the internet, etc. Your browser is responsible for reading the code in the CSS file(s) and then drawing the HTML on your screen accordingly.

Ultimately, your web browser itself is also just files containing code for doing stuff like drawing webpages: when you click on your browser’s “icon”, your computer hardware will read over the code files for the browser and do what they say to do.

Fundamentally, web browsers are code that is designed by humans to read other code (HTML/CSS/JS) and then draw some stuff on the screen. They listen for clicks from your mouse and keystrokes from your keyboard, maybe listen for touches from your touchscreen, and then adjust accordingly, such as drawing a different webpage if you click on an HTML link element that was drawn on the screen.

The Internet

Okay, so we’ve established a basic understanding of the sorts of files a website needs to function, i.e., get displayed by your web browser for you to look at. However, these files are rarely stored on your computer, so your web browser must grab them from somewhere.

That somewhere is another computer, which may be very similar to the computer you are using. It may be running Windows or Linux (or even Mac OS X but that is uncommon), and it will generally have the same types of hardware: a processor, a hard drive to store information, etc. That said, the computer is unlikely to have a display, mouse, or keyboard. There is no need for such devices on a computer that is dedicated solely to hosting websites. It need only be connected to the internet via an Ethernet cable much like you would use in your own home. We will refer to this variety of computer as a server, because it serves us files (HTML, CSS, JS, etc.) when we ask it for them.

But what is the internet? To explain it, I’ll use an analogy. If each computer is a house that needs to send and receive mail/messages, the internet is the postal system. Instead of having to wait for people in vehicles to transport mail through a network of warehouses to get it from point A to point B, computers send signals over a massive network of wires.

Many of those wires are now fiber optic, meaning they are strands of plastic or glass that light shines down. Think about someone sending you an SOS signal from a tower several miles away: they switch the light off and on in a particular way (Morse code) to communicate that they need help from you. Light travels so quickly that you could theoretically transfer a massive 100GB file from your computer to a computer on the other side of the world in a matter of seconds. However, that’s assuming you have a direct connection (a single wire) to that computer, and that we have technology to “blink” the light at ultra high frequency.

Reality is messier. You are connected to warehouses full of computers that are responsible for forwarding along your message to other warehouses of computers, and so on and so forth, until the message finally gets to the end destination. Along the way, traffic often builds up and your message might be stored and not transmitted to the next warehouse in the chain until previous messages have been forwarded. Still, the internet is incredibly fast: sending an email takes only a couple of seconds vs. days or weeks to send a physical letter.

When you visit https://samcla.us, your browser follows some standard steps (more on that later) to “get in touch” with the server. It will then make requests to the server via the internet, e.g., “Hi, can you give me the HTML file for the website samcla.us?” If all goes well, the server will serve (send back) a response containing the HTML and your browser will begin to read through it and display it on your screen.

IP Addresses

So we now understand that websites get pulled from servers, which are just ordinary computers that have software designed to hand out HTML/CSS/JS files. But how does your computer contact those servers? Every computer that is connected to the internet, including yours, has an IP (Internet Protocol) address, which is essentially a home address so it can send and receive mail. Servers also have IP addresses.

The internet has grown quite a bit since its inception in the late 20th century, so IP addresses have gone through multiple versions. IPv4 (version 4) addresses have been used for a long time and will not go away anytime soon. If you search the internet for “what is my ip address”, you will likely be told something akin to 70.164.194.118, which is an IPv4 address. Each IPv4 address consists of exactly 4 numbers, each of which is between 0 and 255 (inclusive). If you do the math, that means there are roughly 4.3 billion possible IPv4 addresses. However, when you include mobile phones and other devices, not just desktops and laptops, there are definitely WAY more than 4.3 billion computers in the world. That means it is hard to assign a unique “home address” to each computer, a problem known as IPv4 address exhaustion.
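If you want to check the "roughly 4.3 billion" figure yourself, here is a small Python sketch of the math. It uses Python's built-in ipaddress module; the variable names are just for illustration, and the address is the example one from the paragraph above.

```python
import ipaddress

# Four numbers, each with 256 possible values (0 through 255).
total_ipv4_addresses = 256 ** 4
print(f"{total_ipv4_addresses:,} possible IPv4 addresses")

# An IPv4 address is really one big number written as four small ones.
example = ipaddress.IPv4Address("70.164.194.118")
parts = [int(part) for part in str(example).split(".")]
as_single_number = int(example)

print(parts)
print(as_single_number)
```

The single number is the four parts packed together, with each part taking up one of four "slots" of 256 values.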

IPv6 (version 6) addresses are basically just bigger than IPv4 addresses, which means more possible number combinations, and therefore more unique “home addresses”. IPv6 is more complicated than IPv4, especially because you are allowed to omit certain parts from an address to shorten it when possible, but a fully written out example is 2345:0425:2CA1:0000:0000:0567:5673:23b5. Scary! 😱
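Here is a short Python sketch, again using the built-in ipaddress module, that shows the shortening rule applied to the example address above and how many IPv6 addresses are possible. The variable names are illustrative only.

```python
import ipaddress

full_form = "2345:0425:2CA1:0000:0000:0567:5673:23b5"
address = ipaddress.IPv6Address(full_form)

# Leading zeros in each group are dropped, and one run of all-zero groups
# is replaced by a double colon.
short_form = address.compressed
print(short_form)

# The shortened form can always be expanded back to the full 8 groups.
print(address.exploded)

# IPv6 addresses are 128 bits long, versus 32 bits for IPv4.
total_ipv6_addresses = 2 ** 128
print(f"{total_ipv6_addresses:,} possible IPv6 addresses")
```

Both forms describe exactly the same address; the short one is simply easier to write down.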

Ports, UDP, and TCP

While you can consider IP addresses as the equivalent of physical “home addresses”, that’s generally not enough information. Just like there are often multiple people living in one house, who need to each send/receive mail, you typically have many applications running on your computer that need to send/receive mail independently of one another.

Let’s say you have a Windows computer, and you are currently listening to Spotify (a music app), browsing Wikipedia using Google Chrome (a web browser), and waiting for email notifications from Microsoft Outlook (an email app). Each of these apps needs to talk to, i.e., exchange mail with, one or more server computers via the internet.

This is just an example and all of the following information holds true even if you are on a Mac or an Android phone and using a completely different set of applications than the examples I gave.

Spotify will send requests (letters, to continue the mail analogy) to the Spotify server(s) to ask things like:

“What are the names of my playlists? What songs are in each?”
“I am currently trying to play Creep by Radiohead. Can you give me the noises for that song as an MP3 file?”
“Can you please delete Better Now by Post Malone from my Liked Songs playlist? Thanks.”
etc.

Google Chrome will send requests to the Wikipedia servers to ask:

“Can you give me the HTML for the article on Nelson Mandela?”
“Hey, I got the HTML for the article on Nelson Mandela. It says I should load this one CSS file. Can you give me that file?”
“Oh, I also need the portrait of Nelson Mandela so I can display it. Thanks again.”
etc.

Microsoft Outlook will send requests to the Outlook servers to ask:

“Is there any new mail I should notify the user about?”
Waits 5 seconds.
“Is there any new mail I should notify the user about?”
Waits 5 seconds.
“Is there any new mail I should notify the user about?”
Waits 5 seconds.
“Is there any new mail I should notify the user about?”
Waits 5 seconds.
“Is there any new mail I should notify the user about?”
Waits 5 seconds.
“Is there any new mail I should notify the user about?”
And so on and so forth, as long as the application is running in the background.
“Oh, by the way, the user just clicked on the email from their grandma. Can you please send me the content of that email so I can display it? Please mark it as read while you are at it.”
etc.

Imagine if all the responses from those servers arrived at your computer marked with nothing other than the “home address”. It’s the responsibility of Windows (or whatever operating system you are using) to distribute the incoming mail to the appropriate applications. What if it handed Spotify the HTML for the Nelson Mandela article and said “Here’s that MP3 of Creep by Radiohead that you asked for.” Presumably, Spotify would try to interpret the HTML file as an MP3 containing volume/pitch/etc. information and you would hear whatever awful noises resulted coming out of your speakers.

Therefore, someone came up with the concept of ports, which work very similarly across all the common operating systems: Linux, Windows, Mac OS X, iOS, Android, etc. When an application such as Spotify asks Windows to send a request to a server, Windows essentially assigns a port, which is just a number, to that request. It then sends the request, including the port number, to the Spotify server. When the Spotify server sends a response back, it includes both the IP (home) address of the computer and the port number. The IP address is used by internet infrastructure to relay the response all the way back to your computer from California or wherever the Spotify server is. Windows then inspects the port number in the response, realizes it belongs to Spotify, and therefore gives the response to Spotify so it can do whatever it is designed to do with the response.
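The port mechanism described above can be sketched in Python. In this rough demonstration, two pretend "apps" and one pretend "server" all run on the same computer (address 127.0.0.1) and talk over UDP. The names open_app_socket and run_demo and the message text are invented for this example; real apps like Spotify use far more elaborate protocols. The point is only that the operating system hands each app its own port number and delivers each reply to the right app.

```python
import socket


def open_app_socket() -> socket.socket:
    """Open a UDP socket on this machine and let the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # port 0 means "assign me any free port"
    sock.settimeout(2.0)  # give up instead of waiting forever
    return sock


def run_demo() -> dict:
    music_app = open_app_socket()
    browser_app = open_app_socket()
    server = open_app_socket()
    try:
        server_address = server.getsockname()
        music_app.sendto(b"MP3 for Creep, please", server_address)
        browser_app.sendto(b"HTML for Nelson Mandela, please", server_address)

        # The server learns each sender's IP address AND port from the
        # incoming packet, and replies to exactly that pair.
        for _ in range(2):
            request, sender = server.recvfrom(1024)
            server.sendto(b"Response to: " + request, sender)

        music_reply, _ = music_app.recvfrom(1024)
        browser_reply, _ = browser_app.recvfrom(1024)
        return {
            "music_port": music_app.getsockname()[1],
            "browser_port": browser_app.getsockname()[1],
            "music_reply": music_reply,
            "browser_reply": browser_reply,
        }
    finally:
        for sock in (music_app, browser_app, server):
            sock.close()


result = run_demo()
print(result)
```

Both apps share one IP address, yet each receives only its own reply, because the two port numbers differ.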

It is important to remember that IP addresses do not contain port numbers. Port numbers are basically an add-on. When your computer sends packets of data (information, requests, messages, letters, whatever you want to call it) to other computers, the IPv4 (or IPv6) address is the primary piece of information. Most computers do not speak only the internet protocol (IP), though. They also support UDP and TCP, which are protocols built on top of the internet protocol that include port numbers on every packet, in addition to the IP address. You need not concern yourself with UDP/TCP right now beyond the fact that they are supported by all modern operating systems; describing them in more detail would warrant a separate guide.

Domain Names

If you had to memorize an IPv4 address for every website you like to visit, the web would be difficult to use. Therefore, browsers use the Domain Name System (DNS) to look up the IP addresses for domain names like youtube.com or samcla.us.

Essentially, your computer keeps a short list of IP addresses for DNS servers. It is usually given that list by the network it connects to (your router or ISP), though you can also configure it by hand. When you visit a URL like https://samcla.us/guides/how-websites-work, your browser extracts the domain name and contacts DNS servers until one of them can give it the proper IP address for the domain name. These servers are core web infrastructure and if they were to be taken down, websites would stop working for most people because their computers would not be able to figure out the proper IP address to contact when they try to visit a website.
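The two steps described above (pull the domain name out of the URL, then ask DNS for its IP addresses) can be sketched in Python using only built-in modules. The function names extract_hostname and lookup_addresses are made up for this example, and the lookup simply asks your operating system to do the DNS query on your behalf, which is a simplification of what a real browser does.

```python
import socket
from urllib.parse import urlsplit


def extract_hostname(url: str) -> str:
    """Pull the domain name out of a full URL."""
    hostname = urlsplit(url).hostname
    if hostname is None:
        raise ValueError(f"no domain name found in {url!r}")
    return hostname


def lookup_addresses(hostname: str) -> list:
    """Ask the operating system to resolve a domain name via DNS.

    Needs a working network connection for real domain names.
    """
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result ends with a socket address whose first item is the IP.
    return sorted({entry[4][0] for entry in results})


domain = extract_hostname("https://samcla.us/guides/how-websites-work")
print(domain)
```

With an internet connection, calling lookup_addresses(domain) should return one or more IP addresses. The exact addresses depend on how the domain is configured at the time you ask, so none are shown here.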

HTTP

Generally, when two computers communicate over TCP or UDP, it is a 2-way street: either computer can send the other a message at any given time. Both computers are equal “peers”. Websites, in contrast, are built on the concept of requests and responses. Your browser, the client, requests something, such as an HTML or image or CSS file, and the server responds with the content of the file or an error message or some other arbitrary response.

This is where HTTP (Hypertext Transfer Protocol) comes in. HTTP is built “on top of” TCP, much in the same way that .html files are still valid .txt files. When two computers communicate using HTTP, they are still sending information using TCP, i.e., including an IP address and port number on every message. HTTP is just stricter and specifies that a client (such as a web browser or a native application) must always “message first” (make a request) and then the server should send back a response. Each request/response is also expected to have a particular structure.

Here is an example HTTP request, with a lot of nitty-gritty details removed:

GET /guides/how-websites-work HTTP/1.1
Host: samcla.us
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br, zstd
Cache-Control: no-cache
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36
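To show the "particular structure" of a request, here is a minimal Python sketch that builds a small HTTP/1.1 request as text and then takes it apart again, roughly the way a server would. The function names build_request and parse_request are invented for this example, and the headers are a trimmed-down selection from the request above. Nothing is sent over the network.

```python
def build_request(host: str, path: str, headers: dict) -> str:
    """Assemble the text of a simple GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines end with "\r\n", and an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_request(raw: str):
    """Split request text into its method, path, version, and headers."""
    head, _, _body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


raw_request = build_request(
    "samcla.us",
    "/guides/how-websites-work",
    {"Accept": "text/html", "Connection": "keep-alive"},
)
method, path, version, headers = parse_request(raw_request)
print(method, path, version)
print(headers)
```

The first line says what is being asked for (GET this path), and every line after it is a name/value pair giving the server extra details about the request.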

“S” is for “Secure”

The Web

Putting it all together

Going Deeper

If your browser, while reading through the HTML, finds some links to CSS and JavaScript, images, etc., it will make additional requests to the server to ask for those files so it can use them when putting together the webpage. However, browsers try to display things as quickly as possible. That is why you will often see a webpage appear, and then large images will finish loading after everything else has already been displayed.
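As a rough illustration of that first step, here is a Python sketch that reads some HTML and lists the extra files a browser would need to request. It uses the built-in html.parser and urllib.parse modules; the names ResourceCollector and find_subresources and the sample page are made up for this example, and real browsers look at many more tags and attributes than these three.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects the URLs of stylesheets, scripts, and images in a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "stylesheet":
            target = attributes.get("href")
        elif tag in ("script", "img"):
            target = attributes.get("src")
        else:
            target = None
        if target:
            # Turn "/some/file.css" into a full URL on the same website.
            self.urls.append(urljoin(self.base_url, target))


PAGE = """<link rel="stylesheet" href="/some/url/to/my/stylesheet.css">
<script type="text/javascript" src="/another/url/to/my/code.js"></script>
<img src="/lighthouse-point-washington.webp" alt="Lighthouse Point">
<p>Plain text needs no extra request.</p>"""


def find_subresources(html: str, base_url: str) -> list:
    collector = ResourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.urls


urls = find_subresources(PAGE, "https://samcla.us/guides/how-websites-work")
for url in urls:
    print(url)
```

Each URL in the list would become its own request to the server, in addition to the original request for the HTML.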

Priorities

The order in which the browser loads things is a bit of a balancing act. Generally, people want a webpage to show something as quickly as possible, without waiting for every little detail, especially things you can't see until you scroll, to load. That said, if your browser displays the HTML before loading the CSS, you may briefly see everything as an ugly wall of Times New Roman (or whatever your computer's default font is) text, which will then get suddenly replaced with the properly styled version. That situation is referred to by web developers as a Flash Of Unstyled Content (FOUC) and is considered bad for the user experience.

So you see? Loading images "lazily" often makes sense, but some things are considered critical to a webpage's desired appearance, and therefore should be waited on before displaying anything. "Good" software, in my opinion, is made when the people developing it have a set of priorities in mind and tailor it to meet those priorities as best as possible.