The HTTP protocol
I mention HTTP in particular because things work differently over an HTTPS connection.
I analyze URL requests only
Modern browsers can tell whether what you typed in the address bar is an actual URL or a search term, and they will use the default search engine if it’s not a valid URL.
I assume you type an actual URL.
When you enter the URL and press enter, the browser first builds the full URL.
If you just entered a domain, like codeverb.com, the browser by default will prepend http:// to it, defaulting to the HTTP protocol.
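As a rough illustration of that first step, here is a minimal Python sketch. The function name build_full_url is made up for this example, and real browsers apply many more rules (search detection, HTTPS upgrades, and so on), so treat it only as an approximation of the "prepend http://" behavior described above.

```python
from urllib.parse import urlsplit


def build_full_url(typed):
    """Return the typed address as a full URL, defaulting to http://.

    Hypothetical helper for illustration; not how any real browser is coded.
    """
    typed = typed.strip()
    # No scheme typed: fall back to plain HTTP, as described in the text.
    if "://" not in typed:
        typed = "http://" + typed
    parts = urlsplit(typed)
    # An empty path means the root resource of the site.
    path = parts.path or "/"
    query = "?" + parts.query if parts.query else ""
    return f"{parts.scheme}://{parts.netloc}{path}{query}"


print(build_full_url("codeverb.com"))  # http://codeverb.com/
```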
DNS Lookup phase
The domain name is a handy shortcut for us humans, but the internet is organized in such a way that computers can look up the exact location of a server through its IP address, which is a set of numbers like 22.214.171.124 (IPv4).
First, the browser checks the local DNS cache, to see if the domain has already been resolved recently.
Chrome has a handy DNS cache visualizer.
If nothing is found there, the browser uses the DNS resolver, calling the gethostbyname POSIX function to retrieve the host information.
gethostbyname first looks in the local hosts file, which on macOS or Linux is located in /etc/hosts, to see if the system provides the information locally.
If this does not give any information about the domain, the system makes a request to the DNS server.
The address of the DNS server is stored in the system preferences.
These are 2 popular DNS servers:
- 8.8.8.8 : the Google public DNS server
- 1.1.1.1 : the Cloudflare DNS server
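The hosts file check that comes before the DNS request can be sketched in a few lines of Python. The helper name lookup_in_hosts and the sample entries are invented for this example (the addresses come from ranges reserved for documentation), and the parser works on a string rather than on the real /etc/hosts so it behaves the same on any machine.

```python
def lookup_in_hosts(hosts_text, hostname):
    """Return the address mapped to hostname in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Each entry is: address, then one or more names.
        address, *names = line.split()
        if hostname in names:
            return address
    return None


# Hypothetical hosts file content, for illustration only.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
# a made-up local override
192.0.2.10  codeverb.test www.codeverb.test
"""

print(lookup_in_hosts(SAMPLE_HOSTS, "www.codeverb.test"))  # 192.0.2.10
print(lookup_in_hosts(SAMPLE_HOSTS, "codeverb.com"))       # None
```

When the lookup returns None, the system would move on to asking the configured DNS server.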
The browser performs the DNS request using the UDP protocol.
TCP and UDP are two of the foundational protocols of computer networking.
They sit at the same conceptual level, but TCP is connection-oriented, while UDP is a connectionless protocol, more lightweight, used to send messages with little overhead.
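To make "little overhead" concrete, here is a sketch of what a DNS question looks like as a single UDP datagram. The packet layout follows the standard DNS message format (a 12-byte header followed by the question); the function names build_dns_query and send_query, the query id, and the timeout are choices made for this example. send_query is defined but not called, because it needs network access and a resolver address of your choosing.

```python
import socket
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build a DNS query for the A record of hostname."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.strip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


def send_query(packet, server, timeout=2.0):
    """Send the packet to a DNS server over UDP port 53 and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))  # no handshake: just send the datagram
        reply, _ = sock.recvfrom(512)
    return reply


packet = build_dns_query("codeverb.com")
print(len(packet))  # 30
```

The whole question for codeverb.com fits in 30 bytes, and no connection setup is needed before sending it.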
The DNS server might have the domain IP in its cache. If not, it will ask the root DNS server. That’s a system (composed of 13 named root servers, replicated across the planet) that drives the entire internet.
The root DNS server doesn’t know the address of each and every domain name on the planet.
What it knows is where the top-level DNS resolvers are.
A top-level domain is the domain extension: .com, .it, .pizza and so on. Once the root DNS server receives the request, it directs the request to that top-level domain (TLD) DNS server.
Say you are looking for codeverb.com. The root DNS server returns the IP of the .com TLD server.
Now our DNS resolver will cache the IP of that TLD server, so it doesn’t need to ask the root DNS server again for it.
The TLD DNS server will have the IP addresses of the authoritative name servers for the domain we are looking for.
Those are the DNS servers of the hosting provider. There is usually more than 1, to serve as backup. For example:
The DNS resolver starts with the first, and asks it for the IP of the domain (with the subdomain, too) you are looking for.
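The chain root server, TLD server, authoritative name server can be simulated with plain dictionaries. Everything in this sketch is made up for illustration: the three tables, the resolve function, and all IP addresses (taken from ranges reserved for documentation). No real DNS traffic is involved; the point is only the order of the questions and the caching of the TLD server address.

```python
# Hypothetical data standing in for the three levels of DNS servers.
ROOT = {"com": "192.0.2.1"}  # TLD -> IP of the TLD server
TLD_SERVERS = {
    "192.0.2.1": {"codeverb.com": ["198.51.100.1", "198.51.100.2"]},
}
RECORDS = {"codeverb.com": "203.0.113.7", "www.codeverb.com": "203.0.113.7"}
AUTHORITATIVE = {"198.51.100.1": RECORDS, "198.51.100.2": RECORDS}


def resolve(hostname, cache):
    """Walk root -> TLD -> authoritative name servers for hostname."""
    labels = hostname.split(".")
    tld = labels[-1]
    domain = ".".join(labels[-2:])

    # 1. Ask the root server for the TLD server, unless it is already cached.
    tld_ip = cache.get(tld)
    if tld_ip is None:
        tld_ip = ROOT[tld]
        cache[tld] = tld_ip

    # 2. Ask the TLD server for the authoritative name servers of the domain.
    name_servers = TLD_SERVERS[tld_ip][domain]

    # 3. Try the name servers in order, starting with the first.
    for ns_ip in name_servers:
        records = AUTHORITATIVE.get(ns_ip, {})
        if hostname in records:
            return records[hostname]
    raise LookupError(f"no record found for {hostname}")


cache = {}
print(resolve("www.codeverb.com", cache))  # 203.0.113.7
print(cache)                               # {'com': '192.0.2.1'}
```

A second lookup for another .com name with the same cache would skip the root server entirely.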
TCP request handshaking
With the server IP address available, the browser can now initiate a TCP connection to it. A TCP connection requires a bit of handshaking before it can be fully initialized and you can start sending data.
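The handshake itself is handled by the operating system, so application code only sees a connect call that returns once it is done. The sketch below shows this on the loopback interface, with a throwaway listening socket playing the server. The function name loopback_handshake and the "hello" payload are invented for this example.

```python
import socket


def loopback_handshake():
    """Open a TCP connection to a local listener and send 5 bytes over it."""
    # A listening socket on 127.0.0.1; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()

    # create_connection returns only after the TCP handshake
    # (SYN, SYN-ACK, ACK) has completed.
    client = socket.create_connection((host, port), timeout=2)
    conn, _ = server.accept()
    conn.settimeout(2)

    # Only now can data flow over the connection.
    client.sendall(b"hello")
    received = conn.recv(5)

    for sock in (client, conn, server):
        sock.close()
    return received


print(loopback_handshake())  # b'hello'
```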
Sending the request
The request is a plain text document structured in a precise way determined by the
communication protocol. It’s composed of 3 parts:
- the request line
- the request header
- the request body
The request line
The request line sets, on a single line:
- the HTTP method
- the resource location
- the protocol version
GET / HTTP/1.1
The request header
The request header is a set of field: value pairs that set certain values.
In HTTP/1.1 only one field is mandatory, Host. Connection is often sent alongside it, while all the other fields are optional:
Host: codeverb.com
Connection: close
Host indicates the domain name which we want to target, while Connection is set to close when the connection should not be kept open after the response.
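Putting the request line and these two header fields together, the text that goes over the wire can be assembled like this. The helper name build_request is made up for this sketch, and it only covers a request without a body. Note that lines are separated by CRLF and an empty line marks the end of the headers.

```python
def build_request(host, path="/", method="GET"):
    """Return the raw bytes of a minimal HTTP/1.1 request without a body."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, resource, version
        f"Host: {host}",              # the domain we want to target
        "Connection: close",          # ask the server to close after responding
        "",                           # empty line: end of the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_request("codeverb.com").decode("ascii"))
```

These bytes are what a client would write to the TCP connection once it is established.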
Some of the most used header fields are:
Once the request is sent, the server processes it and sends back a response.
The response starts with a status line made of the protocol version, the status code and the status message. If the request is successful and returns a 200, the status line reads:
HTTP/1.1 200 OK
The response might carry a different status code and message, like one of these:
404 Not Found
301 Moved Permanently
500 Internal Server Error
304 Not Modified
The response then contains a list of HTTP headers and the response body (which, since we’re making the request in the browser, is going to be HTML).
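To see how those three pieces sit in the raw response, here is a small parsing sketch. The function name parse_response and the sample response are invented for illustration. It assumes the whole response is already in memory and ignores details a real client must handle, such as chunked bodies or repeated header names.

```python
def parse_response(raw):
    """Split a raw HTTP response into (status code, message, headers, body)."""
    # An empty line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: protocol version, status code, status message.
    _version, code, message = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), message, headers, body


# A made-up response, similar to what a server might send back.
SAMPLE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)

code, message, headers, body = parse_response(SAMPLE)
print(code, message)            # 200 OK
print(headers["content-type"])  # text/html
print(body)                     # b'<h1>Hello</h1>'
```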