How servers work (in Node.js)
Let's explore how web servers work. How does a server actually get requests and send responses? And what do HTTP and TCP have to do with it?
A lot of layers work together to make web servers possible (e.g. frameworks, HTTP, TCP, and Ethernet). Each layer uses different core concepts and abstractions: at the framework level we work with applications, whereas at the TCP level we work with sockets and connections. It's fascinating to peek at how they work, even if we mostly work in one layer.
We'll start with a simple server written with a Node.js web framework (Express.js). Then we'll look at how Express.js builds an HTTP server for you. Finally we'll see how it's all a TCP server powering your HTTP layer. As we dive deeper we'll touch on networking basics such as HTTP and TCP (with example code).
Node.js lets us write web applications quickly, especially with frameworks like Express.js. Here’s an Express.js “Hello world!” server:
const app = require('express')();
app.get('/', (req, res) => res.send('Hello world!'));
app.listen(8080);
How does it work?
Node.js lets us answer that by providing lower-level access. Its networking capabilities are similar to what you get in C++, because Node.js has wrappers over the operating system’s networking API (TCP sockets).
Do not copy/paste this code into your production applications unless you know what you’re doing. This code is not meant to be secure or optimized.
Let’s begin with a web framework
Here’s our Express.js “Hello world!” server again:
const app = require('express')();
app.get('/', (req, res) => res.send('Hello world!'));
app.listen(8080);
Not much to think about here. Even if you have zero experience with servers and networking you can just run this code and visit http://localhost:8080 in your browser.
We can still see glimpses of the lower layers though. Just look for all the code that has nothing to do with our application logic:
- What does the get method on app mean?
- Why do we call listen on app to start the server?
- What does the port number passed to app.listen mean, other than the fact that it’s also in the URL (http://localhost:8080)?
Now let’s take away the web framework
Express.js uses the Node.js http module under the hood. Here’s our server built with http directly:
const http = require('http');
const server = http.createServer();
server.on('request', (req, res) => {
  if (req.method === 'GET' && req.url === '/') {
    res.end('Hello world!');
  }
});
server.listen(8080);
The first major difference is the listener for the request event on server. The listen method to start the express server makes sense now: it’s all just an EventEmitter! We must begin to think in terms of asynchronous I/O events.
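To make the EventEmitter idea concrete, here is a minimal sketch using the Node.js events module. ToyServer and the seen array are hypothetical names used only for illustration; no real server is started.

```javascript
const { EventEmitter } = require('events');

// A stand-in for a server (hypothetical): anything that extends
// EventEmitter can announce events by name.
class ToyServer extends EventEmitter {}

const toy = new ToyServer();
const seen = [];

// Register a listener, just like server.on('request', ...).
toy.on('request', (req) => {
  seen.push(`${req.method} ${req.url}`);
});

// Emitting the event calls every registered listener, in order.
toy.emit('request', { method: 'GET', url: '/' });
toy.emit('request', { method: 'GET', url: '/about' });

console.log(seen); // [ 'GET /', 'GET /about' ]
```

The http server follows the same pattern, except that Node.js emits the request event for us whenever a client sends a request.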
The unexplained app.get method from the express server is now just a check on req.method for the word GET, a Hypertext Transfer Protocol (HTTP) method. We are explicitly in HTTP-land now.
There is only one request handler, whereas we could have added multiple in the express server for different HTTP methods and URLs (with app.get, app.post, etc.). That’s because there is only one request event. http doesn’t know how your server works: it just provides a request and lets you respond. We must catch the request event and route it ourselves.
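One way to route the single request event ourselves is a lookup table keyed by method and URL. This is only a sketch of the idea, not how Express.js does it internally; routes and handleRequest are hypothetical names, and the /about route is made up for the example.

```javascript
const http = require('http');

// Hypothetical route table: keys are "METHOD url", values are handlers.
const routes = {
  'GET /': (req, res) => res.end('Hello world!'),
  'GET /about': (req, res) => res.end('About this server'),
};

// Look up a handler for the request, or answer 404 if none matches.
function handleRequest(req, res) {
  const handler = routes[`${req.method} ${req.url}`];
  if (handler) {
    handler(req, res);
  } else {
    res.statusCode = 404;
    res.end('Not found');
  }
}

// Passing a function to createServer registers it as the 'request' listener.
const server = http.createServer(handleRequest);
```

Calling server.listen(8080) would start this server just like the earlier examples.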
Compare the res.end method with the res.send method from the Express.js example. When you call res.send you may think that sending a response is a single action, whereas res is actually a stream that you can write to several times before you end it. Once again we must think in terms of asynchronous I/O events.
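As a small illustration of res being a stream, the sketch below writes the body in several pieces before ending the response. sendGreeting is a hypothetical helper name; res.write and res.end are the standard methods on the http response object.

```javascript
// Works with any writable response, such as the `res` passed to an
// http 'request' listener. `sendGreeting` is a hypothetical helper.
function sendGreeting(res) {
  res.write('Hello '); // first chunk of the body
  res.write('world');  // second chunk
  res.end('!');        // optional final chunk, then finish the response
}
```

It could be used as the request listener body, for example http.createServer((req, res) => sendGreeting(res)), and the browser would still see a single “Hello world!” response.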
Already you’re getting a sense of what Express.js does under the hood for you.
Why don’t we implement HTTP ourselves next?
The next layer down uses the Node.js net module, which http uses to send messages. Here’s our server built with net:
const net = require('net');
const server = net.createServer();
server.on('connection', socket => {
  socket.on('data', data => parseIncoming(socket, data));
  socket.on('close', () => socket.destroy());
});
function parseIncoming(socket, data) {
  const CRLF = '\r\n';
  // incoming data is binary
  const message = data.toString();
  const [requestLine] = message.split(CRLF);
  const [method, url] = requestLine.split(' ');
  const req = {
    method,
    url
  };
  const res = {
    end(body) {
      const message =
        'HTTP/1.1 200 OK' + CRLF +
        `Date: ${new Date().toUTCString()}` + CRLF +
        CRLF +
        body;
      socket.end(message);
    }
  };
  server.emit('request', req, res);
}
server.on('request', (req, res) => {
  if (req.method === 'GET' && req.url === '/') {
    res.end('Hello world!');
  }
});
server.listen(8080);
That’s a lot more code! 😱
Welcome to the transport layer, which handles connections and messages. The message contents are just an implementation detail—you could send anything: text, images, audio. It’s all converted to binary before being sent. First let’s see how to construct valid messages:
What is HTTP, really?
Hypertext Transfer Protocol (HTTP) is mostly just a way to write messages. It’s the syntax that browsers (as a client) and most web servers use when speaking to each other.
For example, when we visit http://localhost:8080 the browser sends this HTTP request message to the server:
GET / HTTP/1.1
Host: localhost:8080
This is just the basic message. Actual requests may have extra detail.
The first line (GET / HTTP/1.1) is called the request line. It specifies a method (GET), a request target (our URL: /) and the HTTP version (HTTP/1.1). The second line is called a header, and in this case it is the Host header. Don’t worry yet about what it means, just know that it is required by HTTP/1.1.
New lines in an HTTP message must be carriage-return line-feeds (CRLF, or \r\n). Don’t use plain new-lines (\n)!
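The request message above can be assembled by hand to see exactly where the CRLFs go. This is a minimal sketch; buildRequest is a hypothetical helper, and the final empty line is what tells the receiver that the headers are finished.

```javascript
const CRLF = '\r\n';

// Build a minimal HTTP/1.1 request message (hypothetical helper).
function buildRequest(method, target, host) {
  return (
    `${method} ${target} HTTP/1.1` + CRLF + // request line
    `Host: ${host}` + CRLF +                // Host header
    CRLF                                    // blank line ends the headers
  );
}

const message = buildRequest('GET', '/', 'localhost:8080');
// JSON.stringify makes the otherwise invisible CRLFs visible.
console.log(JSON.stringify(message));
// "GET / HTTP/1.1\r\nHost: localhost:8080\r\n\r\n"
```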
When our server responds to the browser, it sends back an HTTP response message:
HTTP/1.1 200 OK
Date: Tue, 15 Nov 1994 08:12:31 GMT

Hello world!
The first line (HTTP/1.1 200 OK) is called the status line. It specifies the HTTP version (HTTP/1.1), the status code (200), and the status description (OK). The second line is a Date header, which is required in almost all cases by HTTP/1.1.
At the bottom of the message you see our Hello world! response. This is called the message body and must be separated from the headers by a blank line. The body itself can contain plain new-lines (\n).
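The same layout (status line, headers, blank line, body) can be sketched as a small function. buildResponse is a hypothetical helper and only covers what the message above shows; a real server sets more headers.

```javascript
const CRLF = '\r\n';

// Build an HTTP/1.1 response message from its parts (hypothetical helper).
// `headers` is a plain object mapping header names to values.
function buildResponse(statusCode, reason, headers, body) {
  const statusLine = `HTTP/1.1 ${statusCode} ${reason}`;
  const headerLines = Object.entries(headers).map(
    ([name, value]) => `${name}: ${value}`
  );
  // Status line, header lines, a blank line, then the body.
  return [statusLine, ...headerLines, '', body].join(CRLF);
}

const response = buildResponse(
  200,
  'OK',
  { Date: 'Tue, 15 Nov 1994 08:12:31 GMT' },
  'Hello world!'
);
console.log(JSON.stringify(response));
// "HTTP/1.1 200 OK\r\nDate: Tue, 15 Nov 1994 08:12:31 GMT\r\n\r\nHello world!"
```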
Let’s focus next on how to actually send these messages:
Enter TCP
The transport layer needs a protocol too, and the default is the Transmission Control Protocol (TCP). net lets us create a TCP server which passes messages to our application (and also lets it send messages). Our application then parses the message as an HTTP request and also sends HTTP responses. That’s why TCP is called a transport-layer protocol and HTTP is called an application-layer protocol.
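To see that TCP itself only carries bytes, here is a sketch with no HTTP in it at all. roundTrip is a hypothetical helper: it starts a throwaway TCP server on a port chosen by the operating system, sends some text, and resolves with the upper-cased reply. It assumes the message is small enough to arrive in a single data event.

```javascript
const net = require('net');

// Send `text` to a throwaway TCP server and resolve with its reply.
// `roundTrip` is a hypothetical helper; port 0 asks the OS for any free port.
function roundTrip(text) {
  return new Promise((resolve, reject) => {
    const server = net.createServer((socket) => {
      // The server sees raw bytes (a Buffer); it upper-cases them,
      // replies, and closes its side of the connection.
      socket.on('data', (data) => socket.end(data.toString().toUpperCase()));
    });
    server.on('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      const client = net.connect(port, '127.0.0.1', () => client.write(text));
      client.setEncoding('utf8');
      let reply = '';
      client.on('data', (chunk) => { reply += chunk; });
      client.on('error', reject);
      // 'end' fires once the server has finished sending.
      client.on('end', () => {
        server.close();
        resolve(reply);
      });
    });
  });
}
```

For example, roundTrip('hello').then(console.log) should print the text in upper case. Whether the bytes are an HTTP message or anything else is entirely up to the two applications on either end.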
Back to the code
Now we see the abstraction breaking further. A web server is actually two things: a way to construct messages (HTTP), and a way to send them (TCP). Hey, that’s a stack!
Here’s the code again:
const net = require('net');
const server = net.createServer();
server.on('connection', socket => {
  socket.on('data', data => parseIncoming(socket, data));
  socket.on('close', () => socket.destroy());
});
function parseIncoming(socket, data) {
  const CRLF = '\r\n';
  // incoming data is binary
  const message = data.toString();
  const [requestLine] = message.split(CRLF);
  const [method, url] = requestLine.split(' ');
  const req = {
    method,
    url
  };
  const res = {
    end(body) {
      const message =
        'HTTP/1.1 200 OK' + CRLF +
        `Date: ${new Date().toUTCString()}` + CRLF +
        CRLF +
        body;
      socket.end(message);
    }
  };
  server.emit('request', req, res);
}
server.on('request', (req, res) => {
  if (req.method === 'GET' && req.url === '/') {
    res.end('Hello world!');
  }
});
server.listen(8080);
Compare this to the http server code. The first noticeable difference is the connection event on server. We represent the connection by socket, and once connected the client can send data through socket. And when the connection closes we manually destroy the socket.
Unlike the request event on the http server we don’t have the HTTP method and URL readily available on the socket’s data event. In fact the message is not even a string: it’s binary. Before we can handle the request we need to parse the incoming message. That’s where parseIncoming comes in.
parseIncoming extracts request data and creates familiar req and res objects for our request handler. res.end attaches our status line and response headers to the body before calling socket.end, which is what actually sends the message. Finally it emits a request event on the server, which lets us return to our familiar request handler.
Our parser doesn’t actually do much—it only parses the request line. A real parser would process the header lines and possibly upgrade to an encrypted connection before even calling our application’s request handler.
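As a rough idea of what processing the header lines could look like, here is a simplified sketch. parseRequest is a hypothetical function, not the parser Node.js uses; it assumes the whole message is already available as one string and does no validation.

```javascript
const CRLF = '\r\n';

// Parse a raw HTTP/1.1 request string into its parts.
// A simplified sketch: it assumes the whole message is already in `raw`
// and does no validation.
function parseRequest(raw) {
  // The first blank line separates the head from the body.
  const headEnd = raw.indexOf(CRLF + CRLF);
  const head = headEnd === -1 ? raw : raw.slice(0, headEnd);
  const body = headEnd === -1 ? '' : raw.slice(headEnd + 4);

  const [requestLine, ...headerLines] = head.split(CRLF);
  const [method, url, version] = requestLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip malformed lines
    // Header names are case-insensitive, so normalize to lower case.
    const name = line.slice(0, colon).toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { method, url, version, headers, body };
}

const parsed = parseRequest('GET / HTTP/1.1\r\nHost: localhost:8080\r\n\r\n');
console.log(parsed.headers.host); // localhost:8080
```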
N.B. In this sample code res is not actually a stream, unlike the http server.
Turns out the http module does a lot for us: creates a TCP server, parses incoming messages, constructs and sends responses, and handles connections. No wonder all the popular Node.js web frameworks use it!
What about implementing TCP itself?
Operating systems implement TCP themselves. Implementing TCP is hard and the existing implementations have been hardened through years of experience. It is possible to implement TCP yourself but not with Node.js core modules. We can’t simply require the lower level module in Node.js.
If you’re interested in diving further, researching the Internet Protocol Suite is a good start.
A productive environment is made of this
This learning exercise was a good primer in HTTP and TCP. You likely won’t touch these again unless you want to build an HTTP parser or a web framework. For reference, the Node.js HTTP parser is written in C and uses only 40 bytes of data per message stream.
All these abstractions make Node.js productive for web development. There are a lot of batteries included—they are just hidden from you until you want to use them.
“the most productive programming environments are the ones that let you work at different levels of abstraction”—Joel on Software
The fact that Node.js provides a built-in web server also helps, because you don’t need to decide which server to tie your application to. Python, for example, uses the Web Server Gateway Interface (WSGI) and you must tie your code to a WSGI server. Its older cousin, the Common Gateway Interface (CGI), had a similar approach.
Explore further
- Sockets: you may have wondered if the data event can be called multiple times on the TCP server. It can, and this is the basis of socket-based applications and push messaging. This is how frameworks like socket.io work. Here is the Wikipedia page on Network sockets.
- The Node.js website has a detailed guide to using the http module.
- Node.js also has modules for DNS, HTTPS, HTTP/2, and UDP.
- Here is the Linux TCP implementation.
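Because the data event can fire several times, a single HTTP request may arrive in pieces. One way to cope, sketched below, is to buffer chunks until the blank line that ends the headers shows up. createHeadCollector is a hypothetical helper name, and the sketch ignores any body that follows the headers.

```javascript
// Collect 'data' chunks until the end of the headers (a blank line) arrives.
// `createHeadCollector` is a hypothetical helper name.
function createHeadCollector(onHead) {
  const HEAD_END = '\r\n\r\n';
  let buffered = '';
  let done = false;
  return function onData(chunk) {
    if (done) return; // the head was already delivered
    buffered += chunk.toString();
    const end = buffered.indexOf(HEAD_END);
    if (end !== -1) {
      done = true;
      // Hand over the request line and header lines as one string.
      onHead(buffered.slice(0, end));
    }
  };
}
```

With the net server from earlier, this could be attached as socket.on('data', createHeadCollector(head => { /* parse head here */ })).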