Learn Node.js: A Beginner's Guide

JavaScript is undoubtedly one of the most popular programming languages out there today, and for good reason. It can easily be run in your browser, on a server, on your desktop, or even on your phone as an app. One of the most popular and easiest ways to write JavaScript is using Node.js.

There are quite a few resources out there to learn Node.js, but not many of them really give you the background, tools, and resources you need to actually succeed at writing Node code.

So what I'm aiming to do here is provide you with the guide I wish I'd had when I was just starting out. I'll start with a short description of what Node actually is and what it's doing behind the curtain, then I'll give you some concrete examples that you can try yourself right in the browser, and finally I'll give you a bunch of resources to guide you through some more useful and practical examples/concepts.

Note that this guide will not teach you how to code, but instead it will guide you through the basics of the Node run-time and npm.

What is Node

Node is a server-side, cross-platform runtime environment that runs on the V8 JavaScript engine, the same engine that powers Google's Chrome browser. V8 is really the heart of Node and is the component that actually parses and executes the code.

The V8 engine does this by compiling the JavaScript to native machine code, which makes it much faster than an interpreter. To speed things up even more, the compiled code is optimized (and re-optimized) dynamically at runtime based on heuristics of the code's execution profile. This means that as the program runs, the engine tracks how the code actually behaves (which functions run most often, for example) and recompiles the busiest parts to make them faster.

As a runtime, Node's big focus is to use an event-driven, non-blocking IO model to make it lightweight and fast. For some, this programming model can be a bit confusing at first, but it actually does a great job of simplifying development for heavy-IO applications, like websites.
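To make the non-blocking model a bit more concrete, here is a minimal sketch using the built-in fs module. The directory being listed ('.') and the messages are just placeholders for illustration; what matters is the order in which things happen.

```javascript
var fs = require('fs');

var order = [];

order.push('before readdir');

// fs.readdir() hands the work off and returns right away; the callback
// runs later, once the directory listing is ready.
fs.readdir('.', function(err, files) {
    order.push('callback');
    if (err) {
        console.log('Could not list directory: ' + err.message);
    } else {
        console.log('Found ' + files.length + ' entries');
    }
    console.log('Order of events: ' + order.join(' -> '));
});

// This line runs before the callback above, because nothing waited on the disk.
order.push('after readdir');
```

The callback always comes last, even though it appears in the middle of the file. While the disk is busy, Node is free to do other work, which is where the throughput comes from.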

This design is ideal for optimizing your code's throughput and scalability, which is a big reason why it has become so popular. For example, someone got it to handle 600,000 concurrent websocket connections, which is insane. Now, he had to do a bit of custom configuration, but that doesn't make it any less impressive. This is exactly why companies like IBM, Microsoft, and PayPal are using Node for their web services.

Now, Node doesn't even need to be fast to make it attractive. One of my favorite features is actually the package manager, npm. A lot of languages lack a good package manager like this. npm is a command line tool you can use to initialize modules, manage dependencies, or run tests, among other things.

The public repository is open for anyone to download and publish code to. As of the time of this writing, npm is hosting over 210,000 modules, ranging from websites to command line tools to API wrappers.

Here is an example of a package I created. You can see that the main page is the README, which describes what the package does and how to use it. You also get a quick rundown of other info, like number of downloads, the repository location, and the software license used.

What Node is good for

Among other things, Node is probably best-suited for building websites and tools that require real-time, two-way interaction. Chat sites/apps are a good example of this since they're usually very IO-heavy. The non-blocking event-driven model allows it to handle lots of requests simultaneously.

It is also very good for creating the front-end for web APIs (via REST). This is because it is optimized for event-driven IO (which I already touched on) and it handles JSON natively, so there is very little to no parsing needed.
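As a quick illustration of the "handles JSON natively" point, here is a small sketch. The payload below is made up for the example; in a real API it would arrive in a request or response body.

```javascript
// A made-up API payload, as it might arrive over HTTP (a plain string)
var body = '{"city": "Omaha", "region": "Nebraska"}';

// Text -> JavaScript object, no extra library needed
var data = JSON.parse(body);
console.log(data.city + ', ' + data.region);

// JavaScript object -> text, ready to send back as a response
var reply = JSON.stringify({ ok: true, location: data.city });
console.log(reply);
```

Both functions are part of the language itself, so there is nothing to install or require.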

What Node is not good for

On the other end, let's see what Node is not good at. Most notably, it is ill-suited to heavy computational tasks, since a long-running calculation ties up the single thread that would otherwise be handling other requests. So if you want to do something like machine learning with Node, you probably won't have the best experience.
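Here is a rough sketch of why heavy computation hurts. The loop below is just a stand-in for any CPU-bound job (the iteration count is arbitrary), and the timer stands in for any other work waiting its turn, such as an incoming request.

```javascript
var timerFired = false;

// Ask for a callback as soon as possible
setTimeout(function() {
    timerFired = true;
    console.log('Timer finally ran');
}, 0);

// Simulate heavy number crunching on the main thread
var sum = 0;
for (var i = 0; i < 50000000; i++) {
    sum += i % 7;
}

// The timer could not run while the loop was busy
console.log('Loop done, timer fired yet? ' + timerFired);
```

Even with a delay of 0, the timer callback has to wait until the loop is finished. In a web server, every connected user would be waiting along with it.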

Node is also still fairly young, so it's still under rapid development. In the past few months we've gone from v0.12.x to v5.1.x. So if you need something more stable than this, it probably isn't for you.

And as for the asynchronous programming "problem", I think the first part of this Quora answer does a good job explaining it:

[Its] lack of inherent code organization is a huge disadvantage. It gets especially exacerbated when the development team as a whole isn't familiar with asynchronous programming or standard design patterns. There are just too many ways for code to get unruly and unmaintainable.

Although asynchronous programming is a good thing overall, it does add complexity to your programs.

The Node REPL

Ok, on to some code. We're going to start out pretty simple and just run a few commands in the REPL (read-eval-print loop), which is just an application that lets you interactively run Node code in a shell. A program written here is executed piece-wise instead of all at once.

I'll assume you're already familiar with JavaScript, so we'll just go through some Node-specific things throughout this article.

Let's try out one of the built-in modules that come with Node, like the crypto module.

Assuming you have Node installed already, run the node command in your shell, and type in the following code to the prompt line by line:

var crypto = require('crypto');

crypto.createHash('md5').update('hello world').digest('hex');  

After entering the last line in the REPL, you should see 5eb63bbbe01eeed093cb22bb8f5acdc3 printed out to the console.

The crypto module is loaded using the require() function, which handles the resolution and loading of code for you. More info on how that works here.

Once the module has been loaded, you can use its functions, in this case createHash(). Since REPLs execute code piece-wise, they typically print out the returned value for each line, like you saw here.

You can use REPLs like this to quickly test out code without having to write it to a new file and execute it. It's almost like a general purpose sandbox environment.
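If you'd like something else to try in the REPL, here is a small sketch that wraps the same crypto calls in a function. The hashText() helper is a name I made up for this example; it isn't part of the crypto module.

```javascript
var crypto = require('crypto');

// Hypothetical helper: hash a string with the given algorithm and
// return the digest as a hex string
function hashText(algorithm, text) {
    return crypto.createHash(algorithm).update(text).digest('hex');
}

console.log(hashText('md5', 'hello world'));    // same value as in the REPL above
console.log(hashText('sha256', 'hello world')); // a longer, 64-character digest
```

Swapping the algorithm name is all it takes to switch hash functions, which is handy for quick experiments like this.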

Your first program

REPLs are fun and all, but they will only get us so far. So let's move on and write our first real Node program. We won't worry about using third-party modules quite yet (but don't worry, we will later), so let's see what built-in code is available to us first. There is a good amount of code provided to you already, including (but not limited to):

  • fs: Simple wrappers provided over standard POSIX functions
  • http: A lower-level HTTP server and client
  • os: Provides some basic methods to tell you about the underlying operating system
  • path: Utilities for handling and transforming file paths
  • url: Utilities for URL resolution and parsing
  • util: Standard utility functions like debugging, formatting, and inspection

The amount of code built-in isn't on the level of Python, but it'll do the job. The real benefits come when you start getting into the third-party modules.
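To give you a feel for a few of the built-in modules listed above, here is a short sketch. The file path is hypothetical, and the url part assumes a reasonably recent version of Node where the URL class is available.

```javascript
var os = require('os');
var path = require('path');
var url = require('url');
var util = require('util');

// os: facts about the machine the script is running on
console.log('Platform: ' + os.platform() + ', CPUs: ' + os.cpus().length);

// path: pull a file path apart
var file = '/projects/twenty/index.js'; // hypothetical path
console.log(path.basename(file)); // index.js
console.log(path.extname(file));  // .js

// url: pull a URL apart
var parsed = new url.URL('http://ipinfo.io/json');
console.log(parsed.hostname + ' ' + parsed.pathname); // ipinfo.io /json

// util: printf-style string formatting
var line = util.format('%s has %d parts', 'This demo', 4);
console.log(line); // This demo has 4 parts
```

None of these need to be installed; require() finds them because they ship with Node itself.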

For our first program, we'll create a simple utility that determines your location using your IP address (kinda creepy, I know):

var http = require('http');

var options = {  
    hostname: 'ipinfo.io',
    port: 80,
    path: '/json',
    method: 'GET'
};

var req = http.request(options, function(res) {  
    var body = '';

    res.setEncoding('utf8');
    res.on('data', function(chunk) {
        body += chunk;
    });

    res.on('end', function() {
        var json = JSON.parse(body);
        console.log('Your location: ' + json.city + ', ' + json.region);
    });
});

req.end();  

Copy the code above and paste it in a file named 'index.js'. Then, on the command line, navigate to the directory with the file you just created and run it with:

$ node index.js

You should see 'Your location: [CITY], [REGION]' printed to the command line. The city/region printed out will probably be pretty close to you, but not exact. Also, if no city/region is printed then that just means your IP info wasn't in the database.
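The program above assumes the response is always valid JSON with a city and region. As a sketch of how you might guard against the missing-data case, here is a small helper. The name formatLocation() and the sample bodies are made up for illustration; they aren't part of the original program or of the service's documented output.

```javascript
// Hypothetical helper: turn the raw response body into a printable line,
// without crashing on bad JSON or missing fields
function formatLocation(body) {
    var json;
    try {
        json = JSON.parse(body);
    } catch (e) {
        return 'Could not read the response';
    }

    if (!json || !json.city || !json.region) {
        return 'Location unknown';
    }
    return 'Your location: ' + json.city + ', ' + json.region;
}

// Made-up sample bodies, standing in for what the service might send back
console.log(formatLocation('{"city": "Omaha", "region": "Nebraska"}'));
console.log(formatLocation('{"ip": "10.0.0.1"}'));
console.log(formatLocation('<html>Service unavailable</html>'));
```

Inside the 'end' handler you could then print formatLocation(body) instead of building the string by hand. Registering a handler with req.on('error', ...) before calling req.end() would cover network failures in a similar way.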

Since this code doesn't use any 3rd-party dependencies, it doesn't need to have a package.json file or node_modules folder, which we'll explain more about in the next section.

Your first package

Note that throughout this article I use 'package' and 'module' interchangeably.

For just about every website/tool/project you create with Node.js you'll also want to create a module around it. This is so you can specify dependencies, tests, scripts, repositories, etc.

A typical module consists of a few important things:

  • package.json: A JSON file containing all of the module information
  • node_modules/: A directory containing all the dependencies
  • index.js: The main code file
  • README.md: Documentation about the module
  • test/: A directory of tests for the module

There are a ton of other things you can add to a module (like a .npmignore file, a docs directory, or editor configuration files), but the things listed above are some of the most common you'll see.

To show how all of this works, throughout the rest of this section we'll create our own package that builds on the previous example.

Instead of just telling you your location based on your IP address, we'll use some popular Node packages to create a tool that lets you find the location of any website's server. We'll call it twenty (see why).

Initializing the package

First, create and navigate to a new directory for your project:

$ mkdir twenty
$ cd twenty

Then, use npm to initialize the project:

$ npm init

This utility will walk you through creating a package.json file.  
It only covers the most common items, and tries to guess sane defaults.

See `npm help json` for definitive documentation on these fields  
and exactly what they do.

Use `npm install <pkg> --save` afterwards to install a package and  
save it as a dependency in the package.json file.

Press ^C at any time to quit.  
name: (twenty)  
version: (0.0.1)  
description: Locates the city/region of a given URL/IP address  
entry point: (index.js)  
test command:  
git repository:  
keywords:  
license: (MIT)  
About to write to /Users/scott/projects/twenty/package.json:

{
  "name": "twenty",
  "version": "0.0.1",
  "description": "Locates the city/region of a given URL/IP address",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Scott Robinson <scott@stackabuse.com> (http://stackabuse.com)",
  "license": "MIT"
}


Is this ok? (yes) yes  

Fill out each prompt (starting at name: (twenty)), or don't enter anything and just press return to use the default settings. This will create a properly configured package.json file containing the following JSON:

{
  "name": "twenty",
  "version": "0.0.1",
  "description": "Locates the city/region of a given URL/IP address",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Scott Robinson <scott@stackabuse.com> (http://stackabuse.com)",
  "license": "MIT"
}

This file is your starting point where all the project-specific info is saved.

Install dependencies

To add dependencies to your project, you can use the npm install command from the command line. For example, to add our first dependency, request, try running this:

$ npm install --save request

The install command will download the latest request package from npm and save it in the node_modules directory. Adding the --save flag tells npm to save the package details in package.json under the 'dependencies' section:

"dependencies": {
    "request": "2.67.0"
}

Now you'll be able to use the request module anywhere in your project code.

Improving the code

The request module gives you functions to easily make all kinds of HTTP requests. The HTTP example we showed above wasn't too bad, but request makes the code even more compact and easier to read. The equivalent code using request would look like this:

var request = require('request');

request('http://ipinfo.io/json', function(error, response, body) {  
    var json = JSON.parse(body);
    console.log('Your location: ' + json.city + ', ' + json.region);
});

It would be more interesting if we could find the location of any IP address, and not just our own, so let's allow the user to enter an IP address as a command line argument. Like this:

$ node index.js 8.8.8.8

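To access this argument within our program, Node makes it available in the global process object as process.argv, which is an array. For the command we just executed above, process.argv would look like ['node', 'index.js', '8.8.8.8'] (newer versions of Node expand the first two entries to absolute paths), so the arguments you typed start at index 2.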
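Here is a minimal sketch of reading arguments straight from process.argv, without any helper package. The getUserArgs() function is a hypothetical name used only for this example.

```javascript
// Hypothetical helper: drop the first two entries (the node binary and
// the script path) and keep only what the user typed
function getUserArgs(argv) {
    return argv.slice(2);
}

// The array from the example command above
var example = ['node', 'index.js', '8.8.8.8'];
console.log(getUserArgs(example)); // [ '8.8.8.8' ]

// The real thing: whatever was passed to this script
console.log(getUserArgs(process.argv));
```

For a single positional argument this is all you need. Things only get tedious once you add flags and options.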

To make things even easier, we'll use the yargs package to help us parse out command line arguments. With a simple program like this, yargs isn't really necessary, but I'll be improving on twenty in a later article, so we might as well add it now.

Just like request, we'll install it with:

$ npm install --save yargs

Modifying the code to use yargs to grab the argument (or default to our own IP if no argument was given), we end up with this:

var request = require('request');  
var argv = require('yargs').argv;

var path = 'json';

path = argv._[0] || path;

request('http://ipinfo.io/' + path, function(error, response, body) {  
    var json = JSON.parse(body);
    console.log('Server location: ' + json.city + ', ' + json.region);
});

So far this tool is great for command line usage, but what if someone wants to use it as a dependency in their own code? As of now the code hasn't been exported, so you wouldn't be able to use it anywhere else but the command line. To tell Node which functions/variables to make available, we can use module.exports.

var request = require('request');  
var argv = require('yargs').argv;

var findLocation = function(ip, callback) {
    // Support findLocation(callback): with no IP given, look up our own address
    if (typeof ip === 'function') {
        callback = ip;
        ip = null;
    }
    var path = ip || 'json';

    request('http://ipinfo.io/' + path, function(error, response, body) {
        if (error) return callback(error);
        var json = JSON.parse(body);
        callback(null, json.city + ', ' + json.region);
    });
};

module.exports = findLocation;  

Great! Now anyone that downloads this package can require it anywhere in their code and use the findLocation() function.
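If the relationship between module.exports and require() still feels abstract, here is a self-contained sketch of the mechanism. To keep it runnable as a single script, it writes a tiny stand-in module to the OS temp directory first; the greet() function and the file name are made up and have nothing to do with twenty itself.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write a tiny stand-in module to the temp directory. In a real project this
// would simply be a file such as index.js sitting next to your other code.
var modulePath = path.join(os.tmpdir(), 'greeter-demo-' + process.pid + '.js');
fs.writeFileSync(modulePath, [
    'var greet = function(name) {',
    '    return "Hello, " + name;',
    '};',
    '',
    '// Only what is assigned here is visible to the outside world',
    'module.exports = greet;'
].join('\n'));

// require() runs the file once and hands back whatever it exported
var greet = require(modulePath);
console.log(greet('Node'));

// Clean up the throwaway file
fs.unlinkSync(modulePath);
```

Anything in the file that isn't attached to module.exports stays private to that file, which is why the export step matters.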

But, you may have noticed that now we can't use it as a command line tool anymore. We don't want to put the rest of the old code in there like this, though:

var request = require('request');  
var argv = require('yargs').argv;

var findLocation = function(ip, callback) {
    // Support findLocation(callback): with no IP given, look up our own address
    if (typeof ip === 'function') {
        callback = ip;
        ip = null;
    }
    var path = ip || 'json';

    request('http://ipinfo.io/' + path, function(error, response, body) {
        if (error) return callback(error);
        var json = JSON.parse(body);
        callback(null, json.city + ', ' + json.region);
    });
};

var arg = argv._[0];

// This runs every time the file is loaded
findLocation(arg, function(err, location) {
    if (err) return console.log('Lookup failed: ' + err.message);
    console.log('Server location: ' + location);
});

module.exports = findLocation;  

This would be bad because then any time someone require()s this file to use the findLocation() function it'll print their own location to the command line. We need a way to determine if this file was called directly with node index.js and not by require(), so if it was called directly then we'll check the command line for arguments. This can be done by checking require.main against module, like this: if (require.main === module) {...}, which leaves us with:

var request = require('request');  
var argv = require('yargs').argv;

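var findLocation = function(ip, callback) {
    // Support findLocation(callback): with no IP given, look up our own address
    if (typeof ip === 'function') {
        callback = ip;
        ip = null;
    }
    var path = ip || 'json';

    request('http://ipinfo.io/' + path, function(error, response, body) {
        if (error) return callback(error);
        var json = JSON.parse(body);
        callback(null, json.city + ', ' + json.region);
    });
};

// Only run the CLI part when this file is executed directly
if (require.main === module) {
    findLocation(argv._[0], function(err, location) {
        if (err) return console.log('Lookup failed: ' + err.message);
        console.log('Server location: ' + location);
    });
}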

module.exports = findLocation;  

Now we can use this code both on the command line and as a dependency.

Note: There is a better way to do the CLI/library hybrid, but we'll keep it simple and stick with this method for now. See twenty on GitHub for more info, specifically the bin directory and package.json settings.

Publishing your package

Finally, we'll want to make it available to others on npm. All you need to do to make the package available is run this in the package directory:

$ npm publish

If you haven't logged in to npm on this machine yet, run npm adduser first; you'll be prompted for your username and password. After that, npm publish pushes the code to the registry.

Keep in mind that you'll either need to scope your package or change its name since the name 'twenty' is already taken by me.

Where to go from here

With Node's popularity booming, there are tons of resources all over the internet, including some articles on this site =)

Here are a few from Stack Abuse that might be helpful:

And here are some other good resources around the internet:

Just keep in mind that the vast majority of what you learn won't come from videos or articles; it'll come from exploring the language, tools, and packages on your own.

So while articles like this are nice for getting started, make sure you focus more on writing code than reading about someone else writing code. Experience trumps everything else.

Conclusion

We only covered a small fraction of what Node and npm have to offer, so check out some of the resources I linked to above to find out more.

And I can't stress enough how important it is for you to get experience actually writing code. npm makes it really easy to browse packages and find their repositories. So find a package that's useful or interesting to you and see how it works underneath.

Are you a Node novice? What other Node topics do you want to learn about? Let us know in the comments!