Reading a File Line by Line in Node.js

Introduction

In computer science, a file is a resource used to record data discretely on a computer's storage device. Node.js doesn't redefine this concept in any way and works with anything your filesystem considers a file.

Reading files and resources has many uses:

  • Statistics, Analytics, and Reports
  • Machine Learning
  • Dealing with large text files or logs

Sometimes, these files can be absurdly large, with gigabytes or terabytes being stored, and reading through them in their entirety is inefficient.

Being able to read a file line by line gives us the ability to seek only the relevant information and stop the search once we have found what we're looking for. It also allows us to break the data up into logical pieces, as when the file is CSV-formatted.

Readline (from v0.12 onward)

Node.js has a native readline module that allows us to read files line by line. It was added in 2015 and is intended to read from any Readable stream one line at a time.

This makes it a versatile option, suited not only for files but also for command line input such as process.stdin. The documentation for the readline module can be found in the official Node.js docs.
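For instance, here's a minimal sketch, not part of the original walkthrough, that points readline at process.stdin and echoes back every line typed into the terminal (we'll break down createInterface() for the file case below):

const readline = require('readline');

// Read lines from the terminal instead of a file
const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
});

rl.on('line', function(line) {
    console.log('You typed: ' + line);
});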

As readline is a native module, you don't have to use npm or any other package manager to add it - just require it:

const readline = require('readline');  

and you're good to go!

As readline should be supplied with a stream, we have to create one first using another native module - fs:

const fs = require('fs');  

The next step is to create the object that will read from the stream using the createInterface() function:

const readInterface = readline.createInterface({
    input: fs.createReadStream('/path/to/file'),
    crlfDelay: Infinity // treat \r\n as a single line break
});

Make sure you substitute /path/to/file with the actual path to a file in your filesystem.

Once the preparation is done, reading the file line by line and printing its content to the console can be done by listening to the line event:

readInterface.on('line', function(line) {  
    console.log(line);
});

Here we're essentially saying that whenever the line event occurs on the readInterface, it should call our function and pass it the line read from the stream. In our case, we don't want to overcomplicate things, so we just print it out to the console.
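Putting it all together, here's a sketch of a slightly fuller version that stops as soon as it finds a line containing the word "STOP" (an arbitrary search term) and reacts to the close event that readline emits once the interface is closed or the stream ends:

const fs = require('fs');
const readline = require('readline');

const readInterface = readline.createInterface({
    input: fs.createReadStream('/path/to/file'),
    crlfDelay: Infinity
});

readInterface.on('line', function(line) {
    console.log(line);
    if (line.includes('STOP')) {
        // Stop emitting further 'line' events once we've found what we need
        readInterface.close();
    }
});

readInterface.on('close', function() {
    console.log('Done reading.');
});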

Line-Reader

After a detailed look at how you can read a file line by line using the native Node.js module, let's take a look at a shorter way of doing it using the open-source line-reader module from npm.

As it's not a native module, we need to make sure we've initialized an npm project with npm init and then install it:

$ npm install --save line-reader

This will install the dependency and add it to the package.json file.

Once that's done, reading a file line by line is similar to the previous example, only without creating a readInterface in between:

const lineReader = require('line-reader');

lineReader.eachLine('/path/to/file', function(line) {  
    console.log(line);
});

A quite useful feature here is the ability to stop reading once some condition becomes true. This is achieved by simply returning false from the callback function.

For example, we could read a file line by line until we find a line that has the word "STOP" in it:

lineReader.eachLine('/path/to/file', function(line) {
    console.log(line);
    if (line.includes('STOP')) {
        return false; // stop reading
    }
});

There's a slightly different approach, which uses two nested callbacks and syntax that may seem more natural to the Java developers out there:

lineReader.open('/path/to/file', function(err, reader) {
    if (err) throw err;
    if (reader.hasNextLine()) {
        reader.nextLine(function(err, line) {
            if (err) throw err;
            console.log(line);
        });
    }
});

Here, we're using the open() function, which doesn't provide us with the lines from a file instantly, but rather gives us a reader. The reader has its own set of functions, like hasNextLine() and nextLine(), which allow us to have a bit more control over the process of reading a file line by line in Node.js.
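If you wanted to consume the whole file this way rather than just the first line, one option is a small recursive helper around hasNextLine() and nextLine(); this is only a sketch, the readAll name is ours, and it assumes the reader also exposes close(), as recent versions of line-reader do:

lineReader.open('/path/to/file', function(err, reader) {
    if (err) throw err;

    // Keep pulling lines until hasNextLine() reports there are none left
    function readAll() {
        if (!reader.hasNextLine()) {
            return reader.close(function(err) {
                if (err) throw err;
            });
        }
        reader.nextLine(function(err, line) {
            if (err) throw err;
            console.log(line);
            readAll();
        });
    }

    readAll();
});

For very large files you'd likely prefer eachLine(), which handles this iteration for you.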

N-readlines

A different syntax is provided by the npm module n-readlines.

Let's install it:

$ npm install --save n-readlines

And require it:

const lineByLine = require('n-readlines');  

In order to read from a file, we create a new object, providing the path to our file as an argument:

const liner = new lineByLine('/path/to/file');  

Getting the lines from the file is done by calling the next() function:

let line;

// next() returns a Buffer with the line's content, or false once the end of the file is reached
while (line = liner.next()) {
    console.log(line.toString());
}

An interesting function of the n-readlines module is reset(). It resets the pointer and starts the reading process from the very beginning of the file.

Note: It only works if the end of the file hasn't been reached yet.
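For instance, here's a short sketch that reads the first two lines, rewinds with reset(), and reads the first line again (it assumes the file has at least two lines, so the end isn't reached before the reset):

const lineByLine = require('n-readlines');
const liner = new lineByLine('/path/to/file');

// Read and print the first two lines
console.log(liner.next().toString());
console.log(liner.next().toString());

// Rewind to the very beginning of the file...
liner.reset();

// ...so the next call returns the first line again
console.log(liner.next().toString());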

Common Mistakes

A common mistake when reading a file line-by-line in Node.js is reading the whole file into memory and then splitting its content by line breaks.

Here's an incorrect example which might overload your system if you provide it a large enough file:

require('fs').readFileSync('/path/to/file', 'utf-8').split(/\r?\n/).forEach(function(line) {  
    console.log(line);
});

At first glance, the output seems to be the same as with the previous approaches, and in fact it works fine for small files. But go ahead and try it with a big one - the memory usage is definitely not something you want to see in your production system.
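If you'd like to see this for yourself, one way is to generate a throwaway test file first; this is only a rough sketch, and the file name big-test.txt and the number of lines are arbitrary choices:

const fs = require('fs');

// Write a million short lines to a throwaway test file
const out = fs.createWriteStream('big-test.txt');
for (let i = 0; i < 1e6; i++) {
    out.write('This is line number ' + i + '\n');
}

out.end(function() {
    console.log('Test file written.');
});

Running the readFileSync snippet above against the generated file, then the earlier readline version, and logging process.memoryUsage().heapUsed in each case makes the difference very visible.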

Conclusion

There are multiple ways of reading a file line by line in Node.js, and the selection of the appropriate approach is entirely a programmer's decision.

You should think about the size of the files you plan to process, performance requirements, code style, and the modules that are already in the project. Make sure to test corner cases like huge, empty, or non-existent files, and you'll be good to go with any of the provided examples.
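On the non-existent file point in particular, none of the snippets above handle a missing file gracefully; with the readline approach, for instance, the underlying stream will emit an error event that crashes the process unless you attach a handler, as in this sketch:

const fs = require('fs');
const readline = require('readline');

const input = fs.createReadStream('/path/to/file');

// Without this handler, a missing file results in an unhandled
// 'error' event and terminates the process
input.on('error', function(err) {
    console.error('Could not read the file: ' + err.message);
});

const readInterface = readline.createInterface({ input: input });

readInterface.on('line', function(line) {
    console.log(line);
});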
