Top 10 Mistakes Made by Node.js Developers

From its inception, Node.js has been met with a mix of accolades and critiques, a debate that continues today. Often lost in these discussions is the fact that every programming language and platform has weaknesses, often stemming from how we use them. Regardless of how difficult Node.js may make writing secure code, or how easy it makes writing highly concurrent code, the platform has a proven track record. Over the years, it has been instrumental in building countless robust and sophisticated web services that scale effectively and have demonstrated their reliability through their longevity online.

However, Node.js, like any other platform, is susceptible to developer errors. Some of these mistakes can hinder performance, while others can make Node.js seem entirely unsuitable for the task at hand. This article will explore ten common pitfalls encountered by developers new to Node.js, and how to sidestep them on the path to becoming proficient with the platform.


Mistake #1: Obstructing the Event Loop

Node.js, like browser-based JavaScript, provides a single-threaded environment. This means that concurrency is achieved not by running multiple parts of your application simultaneously, but rather through the asynchronous handling of I/O-bound tasks. For instance, while Node.js waits for the database engine to retrieve a document, it can shift its attention to another part of the application:

// Trying to fetch a user object from the database. Node.js is free to run other parts of the code from the moment this function is invoked...
db.User.get(userId, function(err, user) {
	// .. until the moment the user object has been retrieved here
})

However, introducing a CPU-intensive code block in a Node.js instance handling thousands of client connections can bring the event loop to a standstill, forcing all clients to wait. CPU-bound operations include actions like sorting a massive array, executing an excessively long loop, and similar tasks. For instance:

function sortUsersByAge(users) {
	users.sort(function(a, b) {
		return a.age < b.age ? -1 : 1
	})
}

While invoking the “sortUsersByAge” function might be acceptable with a small “users” array, using it on a large array will severely impact overall performance. If this operation is unavoidable and you are confident that nothing else is awaiting the event loop (for example, as part of a command-line tool where synchronous execution is inconsequential), then this might not be an issue. However, within a Node.js server instance managing thousands of concurrent users, such a pattern can be disastrous.

If the user array is retrieved from a database, a more efficient solution would be to fetch it pre-sorted directly from the database. If a loop calculating the sum of a large financial transaction history is blocking the event loop, offloading it to an external worker/queue system can prevent the event loop from being monopolized.
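If the heavy work truly has to happen inside your own process, one option on modern Node.js versions (a technique beyond the callback-era examples in this article) is to push it onto a separate thread with the built-in worker_threads module. The sketch below is only illustrative; the file name “sort-worker.js” and the function “sortUsersByAgeAsync” are made up for the example:

// sort-worker.js — a minimal sketch: sort off the main event loop and post the result back
var { parentPort, workerData } = require('worker_threads')
parentPort.postMessage(workerData.users.sort(function(a, b) {
	return a.age - b.age
}))

// app.js — hand the array to the worker and receive the sorted copy via a callback
var { Worker } = require('worker_threads')

function sortUsersByAgeAsync(users, done) {
	var worker = new Worker('./sort-worker.js', { workerData: { users: users } })
	worker.once('message', function(sorted) { done(null, sorted) })
	worker.once('error', done)
}

The event loop stays free while the worker thread does the sorting; the cost is that the array is copied when it crosses the thread boundary.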

As you can see, there isn’t a one-size-fits-all solution for this type of Node.js challenge; each instance needs individual attention. The core principle is to avoid performing CPU-heavy tasks within the front-facing Node.js instances that clients connect to concurrently.

Mistake #2: Multiple Callback Invocations

Callbacks have been integral to JavaScript since its early days. Web browsers handle events by passing references to (often anonymous) functions that act like callbacks. In Node.js, callbacks were once the sole method for asynchronous elements to communicate, until the introduction of promises. Despite this, callbacks remain in use, and package developers still structure their APIs around them. A frequent issue in Node.js related to callbacks involves invoking them multiple times. Typically, a function provided by a package to perform an asynchronous operation expects a function as its final argument, which is called upon completion of the asynchronous task:

module.exports.verifyPassword = function(user, password, done) {
	if(typeof password !== 'string') {
		done(new Error('password should be a string'))
		return
	}

	computeHash(password, user.passwordHashOpts, function(err, hash) {
		if(err) {
			done(err)
			return
		}
		
		done(null, hash === user.passwordHash)
	})
}

Note the presence of a return statement each time “done” is called, except for the final instance. This is because invoking the callback doesn’t automatically terminate the current function’s execution. If the first “return” were omitted, providing a non-string password to this function would still lead to “computeHash” being called. Depending on how “computeHash” handles such scenarios, “done” might be called multiple times. Developers utilizing this function elsewhere might be caught off guard when their provided callback is invoked repeatedly.

Careful coding is the key to averting this Node.js pitfall. Some Node.js developers make it a practice to include a return keyword before each callback invocation:

1
2
3
if(err) {
	return done(err)
}

Since the return value often holds little significance in many asynchronous functions, this approach provides a straightforward way to avoid such problems.
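Another defensive measure is to make the callback itself impossible to fire twice. Here is a minimal sketch of a hypothetical “callOnce” helper (not part of any library mentioned in this article) that silently ignores repeat invocations:

// A hypothetical "callOnce" wrapper: even if a coding slip invokes "done" twice,
// only the first invocation gets through.
function callOnce(fn) {
	var called = false
	return function() {
		if(called) return
		called = true
		return fn.apply(this, arguments)
	}
}

module.exports.verifyPassword = function(user, password, done) {
	done = callOnce(done)
	// ... same body as before
}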

Mistake #3: The Perils of Callback Nesting

The practice of deeply nesting callbacks, often dubbed “callback hell”, isn’t a Node.js problem in itself. However, it can quickly lead to unmanageable code:

function handleLogin(..., done) {
	db.User.get(..., function(..., user) {
		if(!user) {
			return done(null, 'failed to log in')
		}
		utils.verifyPassword(..., function(..., okay) {
			if(!okay) {
				return done(null, 'failed to log in')
			}
			session.login(..., function() {
				done(null, 'logged in')
			})
		})
	})
}

The complexity of the task directly correlates with the severity of this issue. By nesting callbacks in this way, we end up with code that is prone to errors, difficult to comprehend, and a nightmare to maintain. One workaround is to break down these tasks into smaller functions and then connect them. However, arguably one of the most elegant solutions is to employ a utility Node.js package designed to manage asynchronous JavaScript patterns, such as Async.js:

function handleLogin(done) {
	async.waterfall([
		function(done) {
			db.User.get(..., done)
		},
		function(user, done) {
			if(!user) {
				return done(null, 'failed to log in')
			}
			utils.verifyPassword(..., function(..., okay) {
				done(null, user, okay)
			})
		},
		function(user, okay, done) {
			if(!okay) {
				return done(null, 'failed to log in')
			}
			session.login(..., function() {
				done(null, 'logged in')
			})
		}
	], function() {
		// ...
	})
}

Async.js offers a variety of functions, like “async.waterfall”, to handle different asynchronous scenarios. For the sake of brevity, we’ve used simplified examples, but real-world situations are often far more intricate.
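The other workaround mentioned above — splitting the flow into small named functions — can look like the following sketch. The “...” placeholders stand for whatever arguments the real application would pass, exactly as in the previous examples:

// A minimal sketch: each step of the login flow becomes a named function,
// so the nesting never grows deeper than one level.
function handleLogin(..., done) {
	db.User.get(..., onUser)

	function onUser(err, user) {
		if(err || !user) {
			return done(err, 'failed to log in')
		}
		utils.verifyPassword(..., onPassword)
	}

	function onPassword(err, okay) {
		if(err || !okay) {
			return done(err, 'failed to log in')
		}
		session.login(..., function() {
			done(null, 'logged in')
		})
	}
}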

Mistake #4: Expecting Synchronous Callback Execution

While asynchronous programming with callbacks isn’t unique to JavaScript and Node.js, these technologies have popularized it. In most other programming languages the execution order is predictable: statements run sequentially unless you explicitly direct otherwise with conditionals, loops, and function calls. With callbacks, however, a function may not run until the task it waits on has completed, while the function that registered it continues to execute to its end without pausing:

function testTimeout() {
	console.log('Begin')
	setTimeout(function() {
		console.log('Done!')
	}, 1000)
	console.log('Waiting..')
}

Calling the “testTimeout” function will first display “Begin”, followed by “Waiting..” and finally “Done!” after approximately one second.

Any action that needs to occur after a callback has fired must be called from within that callback.
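A minimal sketch of the trap, reusing the “db.User.get” call from the first example (the variable names are illustrative): the log statement after the call runs before the callback ever fires, so the value is still undefined at that point.

var user
db.User.get(userId, function(err, result) {
	user = result
})
console.log(user) // undefined — the callback has not run yet

// Correct: consume the value inside the callback itself.
db.User.get(userId, function(err, result) {
	console.log(result)
})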

Mistake #5: “exports” vs. “module.exports”

Node.js treats each file as a self-contained module. If your package includes two files, say “a.js” and “b.js”, and “b.js” needs access to functions defined in “a.js”, then “a.js” must export these functions by adding them as properties to the exports object:

// a.js
exports.verifyPassword = function(user, password, done) { ... }

By doing so, any code requiring “a.js” will receive an object containing the “verifyPassword” function as a property:

// b.js
require('./a.js') // { verifyPassword: function(user, password, done) { ... } }

However, if you want to export this function directly, rather than as a property of an object, you need to overwrite module.exports itself. Be careful here: reassigning the local “exports” variable alone will not work, because that merely rebinds the variable without changing what the module exposes:

// a.js
module.exports = function(user, password, done) { ... }

Here, “exports” is treated as a property of the module object. The distinction between “module.exports” and “exports” is crucial, often tripping up developers new to Node.js.
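The difference is easiest to see side by side. A minimal sketch (the function signatures mirror the earlier examples):

// a.js — "exports" starts out as a reference to "module.exports"; reassigning the
// local variable breaks that link, so nothing new gets exported.
exports.verifyPassword = function(user, password, done) {}    // exported as a property
exports = function(user, password, done) {}                   // NOT exported — only rebinds the local variable
module.exports = function(user, password, done) {}            // exported as the module's value itself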

Mistake #6: Throwing Errors Within Callbacks

JavaScript supports the concept of exceptions. Mirroring the syntax found in most traditional languages with exception handling, like Java and C++, JavaScript allows you to “throw” and catch exceptions using try-catch blocks:

function slugifyUsername(username) {
	if(typeof username !== 'string') {
		throw new TypeError('expected a string username, got '+(typeof username))
	}
	// ...
}

try {
	var usernameSlug = slugifyUsername(username)
} catch(e) {
	console.log('Oh no!')
}

However, try-catch blocks don’t behave as you might expect in asynchronous scenarios. For instance, trying to safeguard a large code block containing numerous asynchronous operations within a single try-catch block might not work as intended:

try {
	db.User.get(userId, function(err, user) {
		if(err) {
			throw err
		}
		// ...
		usernameSlug = slugifyUsername(user.username)
		// ...
	})
} catch(e) {
	console.log('Oh no!')
}

If the callback to “db.User.get” fires asynchronously, the scope containing the try-catch block might have already exited, rendering it unable to catch any errors thrown within the callback.

Node.js uses a different convention for asynchronous errors: rather than being thrown, an error is passed as the first argument to the callback (the familiar “(err, …)” pattern), with null in that position when everything went well.
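A minimal sketch of that convention, reusing the examples above: failures are handed to the callback’s first argument instead of being thrown across the asynchronous boundary.

function getUsernameSlug(userId, done) {
	db.User.get(userId, function(err, user) {
		if(err) {
			return done(err) // propagate the error; do not throw it here
		}
		try {
			done(null, slugifyUsername(user.username))
		} catch(e) {
			done(e) // a synchronous throw inside the callback is still catchable here
		}
	})
}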

Mistake #7: The Illusion of Integers

In JavaScript, all numbers are floating-point; there’s no distinct integer data type. While this might not seem like a significant issue initially (how often do you encounter numbers large enough to exceed the limits of floating-point representation?), problems tend to arise when you least expect them. Floating-point numbers can represent integers exactly only up to 2^53 in magnitude, so calculations beyond that limit silently lose precision. Surprisingly, the following expression evaluates to true in Node.js:

Math.pow(2, 53)+1 === Math.pow(2, 53)

Unfortunately, the peculiarities of numbers in JavaScript don’t end there. Despite Numbers being floating-point, operators designed for integers function as expected:

5 % 2 === 1 // true
5 >> 1 === 2 // true

However, unlike arithmetic operators, bitwise operators and shift operators only operate on the least significant 32 bits of large “integer” values. For instance, attempting to right-shift “Math.pow(2, 53)” by 1 will always result in 0. Similarly, performing a bitwise OR of 1 with the same large number will yield 1.

Math.pow(2, 53) / 2 === Math.pow(2, 52) // true
Math.pow(2, 53) >> 1 === 0 // true
(Math.pow(2, 53) | 1) === 1 // true

While you might not encounter scenarios requiring large numbers frequently, when you do, numerous big integer libraries are available. These libraries provide implementations of essential mathematical operations for high-precision numbers, such as node-bigint.
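Before reaching for a library, it can help to detect when a value has left the exactly-representable range. A minimal sketch, assuming an ES2015+ runtime (any reasonably recent Node.js version) where Number.MAX_SAFE_INTEGER and Number.isSafeInteger are available:

Number.MAX_SAFE_INTEGER === Math.pow(2, 53) - 1 // true (9007199254740991)
Number.isSafeInteger(Math.pow(2, 53) - 1)       // true
Number.isSafeInteger(Math.pow(2, 53))           // false — time to reach for a big integer library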

Mistake #8: Overlooking the Power of Streaming APIs

Let’s imagine building a basic proxy-like web server that fetches content from another web server to fulfill requests. For instance, a simple web server serving Gravatar images:

var http = require('http')
var crypto = require('crypto')

http.createServer()
.on('request', function(req, res) {
	var email = req.url.substr(req.url.lastIndexOf('/')+1)
	if(!email) {
		res.writeHead(404)
		return res.end()
	}

	var buf = new Buffer(1024*1024)
	http.get('http://www.gravatar.com/avatar/'+crypto.createHash('md5').update(email).digest('hex'), function(resp) {
		var size = 0
		resp.on('data', function(chunk) {
			chunk.copy(buf, size)
			size += chunk.length
		})
		.on('end', function() {
			res.write(buf.slice(0, size))
			res.end()
		})
	})
})
.listen(8080)

In this example, the image is fetched from Gravatar, read into a Buffer, and then served in response to the request. This approach might suffice for relatively small Gravatar images. However, consider a scenario where the proxied content is several gigabytes in size. A far more efficient approach would be:

http.createServer()
.on('request', function(req, res) {
	var email = req.url.substr(req.url.lastIndexOf('/')+1)
	if(!email) {
		res.writeHead(404)
		return res.end()
	}

	http.get('http://www.gravatar.com/avatar/'+crypto.createHash('md5').update(email).digest('hex'), function(resp) {
		resp.pipe(res)
	})
})
.listen(8080)

Here, the fetched image is directly piped to the client without the need to read the entire content into a buffer.
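One caveat worth remembering: piped streams do not forward 'error' events automatically, so each side needs its own handler. A minimal sketch on top of the example above, where “gravatarUrl” stands in for the URL built earlier:

http.createServer()
.on('request', function(req, res) {
	// ... build gravatarUrl exactly as in the examples above ...
	http.get(gravatarUrl, function(resp) {
		resp.on('error', function() {
			res.end() // headers may already be sent mid-stream; just terminate the response
		})
		resp.pipe(res)
	}).on('error', function() {
		res.writeHead(502) // the upstream request failed before any data arrived
		res.end()
	})
})
.listen(8080)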

Mistake #9: “console.log” - Not Your Debugging Companion

In Node.js, “console.log” provides a way to print almost anything to the console. Passing an object to it results in a JavaScript object literal representation. It accepts an arbitrary number of arguments, printing them neatly with spaces. While tempting for debugging, relying on “console.log” in production code is strongly discouraged.

Avoid scattering “console.log” statements throughout your code for debugging and then commenting them out later. Instead, leverage one of the excellent libraries designed specifically for this purpose, such as debug.

Packages like this offer convenient ways to enable or disable specific debug lines at application startup. For instance, debug allows you to suppress debug output to the terminal by simply not setting the DEBUG environment variable. Using it is straightforward:

// app.js
var debug = require('debug')('app')
debug('Hello, %s!', 'world')

To activate debug lines, execute your code with the environment variable DEBUG set to “app” or “*”:

DEBUG=app node app.js
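The debug package also matches colon-delimited namespaces against the DEBUG variable, wildcards included, so you can split output per subsystem. A minimal sketch (the namespace names are illustrative):

var debugHttp = require('debug')('app:http')
var debugDb = require('debug')('app:db')

debugHttp('incoming request for %s', '/users')
debugDb('query took %d ms', 42)

Run with DEBUG=app:* node app.js to see both, or DEBUG=app:db node app.js to see only the database lines.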

Mistake #10: The Missing Supervisor

Whether your Node.js code is deployed in production or running in your local development environment, a supervisor program to oversee and manage it is invaluable. A common practice among developers building modern applications is to embrace the “fail fast” philosophy. If an unexpected error occurs, instead of attempting to handle it, allow your program to crash and rely on a supervisor to restart it within seconds.

The benefits of supervisor programs extend beyond simply restarting crashed programs. These tools can also restart programs when specific files are modified, making the Node.js development process much smoother.

A plethora of supervisor programs are available for Node.js, including pm2, forever, nodemon, and supervisor.

Each tool comes with its strengths and weaknesses. Some excel at managing multiple applications on a single machine, while others shine in log management. Regardless, all are valid choices when you decide to incorporate a supervisor into your workflow.
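As a rough sketch of what day-to-day usage looks like — assuming you pick nodemon for development restarts and pm2 in production, which is only one of many reasonable pairings:

npm install -g nodemon pm2

nodemon app.js       # restarts the process whenever a watched file changes
pm2 start app.js     # keeps the process alive and restarts it if it crashes
pm2 logs             # tails the logs pm2 collects for its managed processes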

Conclusion

As you’ve seen, some of these Node.js pitfalls can have detrimental effects on your application. Others might cause frustration while attempting to implement even the simplest of tasks. While Node.js has lowered the barrier to entry for newcomers, certain areas remain prone to errors. Many of these issues will resonate with developers coming from other programming backgrounds, but they are particularly prevalent among developers new to Node.js. Fortunately, they are easily avoidable. This guide aims to equip beginners with the knowledge to write better Node.js code, ultimately leading to the development of more robust and efficient software for everyone.

Licensed under CC BY-NC-SA 4.0