6 NODE.JS MODULES YOU SHOULD BE USING

Node.js was created six years ago and took some time to gain popularity. But as its kinks have been worked out, many websites are finding success with it. One aspect of Node.js that makes it popular is the large number of third-party modules available. The list continues to grow, and, as a result, it's hard to figure out which modules you need and which you don't need. There are many “best modules” lists out there, but they tend to be quite random. In this story, we worked to develop a list of modules that would be highly valuable to the ProgrammableWeb audience.

Some Words About Asynchronous Programming

Remember that Node.js is a platform that uses JavaScript for its language. To master Node.js, you need to master JavaScript. JavaScript includes the ability to pass functions as arguments to other functions. Node.js takes advantage of this by using a callback mechanism. For example, if you want to make a database call, you provide your query parameters along with a function that gets called after the database call completes. But this opens up some oddities that, if you're not careful, will result in bugs.

If you have a line of code that follows immediately after your database call, will that line of code execute before or after the callback function occurs? Under normal situations, the line following will indeed execute before the callback. And what if in that following line you're trying to access some of the data you were planning to retrieve from the call to the database? You can't, because in all likelihood the data hasn't been retrieved yet.
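The ordering described above can be seen in a small sketch. The lookupSomething function here is a hypothetical stand-in for a database call (it uses a timer rather than a real driver), and the query string is only a placeholder:

```javascript
// Hypothetical stand-in for a database call: it hands back its result
// through a callback after a short delay, the way a real driver would.
function lookupSomething(query, callback) {
  setTimeout(function () {
    callback(null, { query: query, rows: ['first row'] });
  }, 10);
}

var events = [];   // records the order in which things happen
var data;          // filled in by the callback, later

lookupSomething('select ...', function (err, resp) {
  data = resp;
  events.push('callback ran');
  console.log(events.join(' -> '));
});

// This line runs first, while the lookup is still pending,
// so data is still undefined here.
events.push('next line ran, data is ' + typeof data);
```

The line after the call records that data is still undefined, and only later does the callback fill it in. Any code that needs the result belongs inside the callback (or in something the callback calls).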

This can sound like a mess, but it really isn't. It just means that to get the most out of Node.js, you need to master its asynchronous nature and how your JavaScript code works together with the asynchronicity. The article Understanding process.nextTick is, in my opinion, one of the best explanations of how Node.js uses what's called an event loop to handle its asynchronous nature. (The author himself notes that the article is a bit outdated, but it is still an excellent explanation.)

You also need to understand how Node.js now has a function called setImmediate, and how it differs from nextTick. You can learn that by reading the entire (but short) page in Node.js's documentation for timers.
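As a quick illustration of the difference, here is a minimal sketch that schedules one callback with each function and records the order in which everything runs. The order array is just an illustrative helper:

```javascript
var order = [];

// Scheduled first in the source, but runs last: setImmediate callbacks
// wait for a later phase of the event loop.
setImmediate(function () {
  order.push('setImmediate');
  console.log(order.join(', '));
});

// Runs as soon as the current code finishes, before the event loop moves on.
process.nextTick(function () {
  order.push('nextTick');
});

// Plain synchronous code always runs before either callback.
order.push('synchronous code');
```

Even though setImmediate is registered first, the nextTick callback runs ahead of it, and both run only after the synchronous code has finished.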

And finally, once you fully understand how all that works, you'll understand why the following modules will help you get your work done. When I write a Web application, I will typically do several steps with each incoming request. I might look up something in a database. I might save something else into a database table. I might log something, and so on. All of these happen in sequence, often requiring some results of the previous step, and most of them deal with callback functions. This can make for a mess in your code if you nest your callback functions:

lookupsomething(query, function(err, resp) {
    storesomething(data, function(err, resp) {
        logsomething(message, function(err, resp) {
            res.render(page);
        });
    });
});

With just a couple of steps, it's not awful, but if you need to insert a new step between two existing ones, it can become a headache. For this I use the module called async.js. This is our first module, which I discuss next.

Async.js Module

async.js has several functions that help you organize what might otherwise be a mess of callback functions with ordering and timing problems. One function is called waterfall. You provide waterfall with an array of functions, and waterfall calls them in sequence, one after the other. Each function gets a callback function, and at the end of the function you call the callback function, which results in the next function getting called. The beautiful thing here is that if you need to add a step in between two steps, you just insert it in between two functions in the array. Even better: While you're making calls to a callback, async.js helps manage your program's call stack and timing appropriately.
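To make the idea concrete, here is a simplified stand-in for a waterfall helper. It is a sketch for illustration, not async.js's actual code, although it follows the same calling shape: an array of functions plus a final callback. The three step functions and their data are made up.

```javascript
// Simplified stand-in for a waterfall helper (not async.js's actual code).
// Runs tasks in order; each task's results become the next task's arguments.
function waterfall(tasks, done) {
  var index = 0;
  function next(err) {
    var results = Array.prototype.slice.call(arguments, 1);
    if (err || index === tasks.length) {
      return done.apply(null, [err || null].concat(results));
    }
    var task = tasks[index++];
    // Defer the call so a long chain of steps does not grow the call stack.
    setImmediate(function () {
      task.apply(null, results.concat(next));
    });
  }
  next(null);
}

var steps = [];

waterfall([
  function lookupSomething(callback) {
    steps.push('lookup');
    callback(null, { id: 7 });
  },
  function storeSomething(record, callback) {
    steps.push('store ' + record.id);
    callback(null, 'stored ' + record.id);
  },
  function logSomething(message, callback) {
    steps.push('log');
    callback(null, message);
  }
], function (err, result) {
  if (err) { return console.error(err); }
  console.log(steps.join(' -> '), '=>', result);
});
```

Adding a step means inserting one more function into the array; nothing else has to be re-indented or re-nested.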

async.js also has a handy function called eachSeries for taking an array of data and calling a single function repeatedly, each time passing the next element in the array into the function. The reason you need this kind of thing is again because of the callback nature. Without the help of async.js, imagine you have an array of items and you want to insert each item into a database, with each insert requiring a separate database call. This might appear to work at first:

for (var i=0; i<data.length; i++) {
    writedata(data[i], function(err, resp) {
        ...
    });
}
console.log('Finished saving!');

The writedata function writes the data, and here it gets called multiple times. However, each call returns right away, before its write has completed, because of the asynchronous nature of the Node.js database drivers (assuming they were coded correctly). The callbacks are queued up and don't run until the current code has finished and control returns to the event loop. So when the console.log line runs, the data hasn't been saved yet; none of the callbacks has even fired.

And what do you do if one of the writes fails? That's where the eachSeries function comes in. Instead of writing the loop yourself, you provide a single function, which receives as its parameter the next value from the array. At the end of your function you call a callback function, which tells async.js to start the next iteration; pass it an error, and async.js stops iterating. You can learn about it and see examples in the async.js documentation.
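Here is a simplified stand-in for an eachSeries helper, written only to show the mechanism; it is not async.js's actual code. The writedata function is a hypothetical asynchronous write built on a timer, and the saved array stands in for the database.

```javascript
// Simplified stand-in for an eachSeries helper (not async.js's actual code).
function eachSeries(items, iterator, done) {
  var index = 0;
  function next(err) {
    // Stop on the first error, or when every item has been handled.
    if (err || index === items.length) { return done(err || null); }
    iterator(items[index++], next);
  }
  next();
}

// Hypothetical asynchronous write that fails on bad input.
var saved = [];
function writedata(item, callback) {
  setTimeout(function () {
    if (item === null) { return callback(new Error('cannot save null')); }
    saved.push(item);
    callback(null);
  }, 5);
}

eachSeries(['a', 'b', 'c'], writedata, function (err) {
  if (err) { return console.error('Stopped early:', err.message); }
  // Only reached after every write has called back.
  console.log('Finished saving!', saved);
});
```

Unlike the for loop, the "Finished saving!" message sits in the final callback, so it prints only after the last write has completed, and an error in any write ends the series early.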

Q Module

While async.js is great, another aspect to Node.js programming is using what are called promises. A promise is an object that represents the eventual result of an asynchronous operation; you attach functions to it, and they get called once the operation completes. The approach looks like this:

someDataLookup()
    .next(anotherfunction)
    .next(yetanotherfunction);

The first function gets called, and, when it's finished, the function called anotherfunction gets called. Then after that's finished, the function called yetanotherfunction gets called.

As an exercise to fully understand how promises work, I'd like to challenge you to first consider how the promise function itself comes into existence and how it could get called, and then try building a mechanism yourself without using a helper library. This will help you understand the callback nature and timing nature of both Node.js and JavaScript itself.

Look at the first function; you call it manually by adding the parentheses after it. But look at the next two functions: anotherfunction and yetanotherfunction. In these two cases, you're not immediately calling them; instead, you're leaving off the parentheses and passing them into the next function. The next() function is what calls them. And where does the next function come from? The someDataLookup function must return an object containing a next function. And in the second call to next, the next function must also return an object containing a next function, so that it can be called again.

But when exactly does next() get called? This is all tricky and important in understanding how promises work. One great way to learn is by doing. Think about the questions I posed and how you would have to answer them. Try writing some code that does exactly what I just described. You'll quickly see what must take place inside someDataLookup, what it must return, what the next function must do, and what the two other functions must do. Then you'll be ready to make use of some sophisticated libraries that help you with promises. One is called Q. Q takes away the hard work of building your own promise system. (Note that in Q, as in other promise libraries, the chaining method is named then rather than next.) For example, if you have to do a series of database calls and want to use promises to make them happen, that's where Q can help.
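If you would like something to compare your own attempt against, here is one possible hand-built version. It is a learning sketch, not how Q is implemented, and it simplifies things: makeChain, finish, and the done argument are names invented here, and each step signals completion by calling done rather than by returning a value as real promises allow.

```javascript
// Hand-built chain object (a learning sketch, not how Q is implemented).
function makeChain() {
  var queue = [];
  var chain = {
    // next() runs right away, but it only registers fn for later.
    next: function (fn) {
      queue.push(fn);
      return chain;          // returning the chain allows .next().next()
    },
    // Called when a step finishes; starts the next registered function.
    finish: function (value) {
      var fn = queue.shift();
      if (fn) { fn(value, chain.finish); }
    }
  };
  return chain;
}

var log = [];

// Starts the (simulated) asynchronous work and returns the chain at once.
function someDataLookup() {
  var chain = makeChain();
  setTimeout(function () { chain.finish('raw data'); }, 10);
  return chain;
}
function anotherfunction(value, done) {
  log.push('anotherfunction got ' + value);
  done(value.toUpperCase());
}
function yetanotherfunction(value, done) {
  log.push('yetanotherfunction got ' + value);
  console.log(log.join('\n'));
  done();
}

someDataLookup()
    .next(anotherfunction)
    .next(yetanotherfunction);
```

The key observation is that both next() calls happen immediately, before any data exists. They only record which functions to run, and the functions themselves are called later, when the lookup finishes.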

Note that if you want to learn Q, take a look at a GitHub Gist that I created a couple years ago that explains some basic features. Also look at the forks; several people have forked it and created some even better examples.

HTTP Requests and API Calling: request

Since this is ProgrammableWeb, you're likely to be making API calls. Many APIs include SDKs that provide functions to call the API. (And, when they do, when coded correctly for Node.js, they're likely to be asynchronous.) But in a lot of cases, it's easy to just make the HTTP call yourself. That's where the request module comes in.

The request library is incredibly easy to use. Suppose you want to grab an HTML page. You just call request, passing a string with the URL, and a callback function that receives the page data:

var request = require('request');

request('http://www.example.com', function(err, resp, body) {

    console.log(body);

});

This is perhaps the simplest example of request. The body parameter is the actual text of the Web page. But you can do much more: You have full control over the protocol (HTTP, HTTPS), the method (GET, POST, PUT, etc.), the data you send and how it's sent, the headers, and so on. To see everything that's available, scroll down on the documentation page to the detail on the request method itself.

Request also supports OAuth, but only version 1. If you want to make client calls with OAuth 2.0, you'll need another library. There are several available, including simple-oauth and node-oauth.

Encryption: bcrypt

When you create Web server software or access other servers (such as through APIs), you sometimes need to do some encryption. There are many times and places you would need encryption, and there are many types. One type in particular deals with hashes. Essentially a hash is a long number that's calculated from an existing string of text (or any type of data). The hash is supposed to work one way: You can figure out the hash from the string, but in theory you can't go the other direction.

The bcrypt algorithm was created with the purpose of calculating hashes. One thing that makes it unique is that it is adaptive: as computers become more powerful, the algorithm can be told to take longer (by raising its work factor), which slows down each login attempt and wards off brute-force attacks. This may or may not fit into your own needs, but if it does, the bcrypt module is a great option.

However, I want to provide an important caution: If you're not familiar with encryption and security, please don't try to build your own login system. Instead, use one that is built by security experts and well-tested (such as Passport, which is an entire system for adding login and session capabilities to your app). While programmers are usually encouraged to learn as much as they can and write code based on learning from others, security is the exception: Security code should only be created by the experts. If you're learning security, great; but don't deploy your code to production until you are an actual expert in security. Otherwise you're just asking for trouble. But on the flip side, security code should be inspected by everyone. The creators of encryption and security code typically want as many eyes on it as possible to find any holes.

The usual Node.js module for bcrypt is bcrypt.js. Because using any kind of security and cryptography requires making sure you fully understand the details, I'm not going to say much about the how-to here; instead, check out the home page, and carefully read the documentation and explore the examples.

Email With Nodemailer

If your application needs to send out email, you have some options. One of these options isn't very good, although a lot of people use it.

Here's the deal. Sending email isn't anything mysterious. But, for some reason, a lot of email server configurations involve pointing the server to another email server already configured for SMTP access and letting that server do the hard work. But that server needs to know how to send out email. What's the secret that that server knows? The secret is that every domain set up to receive email has a set of DNS records in what's called its zone file that provide the name of the server that receives email. That's it. So, in order to send email without using an intermediate server, you simply access the DNS servers, get the information for the email server, connect to that server, and deliver the message. But instead of doing that yourself, you can let a module do it for you. That's what Nodemailer does, and it allows your software to send email directly out without the need for an intermediate SMTP server. Rather, the code knows the SMTP protocol and does it itself. If you do want it to use an external server to send email, you can go that route, as well. For example, if you have a Gmail account and want to send your email from that Gmail account, you can do so.
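The DNS step described above can be sketched with Node.js's built-in dns module. This is only an illustration of the lookup, not how Nodemailer does it internally; preferredExchange and findMailServer are names made up for this sketch, and the sample records are invented.

```javascript
var dns = require('dns');

// MX records carry a priority; the lowest number is tried first.
function preferredExchange(records) {
  var sorted = records.slice().sort(function (a, b) {
    return a.priority - b.priority;
  });
  return sorted.length ? sorted[0].exchange : null;
}

// Looks up the receiving mail server for a domain (needs network access).
function findMailServer(domain, callback) {
  dns.resolveMx(domain, function (err, records) {
    if (err) { return callback(err); }
    callback(null, preferredExchange(records));
  });
}

// Example with made-up records shaped like dns.resolveMx results.
console.log(preferredExchange([
  { priority: 20, exchange: 'backup.example.com' },
  { priority: 10, exchange: 'mail.example.com' }
]));
```

Calling findMailServer with a real domain and a callback would perform the actual lookup; delivering the message to that server over SMTP is the part a module handles for you.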

Using Nodemailer requires first creating what's called a transporter object that configures your email-sending, and then calling a sendMail function. You can include the usual fields such as from, to, cc, bcc, and so on. You can also specify both a text version of the email and an HTML version of the email, and you can even include file attachments.

As usual, the approach is asynchronous, meaning you provide a callback function that gets called after the email is sent. If an error occurs, you'll be notified in the callback, as well. The callback function receives two parameters: an error and a result. The documentation provides the full details.

Processing HTML with Cheerio.js

The final module addresses a problem I've often encountered: processing HTML. If your Node.js application needs to retrieve HTML from a remote site, pulling the information you need out of it can be a headache. You can try to scan the HTML as a string or use regular expressions, but an easier approach is a module called cheerio. With cheerio, you read in some HTML and process it using the same syntax as jQuery. So, for example, if the HTML has a DIV element with an ID and you need the content inside the element, you can simply use the good old jQuery selector syntax of $('#id').html(). One reason I've found cheerio to be so helpful is that the HTML I've had to process isn't always well-formed; that is, tags don't always have closing tags. And some characters such as less-than and greater-than might appear in the HTML (instead of using entities). Trying to parse that manually is difficult. The cheerio module takes care of all that.

If you have a string containing HTML, all you need to do is call cheerio's load function, which returns an object that contains all the jQuery methods. Most people save that object into a variable called $, giving you the usual $ syntax of jQuery.

With cheerio, you can also modify the document. You can add and remove elements, attributes, and classes, just as you can with jQuery itself. Cheerio effectively manages a document object model itself, just like the browser does. Then, grab the HTML as text with a simple html() call:

$.html()

Conclusion

Whenever I build a project, I usually add most of these modules to my project. Together they make up a framework upon which I build my Node.js applications. Even if I do a quick little test app, I still will likely have some of these because they help me code more quickly.

Have you had success with these modules? What are some of your favorites? Please let us know in the comments section.
