
Implementing download throttling

Node provides pause and resume methods for incoming streams but not for outbound streams. Essentially, this means we can easily throttle upload speeds in Node, but download throttling requires a more creative solution.

Getting ready

We'll need a new server.js file along with a big enough file to serve. With the dd command-line program, we can generate a file for testing purposes, as follows:

dd if=/dev/zero of=50meg count=50 bs=1048576

The preceding command will create a 50 MB file named 50meg, which we'll be serving.

Note

For a similar Windows tool that can be used to generate a large file, check out http://www.bertel.de/software/rdfc/index-en.html.

How to do it...

To keep things as simple as possible, our download server will serve just one file, but we'll implement it in a way that allows us to easily plug in some router code to serve multiple files. First, we will require our modules and set up an options object for file and speed settings, as follows:

var http = require('http');
var fs = require('fs');

var options = {};
options.file = '50meg';
options.fileSize = fs.statSync(options.file).size;
options.kbps = 32;

Note

If we were serving multiple files, our options object would be largely redundant; however, we're using it here to emulate the concept of a user-determined file choice. In a multifile situation, we would instead be loading file specifics based upon the requested URL.

To see how this recipe can be configured to serve and throttle more than one file, check out the routing recipes in Chapter 1, Making a Web Server.

The http module provides the server and the fs module is used to create a read stream and grab the size of our file.

We're going to restrict how much data is sent out at once, but first we need to get the data in. So let's create our server and initialize a read stream, as follows:

http.createServer(function(request, response) {
  var download = Object.create(options);
  download.chunks = Buffer.alloc(download.fileSize);
  download.bufferOffset = 0;

  response.writeHead(200, {'Content-Length': options.fileSize});

  fs.createReadStream(options.file)
  .on('data', function(chunk) {
    chunk.copy(download.chunks, download.bufferOffset);
    download.bufferOffset += chunk.length;
  })
  .once('open', function() {
    //this is where the throttling will happen
  });
}).listen(8080);

We've created our server and specified a new object called download, which inherits from our options object. We add two properties to our request-bound download object: a chunks property that collects the file chunks inside the read stream's data event listener, and a bufferOffset property that keeps track of the number of bytes loaded from disk.
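To illustrate why Object.create is used here, the following standalone sketch (with an invented options object, not the recipe's full one) shows that each per-request download object reads shared settings through its prototype, while request-local state never leaks back into options:

```javascript
// Shared settings, as set up once at server start
var options = { kbps: 32, fileSize: 52428800 };

// Each request gets its own download object inheriting from options
var download = Object.create(options);
download.bufferOffset = 0; // request-local state

console.log(download.kbps);        // read through the prototype chain
console.log(options.bufferOffset); // undefined - stays on download only
```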

All we have to do now is the actual throttling. To achieve this, we simply apportion out the specified number of kilobytes from our buffer every second, thus achieving the specified kilobytes per second. We'll make a function for this, which will be placed outside of http.createServer and we'll call our function throttle, as follows:

function throttle(download, cb) {
  var chunkOutSize = download.kbps * 1024, timer = 0;

  (function loop(bytesSent) {
    if (!download.aborted) {
      setTimeout(function () {
        var bytesOut = bytesSent + chunkOutSize;
        if (download.bufferOffset > bytesOut) {
          timer = 1000;
          cb(download.chunks.slice(bytesSent, bytesOut));
          loop(bytesOut);
          return;
        }

        if (bytesOut >= download.chunks.length) {
          cb(download.chunks.slice(bytesSent, download.chunks.length));
          return;
        }

        loop(bytesSent); //continue to loop, wait for enough data
      }, timer);
    }
  }(0));

  return function () { //return function to handle abort scenario
    download.aborted = true;
  };

}

The throttle function interacts with the download object created on each server request to measure out each chunk according to our predetermined options.kbps speed. The second parameter, cb, is a callback function that in turn takes one parameter: the chunk of data the throttle function has determined to send. Our throttle function returns a convenience function that can be used to end the loop on abort, avoiding infinite looping.
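To see the throttle function working in isolation, the following standalone sketch feeds it a pre-filled buffer instead of a live read stream. The 3,000-byte buffer and 1 kbps speed are arbitrary test values chosen so the whole run takes about two seconds; they are not part of the recipe:

```javascript
// Standalone copy of the throttle function from the recipe
function throttle(download, cb) {
  var chunkOutSize = download.kbps * 1024, timer = 0;

  (function loop(bytesSent) {
    if (!download.aborted) {
      setTimeout(function () {
        var bytesOut = bytesSent + chunkOutSize;
        if (download.bufferOffset > bytesOut) {
          timer = 1000;
          cb(download.chunks.slice(bytesSent, bytesOut));
          loop(bytesOut);
          return;
        }
        if (bytesOut >= download.chunks.length) {
          // final chunk: everything left between bytesSent and the end
          cb(download.chunks.slice(bytesSent, download.chunks.length));
          return;
        }
        loop(bytesSent); // wait for more data to load
      }, timer);
    }
  }(0));

  return function () { // abort handle
    download.aborted = true;
  };
}

// Simulate a fully loaded 3000-byte "file" throttled at 1 KB per second
var download = {
  kbps: 1,
  chunks: Buffer.alloc(3000),
  bufferOffset: 3000
};

var totalSent = 0;
throttle(download, function (chunk) {
  totalSent += chunk.length;
  console.log('sent', chunk.length, 'bytes, total', totalSent);
});
```

Running this emits two full 1,024-byte chunks a second apart, followed by the 952-byte remainder, then stops.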

We initialize download throttling by calling our throttle function in the server callback when the read stream opens, as follows:

//...previous code
  fs.createReadStream(options.file)
  .on('data', function (chunk) {
    chunk.copy(download.chunks, download.bufferOffset);
    download.bufferOffset += chunk.length;
  })
  .once('open', function () {
    var handleAbort = throttle(download, function (send) {
      response.write(send);
    });

    request.on('close', function () {
      handleAbort();
    });
  });

}).listen(8080);

How it works...

The key to this recipe is our throttle function; let's walk through it. To achieve the specified speed, we send a chunk of data of a certain size every second. The size is determined by the desired number of kilobytes per second. So if download.kbps is 32, we'll send 32 KB chunks every second.

Buffers work in bytes, so we set a new variable called chunkOutSize and multiply download.kbps by 1,024 to get the appropriate chunk size in bytes. Next, we set a timer variable, which is passed into setTimeout. It starts at 0 for two reasons. First, it eliminates an unnecessary initial 1,000 millisecond delay, allowing our server to send the first chunk of data immediately, if available. Second, if the download.chunks buffer is not yet full enough to satisfy chunkOutSize, the embedded loop function recurses without changing timer, retrying with no delay until the buffer has loaded enough data to deliver a whole chunk (a process that should take well under a second).

Once we have enough data for the first chunk, timer is set to 1000 because from here on out, we want to push a chunk every second.

The loop function is the core of our throttling engine; it's a recursive function that calls itself with one parameter, bytesSent, which tracks how much data has been sent so far and determines which bytes to slice out of our download.chunks buffer using Buffer.slice. Buffer.slice takes two parameters, start and end, which we fill with bytesSent and bytesOut respectively. The bytesOut value is also compared against download.bufferOffset to ensure enough data has been loaded for a whole chunk to be sent out.

If there is enough data, we proceed to set the timer variable to 1000 to initiate our chunk per second policy, and then pass the result of download.chunks.slice into cb, which becomes our send parameter.

Back inside our server, the send parameter is passed to response.write within our throttle callback, so each chunk is streamed to the client. Once we've passed our sliced chunk to cb, we call loop(bytesOut) to begin a new iteration (so bytesOut becomes the next iteration's bytesSent), and then return from the function to prevent any further execution.

The third and final place where bytesOut appears is in the second conditional statement of the setTimeout callback, where we compare it against download.chunks.length. This is important for handling the last chunk of data. We don't want to loop again after the final chunk has been sent, and if the chunk size (options.kbps multiplied by 1,024) doesn't divide evenly into the total file size, the final bytesOut value would be larger than the size of the buffer, which, if passed into the slice method unchecked, would cause an out-of-bounds (oob) error.

So, if bytesOut equals or is greater than the memory allocated to the download.chunks buffer (that is, the size of our file), we slice the remaining bytes from our download.chunks buffer and return from the function without calling loop, effectively terminating recursion.
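The arithmetic of that final slice can be checked in isolation. In this sketch, the 3,000-byte buffer and 1,024-byte chunk size are arbitrary illustration values:

```javascript
// Demonstrate the final-chunk slice: take whatever is left between
// bytesSent and the end of the buffer, so the slice can never
// run past the buffer's bounds.
var chunks = Buffer.alloc(3000);           // stands in for a loaded file
var chunkOutSize = 1024;
var bytesSent = 2048;                      // two full chunks already sent
var bytesOut = bytesSent + chunkOutSize;   // 3072 - past the end of the buffer

var finalChunk;
if (bytesOut >= chunks.length) {
  finalChunk = chunks.slice(bytesSent, chunks.length);
  console.log(finalChunk.length); // the 952 remaining bytes
}
```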

To prevent infinite looping when the connection is closed unexpectedly (for instance, on connection failure or client abort), throttle returns another function, which is captured in the handleAbort variable and called in the close event of request. The function simply adds a property to the download object to say that the download has been aborted. This is checked on each recursion of the loop function. As long as download.aborted isn't true, the loop continues to iterate; otherwise, it stops short.

Note

There are (configurable) limits on operating systems that define how many files can be opened at once. We would probably want to implement caching in a production download server to optimize file system access. For file limits on Unix systems, see http://stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux.

There's more...

What about resuming downloads?

Enabling a resume request from broken downloads

If a connection breaks or a user accidentally aborts a download, the client may initiate a resume request by sending a Range HTTP header to the server. A Range header would look something like the following:

Range: bytes=512-1024

When a server agrees to handle a Range header, it sends a 206 Partial Content status and adds a Content-Range header to the response. Where the entire file is 1 KB (1,024 bytes), a Content-Range reply to the preceding Range header might look like the following:

Content-Range: bytes 512-1024/1024

Notice that there is no equal sign (=) after bytes in a Content-Range header. We can pass an object into the second parameter of fs.createReadStream, which specifies where to start and end reading. Since we are simply handling a resume request, we only need to set the start property as follows:

//requires, options object, throttle function, etc...
download.readStreamOptions = {};
download.headers = {'Content-Length': download.fileSize};
download.statusCode = 200;
if (request.headers.range) {
  download.start = request.headers.range
    .replace('bytes=', '').split('-')[0];
  download.readStreamOptions = {start: +download.start};
  download.headers['Content-Range'] = "bytes " + download.start +
    "-" + (download.fileSize - 1) + "/" + download.fileSize;
  download.statusCode = 206; //partial content
}
response.writeHead(download.statusCode, download.headers);
fs.createReadStream(download.file, download.readStreamOptions)
//...rest of the code....

By adding some properties to download and using them to conditionally respond to a Range header, we can now handle resume requests.
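The Range parsing step can be checked on its own. The helper function and header values below are hypothetical illustrations, but the parsing expression matches the one used in the recipe (strip the `bytes=` prefix, then take everything before the first dash as the start byte):

```javascript
// Parse the start byte out of a Range request header value
function parseRangeStart(rangeHeader) {
  return +rangeHeader.replace('bytes=', '').split('-')[0];
}

console.log(parseRangeStart('bytes=512-1024')); // explicit end byte
console.log(parseRangeStart('bytes=512-'));     // open-ended resume request
```

Both forms yield 512, so an open-ended resume request (the common case after a broken download) is handled the same way as a bounded one.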

See also

  • The Setting up a router recipe discussed in Chapter 1, Making a Web Server
  • The Caching content in memory for immediate delivery recipe discussed in Chapter 1, Making a Web Server
  • The Communicating via TCP recipe discussed in Chapter 9, Integrating Networking Paradigms