Node Cookbook (Second Edition)
David Mark Clements
Implementing download throttling
Node provides pause and resume methods for incoming streams, but not for outbound streams. Essentially, this means we can easily throttle upload speeds in Node, but download throttling requires a more creative solution.
Getting ready
We'll need a new server.js file along with a big enough file to serve. With the dd command-line program, we can generate a file for testing purposes, as follows:

dd if=/dev/zero of=50meg count=50 bs=1048576

The preceding command will create a 50 MB file named 50meg, which we'll be serving.
Note
For a similar Windows tool that can be used to generate a large file, check out http://www.bertel.de/software/rdfc/index-en.html.
How to do it...
To keep things as simple as possible, our download server will serve just one file, but we'll implement it in a way that allows us to easily plug in some router code to serve multiple files. First, we will require our modules and set up an options object for file and speed settings, as follows:
var http = require('http');
var fs = require('fs');

var options = {};
options.file = '50meg';
options.fileSize = fs.statSync(options.file).size;
options.kbps = 32;
Note
If we were serving multiple files, our options object would be largely redundant; however, we're using it here to emulate the concept of a user-determined file choice. In a multifile situation, we would instead load file specifics based upon the requested URL.
To see how this recipe can be configured to serve and throttle more than one file, check out the routing recipes in Chapter 1, Making a Web Server.
The http module is for the server, and the fs module is used to create a read stream and grab the size of our file.
We're going to restrict how much data is sent out at once, but first we need to get the data in. So let's create our server and initialize a read stream as follows:
http.createServer(function (request, response) {
  var download = Object.create(options);
  download.chunks = new Buffer(download.fileSize); // Buffer.alloc in modern Node
  download.bufferOffset = 0;

  response.writeHead(200, {'Content-Length': options.fileSize});

  fs.createReadStream(options.file)
    .on('data', function (chunk) {
      chunk.copy(download.chunks, download.bufferOffset);
      download.bufferOffset += chunk.length;
    })
    .once('open', function () {
      //this is where the throttling will happen
    });
}).listen(8080);
We've created our server and specified a new object called download, which inherits from our options object. We add two properties to our request-bound download object: a chunks property that collects the file chunks inside the read stream's data event listener, and a bufferOffset property that keeps track of the number of bytes loaded from disk.
All we have to do now is the actual throttling. To achieve this, we simply apportion out the specified number of kilobytes from our buffer every second, thus achieving the specified kilobytes per second. We'll make a function for this, which will be placed outside of http.createServer, and we'll call our function throttle, as follows:
function throttle(download, cb) {
  var chunkOutSize = download.kbps * 1024,
      timer = 0;

  (function loop(bytesSent) {
    if (!download.aborted) {
      setTimeout(function () {
        var bytesOut = bytesSent + chunkOutSize;
        if (download.bufferOffset > bytesOut) {
          timer = 1000;
          cb(download.chunks.slice(bytesSent, bytesOut));
          loop(bytesOut);
          return;
        }
        if (bytesOut >= download.chunks.length) {
          //final chunk: slice from bytesSent to the end of the buffer
          cb(download.chunks.slice(bytesSent));
          return;
        }
        loop(bytesSent); //continue to loop, wait for enough data
      }, timer);
    }
  }(0));

  return function () { //return function to handle abort scenario
    download.aborted = true;
  };
}
The throttle function interacts with the download object created on each server request to measure out each chunk according to our predetermined options.kbps speed. For its second parameter (cb), the throttle function accepts a callback. The cb callback in turn takes one parameter: the chunk of data the throttle function has determined to send. Our throttle function returns a convenience function that can be used to end the loop on abort, avoiding infinite looping.
We initialize download throttling by calling our throttle function in the server callback when the read stream opens, as follows:
//...previous code
fs.createReadStream(options.file)
  .on('data', function (chunk) {
    chunk.copy(download.chunks, download.bufferOffset);
    download.bufferOffset += chunk.length;
  })
  .once('open', function () {
    var handleAbort = throttle(download, function (send) {
      response.write(send);
    });
    request.on('close', function () {
      handleAbort();
    });
  });
}).listen(8080);
How it works...
The key to this recipe is our throttle function; let's walk through it. To achieve the specified speed, we send a chunk of data of a certain size every second. The size is determined by the desired number of kilobytes per second. So if download.kbps is 32, we'll send 32 KB chunks every second.
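The conversion can be sanity-checked with a little arithmetic:

```javascript
var kbps = 32;
var chunkOutSize = kbps * 1024; // bytes pushed out per second
console.log(chunkOutSize);      // 32768

// At one chunk per second, delivering the 50 MB test file takes
// roughly (50 * 1048576) / 32768 seconds.
console.log((50 * 1048576) / chunkOutSize); // 1600
```

That is about 27 minutes for the full download at 32 KB/s, which is easy to verify by hand when testing the server.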
Buffers work in bytes, so we set a new variable called chunkOutSize and multiply download.kbps by 1,024 to get the appropriate chunk size in bytes. Next, we set a timer variable, which is passed into setTimeout. It is initially set to 0 for two reasons. First, it eliminates an unnecessary initial 1,000 millisecond delay, allowing our server to send the first chunk of data immediately, if available. Second, if the download.chunks buffer is not full enough to satisfy chunkOutSize, the embedded loop function recurses without changing timer. This causes the loop to reschedule itself immediately until the buffer holds enough data to deliver a whole chunk (a process that should take well under a second).
Once we have enough data for the first chunk, timer is set to 1000 because, from here on out, we want to push a chunk every second.
The loop function is the guts of our throttling engine; it's a recursive function that calls itself with one parameter, bytesSent. The bytesSent parameter allows us to keep track of how much data has been sent so far, and we use it to determine which bytes to slice out of our download.chunks buffer using Buffer.slice. The slice method takes two parameters, start and end, which are fulfilled by bytesSent and bytesOut respectively. The bytesOut variable is also checked against download.bufferOffset to ensure we have enough data loaded for a whole chunk to be sent out.
If there is enough data, we set the timer variable to 1000 to initiate our chunk-per-second policy, and then pass the result of download.chunks.slice into cb, where it becomes our send parameter.
Back inside our server, our send parameter is passed to response.write within our throttle callback, so each chunk is streamed to the client. Once we've passed our sliced chunk to cb, we call loop(bytesOut) for a new iteration (thus bytesOut becomes the next bytesSent), and then return from the function to prevent any further execution.
The third and final place where bytesOut appears is in the second conditional statement of the setTimeout callback, where we check it against download.chunks.length. This is important for handling the last chunk of data. We don't want to loop again after the final chunk has been sent, and if options.kbps doesn't divide evenly into the total file size, the final bytesOut value would be larger than the size of the buffer, which, if passed unchecked into the slice method, could cause an out-of-bounds error.
So, if bytesOut equals or exceeds the memory allocated to the download.chunks buffer (that is, the size of our file), we slice the remaining bytes from our download.chunks buffer and return from the function without calling loop, effectively terminating the recursion.
To prevent infinite looping when the connection is closed unexpectedly (for instance, on connection failure or client abort), throttle returns another function, which is caught in the handleAbort variable and called in the close event of the request. This function simply adds a property to the download object to say that the download has been aborted. This is checked on each recursion of the loop function: as long as download.aborted isn't true, it continues to iterate; otherwise the looping stops short.
Note
There are (configurable) limits on operating systems that define how many files can be opened at once. We would probably want to implement caching in a production download server to optimize file system access. For file limits on Unix systems, see http://stackoverflow.com/questions/34588/how-do-i-change-the-number-of-open-files-limit-in-linux.
There's more...
What about resuming downloads?
If a connection breaks, or a user accidentally aborts a download, the client may initiate a resume request by sending a Range HTTP header to the server. A Range header looks something like the following:

Range: bytes=512-1023
When a server agrees to handle a Range header, it sends a 206 Partial Content status and adds a Content-Range header to the response. For a file that is 1,024 bytes in total, a Content-Range reply to the preceding Range header would look like the following:

Content-Range: bytes 512-1023/1024
Notice that there is no equal sign (=) after bytes in a Content-Range header. We can pass an object as the second parameter of fs.createReadStream, which specifies where to start and end reading. Since we are simply handling a resume request, we only need to set the start property, as follows:
//requires, options object, throttle function, etc...
download.readStreamOptions = {};
download.headers = {'Content-Length': download.fileSize};
download.statusCode = 200;

if (request.headers.range) {
  download.start = +request.headers.range
    .replace('bytes=', '').split('-')[0];
  download.readStreamOptions = {start: download.start};
  //the remaining byte count and the (inclusive) end offset
  download.headers['Content-Length'] = download.fileSize - download.start;
  download.headers['Content-Range'] = 'bytes ' + download.start + '-' +
    (download.fileSize - 1) + '/' + download.fileSize;
  download.statusCode = 206; //partial content
}

response.writeHead(download.statusCode, download.headers);
fs.createReadStream(download.file, download.readStreamOptions)
//...rest of the code....
By adding some properties to download and using them to conditionally respond to a Range header, we can now handle resume requests.
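The Range parsing used above can be tried in isolation. Here, parseRangeStart is a hypothetical helper of my own that mirrors the recipe's replace/split logic:

```javascript
// Hypothetical helper mirroring the recipe's Range parsing:
// strip the 'bytes=' prefix, split on '-', and coerce the
// first element to a number to get the resume offset.
function parseRangeStart(rangeHeader) {
  return +rangeHeader.replace('bytes=', '').split('-')[0];
}

console.log(parseRangeStart('bytes=512-'));     // 512
console.log(parseRangeStart('bytes=512-1023')); // 512
```

A resume request can then be simulated against the running server with curl's range flag, for example curl -r 512- http://localhost:8080, and the 206 status and Content-Range header inspected with the -i option.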