JavaScript Enumerable.Map() with WebWorkers

For those with short attention spans, here’s how you call the function:

map(enumerable, mapFunction, callback, numWorkers);

I wanted an easy way to divide up a parallelizable task with Web Workers, so I created a Worker-enabled map function for arrays and objects. It works just like the map function in your favorite functional language, except that it executes asynchronously and hands its result to a callback. And of course, it can run several times faster on multicore machines.

The function creates a pool of workers (32 by default) and divides the work up among them. It reassembles the results into a new object of the same type as the original. Array order is preserved.
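To see why array order survives even when workers finish out of order, here is a minimal single-threaded sketch of the reassembly step. The helper name reassemble and the hand-written replies are assumptions made for illustration; they are not part of the function itself.

```javascript
// Hypothetical helper: rebuilds a result container from worker replies
// that may arrive in any order.
function reassemble(data, replies) {
  // Same constructor as the input, so arrays yield arrays and objects yield objects.
  var result = new data.constructor();
  replies.forEach(function (reply) {
    // Each reply carries the key it was computed for, so arrival order is irrelevant.
    result[reply.key] = reply.value;
  });
  return result;
}

// Replies for [1, 2, 3] doubled, deliberately listed out of order.
var doubled = reassemble([1, 2, 3], [
  { key: "2", value: 6 },
  { key: "0", value: 2 },
  { key: "1", value: 4 }
]);
console.log(doubled);
```

Because every reply is stored under its own key, the slowest worker cannot shuffle the output.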

Here’s the function:

function map(data, mapper, callback, numWorkers) {

  // Support arrays & objects: iterate over own enumerable keys only
  var keys = Object.keys(data);
  var length = keys.length;
  var result = new data.constructor();

  // Nothing to map: report the empty result instead of waiting forever
  if (length === 0) { callback(result); return; }

  numWorkers = Math.min(numWorkers || 32, length);
  var workers = [];
  var messagesReceived = 0;

  // The mapper is sent as source text, so it cannot rely on closed-over variables
  var mapperSource = "(" + String(mapper) + ")(value)";

  function onMessage(e) {
    result[e.data.key] = e.data.value;
    // Check if we have finished the job.  This should probably be more robust.
    if (++messagesReceived === length) {
      // Shut the pool down before handing back the result
      workers.forEach(function (w) { w.terminate(); });
      callback(result);
    }
  }

  // Create the workers
  for (var i = 0; i < numWorkers; i++) {
    workers[i] = new Worker("mapper.js");
    workers[i].addEventListener('message', onMessage, false);
  }

  // Just send out all the tasks.  The messages get queued by the browser.
  // It would probably be better to queue up two or three tasks per worker (to minimize downtime)
  // and add tasks to the queues as results come back.
  keys.forEach(function (key, n) {
    workers[n % numWorkers].postMessage({key: key, value: data[key], mapper: mapperSource});
  });

}
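The comment in the function suggests keeping only two or three tasks queued per worker and topping up as results come back. Here is a rough sketch of that idea, not a tested replacement. The name mapWithBacklog, the backlog parameter, and the createWorker factory are all assumptions introduced here; the factory exists only so the scheduling logic can run against any object that has postMessage and onmessage.

```javascript
// Sketch only: createWorker is a hypothetical factory returning an object
// with postMessage() and an assignable onmessage handler.
function mapWithBacklog(data, mapper, callback, createWorker, numWorkers, backlog) {
  var keys = Object.keys(data);
  var result = new data.constructor();
  if (keys.length === 0) { callback(result); return; }

  numWorkers = Math.min(numWorkers || 32, keys.length);
  backlog = backlog || 2;              // tasks kept queued per worker (assumed default)
  var source = "(" + String(mapper) + ")(value)";
  var next = 0;                        // index of the next key to hand out
  var received = 0;

  // Hand the next pending task, if any, to the given worker.
  function send(worker) {
    if (next >= keys.length) { return; }
    var key = keys[next++];
    worker.postMessage({ key: key, value: data[key], mapper: source });
  }

  for (var i = 0; i < numWorkers; i++) {
    (function (worker) {
      worker.onmessage = function (e) {
        result[e.data.key] = e.data.value;
        if (++received === keys.length) { callback(result); return; }
        send(worker);                  // refill this worker's queue
      };
      // Prime the worker with a small backlog.
      for (var j = 0; j < backlog; j++) { send(worker); }
    })(createWorker());
  }
}
```

In a browser, createWorker could be as simple as function () { return new Worker("mapper.js"); }. The point of the design is that a slow task only delays the worker that owns it, because the remaining tasks go to whichever worker reports back first.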

And here is the worker code, which lives in mapper.js. The worker is pretty bare bones, as you might have assumed.

// Minion
onmessage = function(e) {
    // `value` must stay in scope: the evaluated mapper expression refers to it by name
    var value = e.data.value;
    postMessage({key: e.data.key, value: eval(e.data.mapper)});
};
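Since the mapper travels to the worker as a string and gets eval'd there, it helps to try the round trip in a single thread first. The sketch below does that; serializeMapper, runSerialized, and makeScaler are hypothetical names used only for this illustration. It also shows the main catch: a mapper that depends on a closed-over variable loses that variable in transit.

```javascript
// Builds the same expression string that map() posts to the worker.
function serializeMapper(mapper) {
  return "(" + String(mapper) + ")(value)";
}

// Hypothetical stand-in for the body of the worker's onmessage handler.
function runSerialized(source, value) {
  return eval(source); // `value` is visible to the evaluated expression
}

var square = function (x) { return x * x; };
console.log(runSerialized(serializeMapper(square), 7)); // 49

// A mapper that closes over `factor`: the source text still mentions it,
// but the value 3 does not travel with the string.
function makeScaler(factor) {
  return function (x) { return x * factor; };
}

try {
  runSerialized(serializeMapper(makeScaler(3)), 7);
} catch (err) {
  console.log(err.name); // ReferenceError
}
```

So mappers need to be self-contained: anything they use beyond their argument has to be defined inside the function body or be available in the worker's global scope.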

Next, I’m thinking I’ll do a Web Worker implementation of MapReduce. I would use the syntax from CouchDB in the interest of standardization (emit, in other words), but I am far from an expert on these things and would love to hear any feedback.
