
The proposal of using Multi Threads #4355

@SukkaW

Since #550, the original creator of Hexo, @tommy351, has wanted to speed up Hexo with multi-core rendering. However, #550 was never continued because of the difficulty of managing multiple Hexo instances.

Recently I introduced Node.js worker_threads in a project (OI-wiki/OI-wiki#2288) and learned something about them. Now that Node.js supports worker_threads, it is possible to bring up multi-core rendering for Hexo again.

Limit

Worker threads are designed to run CPU-intensive tasks that follow a simple pattern:

Independent Input => Workers Calculating => Independent Output

Thus, many complicated functions cannot be run inside workers.
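One concrete reason for this limit, as far as I understand it: data crosses the thread boundary through postMessage, which copies values with the structured clone algorithm, so plain data passes but functions do not. A small sketch of that behaviour (canBePosted is a hypothetical helper written for this illustration, not part of Hexo or Node.js):

```javascript
const { MessageChannel } = require('worker_threads');

// Hypothetical helper: reports whether a value survives postMessage.
// postMessage throws synchronously (DataCloneError) when a value
// cannot be structured-cloned, e.g. when it contains a function.
function canBePosted(value) {
  const { port1 } = new MessageChannel();
  try {
    port1.postMessage(value);
    return true;
  } catch (err) {
    return false;
  } finally {
    // Closing one side closes the whole channel.
    port1.close();
  }
}

// Plain data (strings, arrays, plain objects) can be sent to a worker.
console.log(canBePosted({ title: 'Hello', tags: ['a', 'b'] })); // true

// An object carrying a function cannot.
console.log(canBePosted({ render: () => '<p>hi</p>' })); // false
```

So anything that relies on callbacks or on objects with methods would have to be reduced to plain data before being handed to a worker.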

Design

As creating and destroying workers is still expensive (worker_threads have to communicate with the main thread), we should create only a limited number of worker_threads (in OI-wiki/OI-wiki#2288 I use the number of CPU threads). Thus, a WorkerPool util should be made.

The WorkerPool is designed to queue tasks, manage them, and make sure the next task runs in an idle worker, so it should have the following methods:

  • init(): initialize a worker pool with the queue (the queue could be an array). This will be called in the constructor.
  • run(input): add a task to the queue, with input passed to the workers. A Promise will be returned (the result can be retrieved with const output = await workerPool.run(input)).
  • destroy(): after all tasks are finished, destroy all the worker_threads created.
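One possible shape for such a pool, as a rough sketch and not a finished util. The third constructor argument (workerOptions), the _next helper, and the inline upper-casing worker are my own assumptions for illustration; they are not part of the proposal or of hexo-util.

```javascript
const { Worker } = require('worker_threads');

class WorkerPool {
  // workerOptions is a hypothetical extra argument, used here only so
  // the demo below can pass { eval: true } and stay in a single file.
  constructor(workerPath, size, workerOptions = {}) {
    this.workerPath = workerPath;
    this.size = size;
    this.workerOptions = workerOptions;
    this.init();
  }

  // init(): create the task queue and spawn the workers once.
  init() {
    this.queue = [];   // pending tasks: { input, resolve, reject }
    this.workers = []; // every worker created by this pool
    this.idle = [];    // workers currently waiting for a task
    for (let i = 0; i < this.size; i++) {
      const worker = new Worker(this.workerPath, this.workerOptions);
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // run(input): queue a task; the Promise resolves with the worker's reply.
  run(input) {
    return new Promise((resolve, reject) => {
      this.queue.push({ input, resolve, reject });
      this._next();
    });
  }

  // Hand the oldest queued task to an idle worker, if both exist.
  _next() {
    if (!this.queue.length || !this.idle.length) return;
    const worker = this.idle.pop();
    const { input, resolve, reject } = this.queue.shift();

    const onMessage = output => {
      worker.off('error', onError);
      this.idle.push(worker); // the worker is free again
      resolve(output);
      this._next();
    };
    const onError = err => {
      // A worker that threw is gone; this sketch does not respawn it.
      worker.off('message', onMessage);
      reject(err);
    };
    worker.once('message', onMessage);
    worker.once('error', onError);
    worker.postMessage(input);
  }

  // destroy(): terminate every worker created by the pool.
  destroy() {
    return Promise.all(this.workers.map(worker => worker.terminate()));
  }
}

// Demo worker kept inline; upper-casing stands in for a real job.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', input => parentPort.postMessage(input.toUpperCase()));
`;

const pool = new WorkerPool(workerSource, 2, { eval: true });
Promise.all(['a', 'b', 'c'].map(input => pool.run(input))).then(results => {
  console.log(results); // upper-cased strings, in input order
  return pool.destroy();
});
```

Each worker handles one task at a time, so a reply can always be matched to the task that was last posted to that worker. A real implementation would probably also need to replace crashed workers and reject tasks that are still queued when destroy() is called.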

And here is an example of how to use WorkerPool:

// index.js
const { join } = require('path');
// WorkerPool is the util proposed in this issue.
const { WorkerPool } = require('hexo-util');

const workerPath = join(__dirname, 'some_worker.js');
const cpuNums = require('os').cpus().length;

const pool = new WorkerPool(workerPath, cpuNums);

const tasksList = [/* some stuff goes here ... */];
const result = {};

Promise.all(tasksList.map(async (task, taskId) => {
  const output = await pool.run(task);

  // do something with output, maybe writeFile or push to a resultArray.
  // taskId here is simply the task's index in tasksList.
  result[taskId] = output;
})).then(() => {
  pool.destroy();

  // do something with result object.
});

// some_worker.js
const { isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
  throw new Error('It is not a worker, it seems like a Main Thread');
}

async function job(input) {
  // some stuff...
  const output = input; // placeholder: replace with the real computation
  return output;
}

// Each message from the main thread is one task; reply with its output.
parentPort.on('message', async input => {
  const output = await job(input);
  parentPort.postMessage(output);
});

As you can see, the example I have given suits some filters (like meta_generator and backtick_code_filter), where we pass input to the filter and get output from it. But for more complicated jobs (like post rendering and template rendering), worker_threads still can't help.

cc @hexojs/core @tommy351
