
Overview

HyperQueue is a tool designed to simplify the execution of large workflows on HPC clusters. It lets you run a large number of tasks without manually submitting jobs to batch schedulers such as PBS or Slurm. You specify what you want to compute; HyperQueue automatically requests computational resources and dynamically load-balances tasks across all allocated nodes and cores.

Features

Resource management

  • Batch jobs are submitted and managed automatically
  • Computation is distributed amongst all allocated nodes and cores
  • Tasks can specify resource requirements (# of cores, GPUs, memory, ...)
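
To give an intuition for what dynamic load balancing means here, the following is a minimal toy sketch in Python. It is not HyperQueue's scheduler and does not use any HyperQueue API; the function name `balance` and the greedy "next task goes to the worker that becomes free first" policy are assumptions chosen purely for illustration, and resource requirements are ignored.

```python
import heapq


def balance(task_durations, n_workers):
    """Toy greedy load balancer (illustrative only, not HyperQueue's algorithm).

    Each task is handed to whichever worker becomes free first.
    Returns (assignment, makespan), where assignment maps a worker id
    to the list of task indices it ran.
    """
    # Min-heap of (time_when_worker_is_free, worker_id).
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}

    for task_id, duration in enumerate(task_durations):
        free_at, worker = heapq.heappop(heap)
        assignment[worker].append(task_id)
        heapq.heappush(heap, (free_at + duration, worker))

    # The makespan is the time at which the last worker finishes.
    makespan = max(free_at for free_at, _ in heap)
    return assignment, makespan


if __name__ == "__main__":
    plan, finish = balance([3, 1, 2, 2], n_workers=2)
    print(plan, finish)
```

The point of the sketch is only that tasks are not bound to a worker in advance: a worker that finishes a short task early picks up the next one, so uneven task durations do not leave cores idle.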

Performance

  • Scales to millions of tasks and hundreds of nodes
  • Overhead per task is around 0.1 ms
  • Task output can be streamed to a single file to avoid overloading distributed filesystems
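
As a back-of-the-envelope check of what the per-task overhead figure implies at scale, the sketch below multiplies it out. It assumes the overhead is purely additive and that the quoted value of about 0.1 ms holds for every task; the helper name `total_overhead_seconds` is hypothetical.

```python
def total_overhead_seconds(n_tasks, per_task_overhead_ms=0.1):
    """Estimate cumulative scheduling overhead in seconds.

    Assumes a constant, additive per-task overhead (0.1 ms by default,
    the approximate figure quoted in the feature list).
    """
    return n_tasks * per_task_overhead_ms / 1000.0


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 10_000_000):
        print(f"{n:>10} tasks: about {total_overhead_seconds(n):.1f} s of overhead")
```

Under these assumptions, one million tasks add up to roughly 100 seconds of total overhead, which is small compared to typical task runtimes on a cluster.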

Simple deployment

  • HQ is provided as a single, statically linked binary without any dependencies
  • No admin access to a cluster is needed

Last update: May 22, 2022
Created: May 1, 2021