GPU-accelerated image processing in ImageJ using CLIJ
CLIJ was successfully tested on a variety of Intel, Nvidia and AMD GPUs. See the full list of tested systems
No. Common Intel Core and AMD Ryzen processors contain built-in GPUs that are compatible with CLIJ. However, because dedicated graphics cards come with their own GDDR memory, using a dedicated GPU can yield additional speed-up.
CLIJ was successfully tested on Windows, macOS, Fedora Linux and Ubuntu Linux. Current GPU and OpenCL drivers must be installed.
In order to exploit GPU-accelerated image processing, one should
The simplest way to measure the speedup of a workflow is to take time measurements before and after execution, e.g. in ImageJ macro:
time = getTime(); // gives current time in milliseconds
// ...
// my workflow
// ...
print("Processing the workflow took " + (getTime() - time) + " msec");
However, in order to make these measurements reliable, some hints shall be given:
ImageJ macros benchmarking CPU/GPU performance can be found here and here
For more professional benchmarking, we recommend the OpenJDK Java Microbenchmark Harness (JMH). As the name suggests, this involves Java programming. You can find more details here.
To give an overview, some of CLIJ's operations have been benchmarked with JMH.
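Between a one-shot macro timing and a full JMH setup, a middle ground is to repeat the measurement and leave the first runs untimed. The following Java sketch is only an illustration under assumptions: `WorkflowTimer`, `medianMillis` and `median` are hypothetical names, not part of CLIJ, ImageJ or JMH, and the choice of warm-up and repetition counts is left to the caller.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of CLIJ, ImageJ or JMH.
final class WorkflowTimer {

    // Runs the workflow `warmups` times untimed, then `runs` times timed,
    // and returns the median duration of the timed runs in milliseconds.
    static double medianMillis(Runnable workflow, int warmups, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        for (int i = 0; i < warmups; i++) {
            workflow.run(); // untimed: absorbs first-call overhead
        }
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workflow.run();
            millis[i] = (System.nanoTime() - start) / 1e6;
        }
        return median(millis);
    }

    // Median of the given values; the input array is left unmodified.
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

For example, `WorkflowTimer.medianMillis(myWorkflow, 3, 10)` would run a hypothetical `myWorkflow` three times untimed and report the median of ten timed runs. The median is used here because it is less sensitive to single outliers than the mean.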
With some limitations, yes. You can find details and installation instructions here.
If you use CLIJ from ImageJ macro, you cannot execute it in parallel from several threads. If you use CLIJ from any other programming language, please use one CLIJ instance per thread. By using multiple threads in combination with multiple CLIJ instances, you can also execute operations on multiple graphics cards at a time.
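The one-instance-per-thread rule can be sketched in plain Java with a `ThreadLocal`. `GpuContext` below is a hypothetical stand-in and not the CLIJ class; how a real CLIJ instance is created and bound to a particular graphics card is deliberately left out.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PerThreadInstances {

    // Hypothetical stand-in for a per-thread processing instance; NOT the CLIJ API.
    static final class GpuContext {
        final String owner = Thread.currentThread().getName();
    }

    // Each thread lazily creates its own instance and keeps using it.
    static final ThreadLocal<GpuContext> CONTEXT =
            ThreadLocal.withInitial(GpuContext::new);

    // Runs `tasks` jobs on a pool of `threads` workers and returns how many
    // distinct instances were used (never more than the number of threads).
    static int countInstancesUsed(int threads, int tasks) {
        Set<GpuContext> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> seen.add(CONTEXT.get()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen.size();
    }
}
```

The point of the design is that no instance is ever shared between two threads, regardless of how many tasks are queued.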
Yes. When processing images of the same size and type, it is recommended to reuse memory instead of releasing memory and reallocating memory in every iteration. An example macro demonstrating this can be found here
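The reuse pattern can be sketched as follows. This is plain Java on CPU arrays, used only to show the allocate-once structure; `BufferReuse` and its methods are hypothetical and do not call CLIJ.

```java
final class BufferReuse {

    // Hypothetical per-frame operation: writes 2 * input into the given output buffer.
    static void doubleInto(float[] input, float[] output) {
        for (int i = 0; i < input.length; i++) {
            output[i] = 2f * input[i];
        }
    }

    // Processes all frames (assumed to have identical length) with ONE output
    // buffer and returns the sum of all output values.
    static double processAll(float[][] frames) {
        if (frames.length == 0) {
            return 0.0;
        }
        float[] output = new float[frames[0].length]; // allocated once, before the loop
        double total = 0.0;
        for (float[] frame : frames) {
            doubleInto(frame, output); // overwrites the previous result
            for (float v : output) {
                total += v;
            }
        }
        return total; // a GPU buffer would be released here, once, after the loop
    }
}
```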
No. While algorithms on the CPU can make use of double precision, common GPUs only support single precision for floating-point numbers. Furthermore, the following priorities were set while developing CLIJ's filters:
For example, the minimum filter of ImageJ takes different neighborhoods into account when applied in 2D and 3D. CLIJ's filters are consistent in 2D and 3D. Thus, results may differ between ImageJ and CLIJ, as shown in Figure 1.
Figure 1: Comparing CLIJ's mean filter (center) and ImageJ's mean filter (right) in 2D (top) and 3D (bottom). The result can be reproduced by running this example macro with radius = 1.
CLIJ in general uses the clamp-to-edge strategy, assuming that pixels outside the image have the same value as the closest border pixel of the image. For transforms such as rotation, translation, scaling, and affine transforms, zero-padding is applied instead, assuming that pixels outside the image have the value 0.
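The difference between the two border strategies can be illustrated with a small Java sketch. It is a CPU-side illustration of the two conventions only, not CLIJ's OpenCL implementation; `BorderHandling` and its method names are hypothetical, and the image is assumed to be stored as `image[y][x]`.

```java
final class BorderHandling {

    // Clamp to edge: an out-of-range coordinate is moved to the nearest valid
    // index, so the closest border pixel is returned.
    static float clampToEdge(float[][] image, int x, int y) {
        int height = image.length;
        int width = image[0].length;
        int cx = Math.max(0, Math.min(width - 1, x));
        int cy = Math.max(0, Math.min(height - 1, y));
        return image[cy][cx];
    }

    // Zero-padding: every coordinate outside the image reads as 0.
    static float zeroPadded(float[][] image, int x, int y) {
        int height = image.length;
        int width = image[0].length;
        boolean inside = x >= 0 && x < width && y >= 0 && y < height;
        return inside ? image[y][x] : 0f;
    }
}
```

With clamp to edge, a filter applied near the border sees repeated border values, whereas with zero-padding a transformed image is filled with 0 wherever it has no source pixel.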
No. All numeric spatial parameters in CLIJ such as radius and sigma are always entered in pixels. There is no operation in CLIJ which makes use of any physical units.
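Because parameters are given in pixels, a value known in physical units has to be converted using the image's pixel size before it is passed to CLIJ. A minimal sketch, assuming the pixel size is known from the image calibration; `PixelUnits.toPixels` is a hypothetical helper and not a CLIJ function:

```java
final class PixelUnits {

    // Converts a physical length (e.g. in microns) to pixels, given the
    // physical size of one pixel in the same unit along that axis.
    static double toPixels(double physicalLength, double pixelSize) {
        if (pixelSize <= 0) {
            throw new IllegalArgumentException("pixelSize must be positive");
        }
        return physicalLength / pixelSize;
    }

    public static void main(String[] args) {
        // Hypothetical anisotropic calibration: 0.25 microns per pixel in X/Y,
        // 1.0 micron per pixel in Z. A sigma of 1 micron then differs per axis.
        double sigmaXY = toPixels(1.0, 0.25);
        double sigmaZ = toPixels(1.0, 1.0);
        System.out.println("sigma in pixels: XY = " + sigmaXY + ", Z = " + sigmaZ);
    }
}
```

For anisotropic data, this means that separate pixel values per axis are needed to obtain a filter that is isotropic in physical space.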
Pixel coordinates in X, Y and Z are zero-based indices.
In general, no. CLIJ supports two- and three-dimensional images. If the third dimension represents channels or frames, these images can be processed using CLIJ's 3D filters. When processing 4D or 5D images, it is recommended to split them into 3D blocks.
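Splitting a 4D data set into 3D blocks can be sketched as a loop over the fourth dimension. The example below is plain Java and assumes a hypothetical in-memory layout `data[t][z][y][x]`; `StackSplitter` and the operation passed in are illustrative and do not correspond to CLIJ functions.

```java
import java.util.function.UnaryOperator;

final class StackSplitter {

    // Applies a 3D operation to every time point of a 4D data set laid out
    // as data[t][z][y][x] and collects the 3D results in the same order.
    static float[][][][] processPerTimepoint(float[][][][] data,
                                             UnaryOperator<float[][][]> operation3D) {
        float[][][][] result = new float[data.length][][][];
        for (int t = 0; t < data.length; t++) {
            result[t] = operation3D.apply(data[t]); // one 3D block at a time
        }
        return result;
    }
}
```

In a GPU workflow, each iteration would correspond to transferring one 3D block, processing it, and retrieving the result. Only one block needs to reside in GPU memory at a time.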
No. There are no in-place operations implemented in CLIJ. No built-in operation overwrites its input images. However, when implementing your own custom OpenCL code and wrapping it into CLIJ plugins, in-place operations may be supported depending on the hardware used, the driver version and the supported OpenCL version.
No. The currently active image window in ImageJ plays no role in CLIJ. Input and output images must be specified in macros by name explicitly.
If a specified output image does not exist in GPU memory, it will be generated automatically with a size defined by the executed operation with respect to input image and given parameters.
If a specified output image already exists in GPU memory, it will be overwritten. If the existing output image has the wrong size, its size will not be changed.
CLIJ operations called from ImageJ macro have no return values. They either process pixels and save results to images or they save their results to ImageJ's results table.
Binary output images are filled with pixel values 0 and 1. Any input image can serve as a binary image and will be interpreted by differentiating between 0 and non-zero values. In order to pull a binary image back to ImageJ in a compatible form, use pullBinary(). This delivers a binary 8-bit image with 0 and 255 as pixel values.
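The two binary conventions described above can be illustrated with a short Java sketch. It only mirrors the stated rules (0 versus non-zero on input, 0/1 on the GPU, 0/255 in ImageJ) and is not the implementation of pullBinary(); `BinaryImages` and its methods are hypothetical.

```java
final class BinaryImages {

    // Interprets any image as binary: 0 stays 0, every non-zero value becomes 1.
    static int[] toZeroAndOne(float[] pixels) {
        int[] binary = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            binary[i] = pixels[i] != 0f ? 1 : 0;
        }
        return binary;
    }

    // Maps the same interpretation to the 0/255 value range of an 8-bit
    // binary image as used by ImageJ.
    static int[] toZeroAnd255(float[] pixels) {
        int[] binary = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            binary[i] = pixels[i] != 0f ? 255 : 0;
        }
        return binary;
    }
}
```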
Yes. CLIJ brings OpenCL kernel caching and the possibility of image- and pixel-type-independent OpenCL. These benefits come with a small performance loss. Calling an OpenCL kernel via ClearCL directly may be about a millisecond faster than calling it via CLIJ. Example code demonstrating this is available here.
Images and buffers are defined in the OpenCL standard. We tried to make as many operations as possible compatible with both images and buffers. Differences are:
We recommend using buffers in general for maximum device compatibility.
Yes. As operations executed on the GPU do not make use of user interface elements anyway, CLIJ's operations in general run headless and need no user interaction. Furthermore, CLIJ can be run from the command line and in cloud systems using Docker.