BRL-CAD has one of the oldest and fastest parallel ray tracing implementations around, but it does not currently make much use of the GPU. Because it works with implicit geometry and constructive solid geometry (CSG) Boolean operations, its approach to ray tracing is also quite different and comes with its own set of academic challenges.
The goal of this project was to introduce a GPGPU pipeline into BRL-CAD by porting existing primitives to OpenCL and parallelizing them, so that they compute faster than the non-GPGPU implementation.
The primitives planned for conversion during the 2020 season of Google Summer of Code included the polyhedron with an arbitrary number of faces (ARBN), the FASTGEN4 CLINE, and the hyperboloid of one sheet (HYP), among others.
The following patches contain modifications that improve functioning and fix bugs encountered while rendering via OpenCL in the MGED terminal.
- OpenCL_CMAKE: https://sourceforge.net/p/brlcad/patches/549/
- OpenCL rendering bug: https://sourceforge.net/p/brlcad/patches/551/
The following patches add OpenCL versions of some of the primitives.
- CLINE: https://sourceforge.net/p/brlcad/patches/541/
- ARBN: https://sourceforge.net/p/brlcad/patches/543/
- PIPE: https://sourceforge.net/p/brlcad/patches/545/
- VOL: https://sourceforge.net/p/brlcad/patches/547/
- METABALL: https://sourceforge.net/p/brlcad/patches/548/
- HYP: https://sourceforge.net/p/brlcad/patches/553/
We will use the HYP primitive as the example here for comparing the rendering performance of the C and OpenCL versions.
First Test: Both the C and OpenCL renderings should look the same
On the left is the C rendering of the HYP primitive, and on the right is its counterpart rendered with OpenCL enabled. At first glance, they do look the same. First test passed!
Second Test: The performance of the OpenCL version should be much better than the C version
So, let's compare the times reported in the rendering logs. The C version took 0.31 seconds to render, while the OpenCL version took 0.02 seconds. That is 0.31 / 0.02, or roughly a 15x improvement in performance for this single render. Hurray!
Third Test: Checking the difference between images at a pixel level
Using BRL-CAD's pixdiff and pix-png commands, the pixdiff image shown on the right was created. The image shows where the colour values differ by more than 1 in one or more of the RGB channels. Most of the pixels are offset in intensity by just a little bit, which implies that some calculation is slightly off. However, the hits and misses appear to be correct.
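The per-pixel check described above can be sketched in a few lines of C. This is only an illustration of the idea, not the pixdiff implementation: the function name count_differing_pixels, the tolerance parameter, and the assumption of two interleaved 8-bit RGB buffers of equal size are all mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Count the pixels in which at least one of the R, G, B channels differs
 * by more than `tol` between two interleaved 8-bit RGB buffers.
 * Hypothetical helper for illustration; this is not BRL-CAD code. */
size_t
count_differing_pixels(const unsigned char *a, const unsigned char *b,
                       size_t npixels, int tol)
{
    size_t count = 0;

    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < npixels; i++) {
        for (int c = 0; c < 3; c++) {
            int d = abs((int)a[3 * i + c] - (int)b[3 * i + c]);
            if (d > tol) {
                count++;   /* one bad channel is enough to flag the pixel */
                break;
            }
        }
    }
    return count;
}
```

Calling it with tol set to 1 mirrors the "more than 1" threshold mentioned above, while tol set to 0 would also count the pixels that are off by a single step.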
Fourth Test: To confirm the hits are identical
Comparing the nirt shotlines for the two renderings showed that the hits were identical, which implies that the intensity mismatch found in the previous test is a normal issue.
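The shotline comparison can be pictured as checking two lists of ray segments against each other. The sketch below assumes a hypothetical hit_seg struct holding the entry and exit distances along the ray. It does not parse nirt output, and the names and the tolerance argument are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one ray/solid intersection segment. */
struct hit_seg {
    double in_dist;    /* distance along the ray to the entry point */
    double out_dist;   /* distance along the ray to the exit point */
};

/* Absolute difference without pulling in the math library. */
static double
abs_diff(double x, double y)
{
    return (x > y) ? (x - y) : (y - x);
}

/* Return 1 if both shotlines contain the same number of segments and every
 * entry and exit distance agrees to within `tol`; return 0 otherwise. */
int
segments_match(const struct hit_seg *a, size_t na,
               const struct hit_seg *b, size_t nb, double tol)
{
    assert(tol >= 0.0);

    if (na != nb)
        return 0;

    for (size_t i = 0; i < na; i++) {
        if (abs_diff(a[i].in_dist, b[i].in_dist) > tol)
            return 0;
        if (abs_diff(a[i].out_dist, b[i].out_dist) > tol)
            return 0;
    }
    return 1;
}
```

A tolerance of 0.0 demands exact equality. A small positive tolerance is usually the safer choice when comparing floating-point results produced on different devices.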
To Do List
- Memory allocation: One of the major challenges I faced was finding an OpenCL counterpart of bu_malloc that works when passing variables from the .c file to the .cl file. I tried plain malloc, as used in ordinary C programs, but it did not work: the OpenCL version simply produced no rendering. As a result, the HYP and SUPERELL primitives, which do not use bu_malloc, rendered without trouble, while the others did not. The patches submitted above compile without errors and show no computation errors, so once the memory allocation problem is solved they should render just as well.
- Using print statements: Another issue I could not resolve was printing from OpenCL files. It would seem that simply calling printf should work, but I could not get any output from inside the kernel. Print statements would also help with debugging the memory allocation issue, by showing where and why it fails.
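One possible direction for the memory allocation item: OpenCL kernels cannot allocate memory dynamically, and pointers obtained from a host allocator are not meaningful on the device. A common workaround is to flatten pointer-based host data into one contiguous block on the host and hand that block to the device as a buffer. The sketch below shows only the flattening step in plain C. The planes_host struct and the flatten_planes function are hypothetical and are not taken from BRL-CAD or from the patches above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-side layout: each plane equation (4 doubles) lives in
 * its own allocation, reachable only through a pointer. */
struct planes_host {
    size_t nplanes;
    double **eqn;      /* eqn[i] points to 4 doubles */
};

/* Copy all plane equations into one contiguous array of 4 * nplanes doubles.
 * Returns a malloc'd block the caller must free, or NULL when there is
 * nothing to copy or the allocation fails. *ndoubles receives the number
 * of doubles written. */
double *
flatten_planes(const struct planes_host *src, size_t *ndoubles)
{
    double *flat;

    assert(src != NULL && ndoubles != NULL);

    *ndoubles = 0;
    if (src->nplanes == 0)
        return NULL;

    flat = malloc(src->nplanes * 4 * sizeof(double));
    if (flat == NULL)
        return NULL;

    for (size_t i = 0; i < src->nplanes; i++)
        memcpy(flat + 4 * i, src->eqn[i], 4 * sizeof(double));

    *ndoubles = src->nplanes * 4;
    return flat;
}
```

The resulting block contains no pointers, so it could be copied to the device in one step, for example through clCreateBuffer with the CL_MEM_COPY_HOST_PTR flag, with the plane count passed as a separate kernel argument. Whether this fits the way the affected primitives store their data is something I have not verified.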