Latest revision |
Your text |
Line 248: |
Line 248: |
| * This change to the code significantly improved performance, as the following table shows: | | * This change to the code significantly improved performance, as the following table shows: |
| | | |
− | (Running the OpenCL over the NVIDIA OpenCL SDK on GPU - Debug build)
| |
| [[File:time_comparison.png|frame|center|x200px]] | | [[File:time_comparison.png|frame|center|x200px]] |
| | | |
Line 279: |
Line 278: |
| * Here is a table with a time comparison between the code in the trunk, the code with the new tree structure and the code currently in the opencl branch (RPN tree): | | * Here is a table with a time comparison between the code in the trunk, the code with the new tree structure and the code currently in the opencl branch (RPN tree): |
| | | |
− | (Release build)
| |
| [[File:comparison.png|frame|center|x200px]] | | [[File:comparison.png|frame|center|x200px]] |
| | | |
Line 288: |
Line 286: |
| * Made some optimizations in the ANSI C bool_eval() prototype and now the new code is slightly faster than the code currently in the trunk! This difference is more noticeable when rendering the havoc scene with the command "rt -s2048": 2.53sec vs 2.18sec. | | * Made some optimizations in the ANSI C bool_eval() prototype and now the new code is slightly faster than the code currently in the trunk! This difference is more noticeable when rendering the havoc scene with the command "rt -s2048": 2.53sec vs 2.18sec. |
| | | |
− | (Release build)
| |
| [[File:time.png|frame|center|x200px]] | | [[File:time.png|frame|center|x200px]] |
| | | |
Line 307: |
Line 304: |
| * Here is a table with the time comparison of the previous implementation (bool_eval() with RPN tree) and the current version with the new tree representation: | | * Here is a table with the time comparison of the previous implementation (bool_eval() with RPN tree) and the current version with the new tree representation: |
| | | |
− | (running the OpenCL over the Intel OpenCL SDK on CPU - Release build) | + | (running the OpenCL over the Intel OpenCL SDK on CPU) |
| [[File:new_tree_times.png|frame|center|x200px]] | | [[File:new_tree_times.png|frame|center|x200px]] |
| | | |
Line 317: |
Line 314: |
| | | |
| * Committed the new code to perform boolean evaluation in the opencl branch code (https://sourceforge.net/p/brlcad/code/70074/) | | * Committed the new code to perform boolean evaluation in the opencl branch code (https://sourceforge.net/p/brlcad/code/70074/) |
− |
| |
− | === 14 August ===
| |
− |
| |
− | * Discussed the plan for the coming weeks with my mentor, Vasco, via Skype.
| |
− |
| |
− | * Will clean up the current code and prepare a patch ticket against the trunk with the new CSG boolean evaluation in OpenCL.
| |
− |
| |
− | * Next, I will start changing the rendering loop of the OpenCL code to follow the behaviour of the ANSI C code, where boolean evaluation is performed in a partial fashion.
| |
− |
| |
− | === 15 August ===
| |
− |
| |
− | * Cleaning and refactoring the code
| |
− |
| |
− | * Preparing patch against trunk code
| |
− |
| |
− | === 17 August ===
| |
− |
| |
− | * Submitted patch with code to perform boolean evaluation of CSG with OpenCL (https://sourceforge.net/p/brlcad/patches/474/)
| |
− |
| |
− | * Started working on new rendering loop, where the 'store_segs', the 'rt_boolweave' and 'rt_boolfinal' kernels will be merged into a single kernel, so the weave of segments and evaluation of partitions can take place as soon as new segments are created.
| |
− |
| |
− | === 18 August ===
| |
− |
| |
− | * Merged the 'store_segs' and 'rt_boolweave' kernels into a new kernel. Still trying to avoid re-weaving segments that have already been woven into the ray.
| |
− |
| |
− | * Planning to add the 'rt_boolfinal' kernel next to follow the behaviour of the ANSI C code.
| |
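One way to avoid re-weaving segments is to keep a per-ray cursor that records how many stored segments have already been woven, so each weave pass only touches the new ones. The following is a minimal sketch of that idea in plain C, not the code in the opencl branch: `demo_seg`, `demo_ray_state`, `demo_store_seg` and `demo_weave_new` are hypothetical names, and the "weave" here is only a sorted insertion by in-distance, whereas the real 'rt_boolweave' also splits overlapping intervals into partitions.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_SEGS 64

/* Hypothetical, simplified segment: entry and exit distance along the ray. */
struct demo_seg {
    double in_dist;
    double out_dist;
};

/* Hypothetical per-ray state (not BRL-CAD's data structures). */
struct demo_ray_state {
    struct demo_seg segs[DEMO_MAX_SEGS]; /* segments in order of discovery */
    size_t n_segs;                       /* number of stored segments */
    size_t n_woven;                      /* segs[0..n_woven) are already woven */
    size_t order[DEMO_MAX_SEGS];         /* woven segment indices, sorted by in_dist */
};

/* Stand-in for 'store_segs': append one segment to the ray. */
void demo_store_seg(struct demo_ray_state *rs, double in_dist, double out_dist)
{
    assert(rs->n_segs < DEMO_MAX_SEGS);
    rs->segs[rs->n_segs].in_dist = in_dist;
    rs->segs[rs->n_segs].out_dist = out_dist;
    rs->n_segs++;
}

/*
 * Simplified stand-in for a weave pass: only segments stored since the
 * previous call are inserted into the sorted order.
 * Returns the number of segments woven by this call.
 */
size_t demo_weave_new(struct demo_ray_state *rs)
{
    size_t woven_now = 0;

    while (rs->n_woven < rs->n_segs) {
        size_t s = rs->n_woven;   /* index of the next unwoven segment */
        size_t pos = rs->n_woven; /* 'order' currently holds n_woven entries */

        /* Shift entries with a larger in_dist one slot to the right. */
        while (pos > 0 &&
               rs->segs[rs->order[pos - 1]].in_dist > rs->segs[s].in_dist) {
            rs->order[pos] = rs->order[pos - 1];
            pos--;
        }
        rs->order[pos] = s;
        rs->n_woven++;
        woven_now++;
    }
    return woven_now;
}
```

Calling `demo_weave_new()` again without storing new segments does no work, which is the property the merged kernel needs.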
− |
| |
− | === 21 - 25 August ===
| |
− |
| |
− | * Changed the 'rt_boolfinal' function so that it can be merged into the single kernel (rt_shootray kernel: store_segs + rt_boolweave + rt_boolfinal).
| |
− |
| |
− | * Fixed a bug that caused the ray tracing to crash with this new system, by storing the index of the head of the partition list for the current ray and passing it as an argument to the rt_boolweave function.
| |
− |
| |
− | * Trying to fix the problem of some partitions being evaluated too early, which caused some pixels to be shaded with the wrong partition.
| |
− |
| |
− | * In the end, I couldn't figure out a way to weave and evaluate partitions incrementally using the BVH (bounding volume hierarchy), because the BVH nodes aren't in spatial order. This optimization was promising, and it should work if we change the code to store the nodes in a spatial subdivision structure like the kd-tree, as the ANSI C code does.
| |
− |
| |
− | * Since we process all the hits before weaving segments and evaluating partitions, there is no need to keep evaluating partitions after the first opaque partition in the ray is evaluated. Because the partitions are already ordered by their in_hit point, and because all segments of the ray have been processed, no later partition can be closer to the ray origin than the first one evaluated, so it is unnecessary and expensive to keep evaluating partitions for the ray.
| |
− |
| |
− | * By stopping the evaluation of partitions as soon as the first opaque partition of the ray is found, the performance of the OpenCL code increased significantly, as we can see in the following table:
| |
− |
| |
− | (running the OpenCL over the Intel OpenCL SDK on CPU - Release build)
| |
− | [[File:times.png|frame|center|x200px]]
| |
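The early exit described above can be sketched in plain C as follows. This is a minimal illustration under stated assumptions, not the kernel code from the opencl branch: `demo_partition` and `first_opaque_partition` are hypothetical names, and the `opaque` flag stands in for the (expensive) boolean evaluation of a partition. The sketch relies on the same precondition as the text: the partitions are already sorted by their in-hit distance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified partition (not BRL-CAD's struct partition). */
struct demo_partition {
    double in_dist;  /* distance along the ray to the in-hit point */
    double out_dist; /* distance along the ray to the out-hit point */
    bool opaque;     /* stand-in for the result of evaluating the partition */
};

/*
 * Walks the partitions front to back and stops at the first opaque one.
 * Precondition: parts[] is sorted by in_dist in ascending order.
 * Returns the index of the first opaque partition, or -1 if there is none.
 * *n_evaluated receives the number of partitions that were examined.
 */
int first_opaque_partition(const struct demo_partition *parts, size_t n,
                           size_t *n_evaluated)
{
    size_t i;

    *n_evaluated = 0;
    for (i = 0; i < n; i++) {
        /* Check the ordering the early exit depends on. */
        assert(i == 0 || parts[i - 1].in_dist <= parts[i].in_dist);

        (*n_evaluated)++;
        if (parts[i].opaque)
            return (int)i; /* nothing behind this partition can be closer */
    }
    return -1;
}
```

With four partitions of which the second is opaque, only two are examined; the remaining ones are skipped, which is where the saving comes from.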
− |
| |
− | === GSoC17 is Over!! ===
| |
− |
| |
− | * Google Summer of Code 2017 comes to an end! It was an amazing experience and I couldn't be happier with this first introduction to open source software development!
| |
− |
| |
− | * I would like to thank the BRL-CAD community for giving me this opportunity and for always being available to help!
| |
− |
| |
− | * A special thanks to Vasco Costa, for the great mentoring and guidance through the summer!!
| |
− |
| |
− | * Here is the work product link that I submitted: https://github.com/MarcoSDomingues/GSoC17
| |