Editing User:Marco-domingues/GSoC17/Log

From BRL-CAD

Line 248: Line 248:
 
* This change to the code significantly improved performance, as the following table shows:
  
(Running the OpenCL code with the NVIDIA OpenCL SDK on the GPU - Debug build)
 
 
[[File:time_comparison.png|frame|center|x200px]]
 
  
Line 279: Line 278:
 
* The following table compares the times of the code in the trunk, the code with the new tree structure, and the code currently in the opencl branch (RPN tree):
  
(Release build)
 
 
[[File:comparison.png|frame|center|x200px]]
 
  
Line 288: Line 286:
 
* Made some optimizations to the ANSI C bool_eval() prototype, and the new code is now slightly faster than the code currently in the trunk! The difference is more noticeable when rendering the havoc scene with the command "rt -s2048": 2.53 s vs 2.18 s.
  
(Release build)
 
 
[[File:time.png|frame|center|x200px]]
 
  
Line 307: Line 304:
 
* The following table compares the times of the previous implementation (bool_eval() with the RPN tree) and the current version with the new tree representation:
  
(Running the OpenCL code with the Intel OpenCL SDK on the CPU - Release build)
 
[[File:new_tree_times.png|frame|center|x200px]]
 
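For readers unfamiliar with the term, the sketch below illustrates what evaluating a boolean (CSG) expression stored in reverse Polish notation generally looks like. It is not the bool_eval() code from the trunk or the opencl branch: the token encoding, the names rpn_bool_eval and RPN_STACK_MAX, and the inside[] array are all assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token encoding for this sketch only: values >= 0 are
 * solid indices, negative values are boolean operators. */
enum { OP_UNION = -1, OP_INTERSECT = -2, OP_SUBTRACT = -3 };

#define RPN_STACK_MAX 64

/* Evaluate a CSG expression given in reverse Polish notation.
 * inside[i] is nonzero when the ray is inside solid i for the
 * partition being tested. Returns 1 if the expression holds, else 0. */
int
rpn_bool_eval(const int *rpn, size_t len, const int *inside)
{
    int stack[RPN_STACK_MAX];
    size_t sp = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        int tok = rpn[i];
        if (tok >= 0) {
            /* operand: push whether the ray is inside this solid */
            assert(sp < RPN_STACK_MAX);
            stack[sp++] = (inside[tok] != 0);
        } else {
            int rhs, lhs;
            /* operator: pop two operands, push the combined result */
            assert(sp >= 2);
            rhs = stack[--sp];
            lhs = stack[--sp];
            switch (tok) {
                case OP_UNION:     stack[sp++] = (lhs || rhs);  break;
                case OP_INTERSECT: stack[sp++] = (lhs && rhs);  break;
                case OP_SUBTRACT:  stack[sp++] = (lhs && !rhs); break;
                default:           assert(0); stack[sp++] = 0;  break;
            }
        }
    }
    /* a well-formed expression leaves exactly one value on the stack */
    assert(sp == 1);
    return stack[0];
}
```

With this encoding, the expression "(solid 0 union solid 1) minus solid 2" would be written as the token sequence 0, 1, OP_UNION, 2, OP_SUBTRACT.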
  
Line 343: Line 340:
  
 
* Planning to add the 'rt_boolfinal' kernel next, to follow the behaviour of the ANSI C code.
 
=== 21 - 25 August ===
 
 
* Changed the 'rt_boolfinal' function so that it can be merged into the single kernel (rt_shootray kernel: store_segs + rt_boolweave + rt_boolfinal).
 
 
* Fixed a bug that crashed the ray tracer under this new system, by storing the index of the head of the partitions for the current ray and passing it as an argument to the rt_boolweave function.
 
 
* Tried to fix a problem where some partitions were evaluated too early, which caused some pixels to shade the wrong partitions.
 
 
* In the end, I couldn't find a way to weave and evaluate partitions incrementally using the BVH (bounding volume hierarchy), because the BVH nodes aren't in spatial order. The optimization looked promising, and it should work if the code is changed to store the nodes in a spatial subdivision structure such as a kd-tree, as the ANSI C code does.
 
 
* Since all the hits are processed before weaving segments and evaluating partitions, there is no need to keep evaluating partitions once the first opaque partition on the ray has been evaluated. The partitions are already ordered by their in_hit point and all segments of the ray have been processed, so no partition closer to the ray origin can appear afterwards. Continuing to evaluate partitions for that ray is therefore unnecessary and expensive.
 
 
* Stopping the evaluation once the first partition on the ray has been evaluated significantly improved the performance of the OpenCL code, as the following table shows:
 
 
(Running the OpenCL code with the Intel OpenCL SDK on the CPU - Release build)
 
[[File:times.png|frame|center|x200px]]
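The early exit described in the 21 - 25 August entries can be sketched in plain C as follows. This is only an illustration of the idea, not the kernel code from the opencl branch: struct partition, eval_partition() and first_hit_partition() are hypothetical names, and the real boolean evaluation is replaced here by a trivial placeholder test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition record for this sketch; not BRL-CAD's struct. */
struct partition {
    double in_dist;  /* distance from the ray origin to the entry hit */
    int    region;   /* placeholder: -1 means no region claims it */
};

/* Placeholder for the (expensive) boolean evaluation of one partition.
 * Assumed for this sketch: a partition counts as a hit when some
 * region claims it. */
int
eval_partition(const struct partition *pp)
{
    return pp->region >= 0;
}

/* Walk partitions sorted by ascending in_dist and stop at the first one
 * that evaluates as a hit. Returns its index, or -1 if none does.
 * If n_evaluated is not NULL, it receives the number of evaluations
 * that were actually performed. */
int
first_hit_partition(const struct partition *parts, size_t n,
                    size_t *n_evaluated)
{
    size_t i;
    size_t count = 0;
    int found = -1;

    for (i = 0; i < n; i++) {
        /* the early exit is only valid when the input is sorted */
        assert(i == 0 || parts[i - 1].in_dist <= parts[i].in_dist);
        count++;
        if (eval_partition(&parts[i])) {
            found = (int)i;
            break;  /* nothing closer to the ray origin can follow */
        }
    }
    if (n_evaluated != NULL)
        *n_evaluated = count;
    return found;
}
```

The saving comes from the break: every partition behind the first hit is skipped, so the number of evaluations per ray depends on how early the first hit occurs rather than on the total number of partitions.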
 
 
=== GSoC17 is Over!! ===
 
 
* Google Summer of Code 2017 comes to an end! It was an amazing experience and I couldn't be happier with this first introduction to open source software development!
 
 
* I would like to thank the BRL-CAD community for giving me this opportunity and for always being available to help!
 
 
* A special thanks to Vasco Costa, for the great mentoring and guidance through the summer!!
 
 
* Here is the work product link that I submitted: https://github.com/MarcoSDomingues/GSoC17
 
