Investigate performance of setting thread affinityBRL-CAD
Status: ClosedTime to complete: 72 hrs Mentors: SeanTags: performance, multithreading, smp, benchmark

BRL-CAD's raytrace library (LIBRT) is pervasively multithreaded using routines defined in our basic utility library (LIBBU) for detecting an using multiple CPUs/cores/threads. On Linux, BSD, or Mac OS X, you can set the affinity of a process to stay on a processor.

This task involves making minor modifications to the LIBBU parallel interface using sched_setaffinity and/or pthread_attr_setaffinity_np (or similar affinity mechanism depending on the platform) and then evaluating the performance impact using our BRL-CAD Benchmark suite ('benchmark' command).

Code:

  • src/libbu/parallel.c
  • src/libbu/semaphore.c
Uploaded Work
File name/URLFile sizeDate submitted
parallel.patch2.4 KBDecember 09 2012 18:32 UTC
aff.tar.gz5.0 KBDecember 10 2012 13:07 UTC
new.patch2.8 KBDecember 10 2012 16:16 UTC
aaf_new.tar.gz4.5 KBDecember 10 2012 16:31 UTC
aff_latest.tar.gz5.0 KBDecember 10 2012 16:43 UTC
aaf_latest.tar.gz4.7 KBDecember 10 2012 17:06 UTC
newp.patch2.9 KBDecember 11 2012 11:35 UTC
fixed.patch2.8 KBDecember 11 2012 16:28 UTC
per3.patch3.8 KBDecember 12 2012 18:45 UTC
per2.patch3.8 KBDecember 12 2012 18:45 UTC
per3.patch3.9 KBDecember 12 2012 18:49 UTC
per2.patch3.9 KBDecember 12 2012 18:49 UTC
per3.patch3.9 KBDecember 12 2012 18:51 UTC
per2.patch3.9 KBDecember 12 2012 18:51 UTC
Comments
Skriptkidon December 8 2012 19:15 UTCTask Claimed

I would like to work on this task.

Sean on December 8 2012 19:26 UTCTask Assigned

This task has been assigned to Skriptkid. You have 72 hours to complete this task, good luck!

Skriptkidon December 9 2012 17:54 UTCCalling the function

I'm writing a funtion which does the job of setting the affinity mask. But where is this function called. And where am I supposed to write this? In parallel.c or semaphore.c?

Skriptkidon December 9 2012 18:33 UTCReady for review

The work on this task is ready to be reviewed.

Skriptkidon December 9 2012 18:33 UTCMake

the flag '-lpthread' has to be passed to gcc. Otherwise, this won't compile.

Daniel Rossberg on December 10 2012 10:19 UTCCould you please provide the output of 'benchmark' too

I.e. we need 2 outputs: benchmark with and without your changes to parallel.c

Daniel Rossberg on December 10 2012 10:19 UTCTask Needs More Work

One of the mentors has sent this task back for more work. Talk to the mentor(s) assigned to this task to satisfy the requirements needed to complete this task, submit your work again and mark the task as complete once you re-submit your work.

Skriptkidon December 10 2012 10:59 UTCFunction

Where should the function for setting the affinity be called in parallel.c? At the beginning? At the end?

Skriptkidon December 10 2012 13:07 UTCReady for review

The work on this task is ready to be reviewed.

Daniel Rossberg on December 10 2012 13:29 UTCI looks like your patch works in general

However ther are some shortfalls



  • You shouldn't include pthread.h again.  Especially not protected with defines.  The code will be compiled with MS Visual C/C++ too.

  • The usual whitespace, tabs vs. blanks problem ... (you know, what I'm talking about?)

Daniel Rossberg on December 10 2012 13:29 UTCTask Needs More Work

One of the mentors has sent this task back for more work. Talk to the mentor(s) assigned to this task to satisfy the requirements needed to complete this task, submit your work again and mark the task as complete once you re-submit your work.

Skriptkidon December 10 2012 15:55 UTCDidn't get the first part

I didnt understand the first part. If i don't include pthread.h, it won't compile.

Skriptkidon December 10 2012 16:16 UTCReady for review

The work on this task is ready to be reviewed.

Skriptkidon December 10 2012 16:17 UTCSmall edit to note

Now the function has been moved to parallel.c. they are not in worker.c or view.c. Just one function prototype in parallel.c

Daniel Rossberg on December 10 2012 16:24 UTCThere are still some issues


  • the line-indents are wrong (8 blanks = 1 tab)

  • undo your change to semaphore.c: you only added 4 spaces

  • remove the #define _GNU_SOURCE from parallel.c

Daniel Rossberg on December 10 2012 16:24 UTCTask Needs More Work

One of the mentors has sent this task back for more work. Talk to the mentor(s) assigned to this task to satisfy the requirements needed to complete this task, submit your work again and mark the task as complete once you re-submit your work.

Skriptkidon December 10 2012 16:32 UTCaaf_new

aaf_new.tar.gz is the latest. The specified fixes have been done. Also, a small change to the note. The function prototype is in worker.c and view.c no more, but in parallel.c. It is only called in worker.c and view.c

Skriptkidon December 10 2012 16:32 UTCReady for review

The work on this task is ready to be reviewed.

Skriptkidon December 10 2012 16:44 UTCLatest

aaf_latest.tar.gz is the latest. **** DO NOT USE ANYTHING SUBMITTED BEFORE THAT. IGNORE ALL OTHER FILES **** Sorry for any confusion.

Skriptkidon December 10 2012 17:11 UTC#define _GNU_SOURCE

Removed it, but I doubt it will compile without that statement. It doesn't on my computer. Shall I include it in the #ifdef HAVE_PTHREAD_H or #ifdef linux condition block?

Sean on December 11 2012 03:22 UTCfew issues

Skriptkid, the indentation and style is STILL wrong... :) Please review our HACKING file for details, but open curly braces go on the same line as their statements (except for functions) and sequences of 8 spaces need to be tabs (we use tab+space to indent levels are 4 spaces, 1 tab, 1tab+4spaces, 2 tabs, 2tabs+4spaces, etc).  Getting the style right isn't optional.


The implementation seems reasonable, but the testing seems a bit flawed.  Setting affinity within a VM isn't likely to make a difference (and the benchmark numbers reflect that).  Moreover, the patch only includes changes to parallel.c and not any other files you changed.  Run "svn diff" from the top src directory to get a patch for all changes.


 

Sean on December 11 2012 03:22 UTCTask Needs More Work

One of the mentors has sent this task back for more work. Talk to the mentor(s) assigned to this task to satisfy the requirements needed to complete this task, submit your work again and mark the task as complete once you re-submit your work.

Sean on December 11 2012 03:22 UTCDeadline extended

The deadline of the task has been extended with 1 days and 0 hours.

Skriptkidon December 11 2012 11:38 UTCBenchmark

Could you tell me if the patch is alright now? ('newp.patch') I'll run a new benchmark and submit that too.(Because I'm downloading brlcad source again. Can't port it from VM.) Also, is it okay if I test it in a Mac? Using make bechmark, of coure.

Skriptkidon December 11 2012 11:38 UTCReady for review

The work on this task is ready to be reviewed.

Daniel Rossberg on December 11 2012 12:57 UTCAlmost

I refer here to the lines in your "newp.patch" file.  Open it with a text editor and you'll see what I mean:



  • 10: 4 unnecessary blanks

  • 14, 15: 4 unnecessary blanks (why did you added blanks here?)

  • 36: 4 unnecessary blanks

  • 70: a blank after the "for" is missing (line 659ff in HACKING)

  • 73: 4 unnecessary blanks

  • 76: a blank after the "if" is missing 

  • 79: 1 unnecessary tab

  • 80: a blank after the "for" is missing

  • 81: a blank after the "if" is missing

  • 82, 83: 4 blanks for an additional indent are missing

  • 93: a blank after the "if" is missing

  • 96: 4 unnecessary blanks

  • 97: the tab should be replaced by 4 blanks (only one indent here)

Daniel Rossberg on December 11 2012 12:58 UTCTask Needs More Work

One of the mentors has sent this task back for more work. Talk to the mentor(s) assigned to this task to satisfy the requirements needed to complete this task, submit your work again and mark the task as complete once you re-submit your work.

Skriptkidon December 11 2012 16:28 UTCReady for review

The work on this task is ready to be reviewed.

Skriptkidon December 11 2012 16:28 UTCFixed

Fixed all that you said. I'm getting a 'malformed patch.' Tell me if you're getting it too. I'll look into that.

Sean on December 11 2012 20:43 UTCpatch applies

The patch applied fine for me but there are several compilation problems.  How are you compiling?  If you've disabled strict compilation, you need to re-enable that and fix the issues reported.  There are several issues, so perhaps try compilation on Mac as well as that should uncover most of them (Mac does not even have the pthread affinity functions() so you'll have to wrap the implementation in #ifdef protections and probably add a function test to the top-level CMakeLists.txt file.


 

Sean on December 11 2012 20:43 UTCTask Needs More Work

One of the mentors has sent this task back for more work. Talk to the mentor(s) assigned to this task to satisfy the requirements needed to complete this task, submit your work again and mark the task as complete once you re-submit your work.

Sean on December 11 2012 20:43 UTCDeadline extended

The deadline of the task has been extended with 1 days and 0 hours.

Sean on December 11 2012 22:23 UTChot diggity

So Skriptkid, a little encouragement perhaps.  I applied your patch and did a real quick-fix for the various build issues I alluded to (which you should still fix properly).  I then tested the performance impact on a real compute system.


On a 64-processor POWER7 beast, I'm seeing our *debug* benchmark results going from around 44000 VGRs to around 67000 VGRs.  That's more than a 50% performance increase, but wait, it gets better...


Our *optimized* production benchmark results increase from around 121000 VGRs to 185000 VGRs!


That. Is. Awesome.


Your patch is going to make some big attention in our next major release.  I'll also be adding extra tasks to implement thread affinity for BSD and Windows.

Skriptkidon December 12 2012 18:41 UTCAWESOME!

Amazing to hear that! :D Look at per3.patch and per2.patch. I've made some fixes. I've added #if defined()s. So only if pthread support and gnu source are found, that function will be run, else, it won't even have a prototype.(Tell me if there are styke coding issues with the preprocessor commands, I saw the AHCKING file, but fonudn nothing about them. I had to see other source files and make deductions.) Also, I'm really sorry for the late progress. I have my semester exams.

Skriptkidon December 12 2012 18:48 UTCDifference

Difference between the two patches is that per2.patch contains some unecessary addition and deletion(of the same) lines. I know that has to be fixed, but just use this to test the compilation. per3.patch has all those issues addressed.(But maybe malformed.) And the changes made compile even in strict mode. Checked it. probably, no '#define _GNU_SOURCE' was causing compilation failure(I was asked to remove it, but now I've put it an #if statement.)

Skriptkidon December 12 2012 18:50 UTC** LAST TWO PATCHES. DO NOT USE FIRST TWO ***

Use the last two per2.patch and per3.patch. Not the first two.

Skriptkidon December 12 2012 18:51 UTCReady for review

The work on this task is ready to be reviewed.

Sean on December 13 2012 07:23 UTCstray directive but looks good

You left a stray #define GNU_SOURCE in there, but it otherwise looks okay.  The preprocessor work needs some work (the API function should always exist) and it's probably worthwhile to just hook it in behind all bu_parallel() calls, but that change can be made on commit.  Thanks for your efforts!


If you have access to other platforms, there are two new affinity tasks to implement support for Windows and BSD now because of your success.

Sean on December 13 2012 07:23 UTCTask Closed

Congratulations, this task has been completed successfully.

Skriptkidon December 13 2012 11:24 UTCSure

Thank you :) I have access to Windows, so yes, I'll take it up. But I'm unable to find it on the tasks list. Is it still on the way?

Skriptkidon December 13 2012 11:55 UTC(ASAP)

Please inform me ASAP. I really wanna take that up.

Sean on December 13 2012 14:48 UTCirc etiquette

 


Joining IRC, asking a question, and leaving is like walking into a room, farting, and leaving!  :)


As you can imagine, that's not very polite.  :)  IRC isn't just for you to use when it's convenient to get a quick response.  If you're going to join, you'll need to wait for a response.  Here or our mailing list are the places to ask without waiting.  I mention this because you've done the ask-and-bolt several times now.


The other tasks are not yet posted because this patch has to be integrated first.  The follow-up tasks will need to build on that committed code so we don't have integration work.  If you want to clean up this patch with the things I mentioned, that would certainly speed things along for posting the other two affinity tasks:



  • hooking behind bu_parallel(), not the individual worker functions

  • unwrapping the new function so it's always available

  • documenting the function properly (missed that mistake before closing the task)

  • pulling the static globals into the callback in parallel.c

  • creating a separate callback that takes an arg

  • make the one with static data call the new arg one

  • updating the callback to utilize the arg parameter, pass it during pthread_create()

  • update the other platforms to do the same where we can, and finally

  • sorting out what to do with _GNU_SOURCE (since just testing means it'll never be set)


Basically, just a little bit of cleanup in that file.  If you start working on this, let me know and I can create a task for it, but you'll need to be meticulous and careful since the goal is proper cleanup. :)


 

Skriptkidon December 13 2012 16:57 UTCSorry :(

I'm really sorry about that. Won't happen again. Also, I'll start the cleanup. Where do I upload the patches since this is closed?

Sean on December 13 2012 20:04 UTCnew task

If you think you're up to it, http://www.google-melange.com/gci/task/view/google/gci2012/8051203


As most of those changes involve cleaning up parallel.c which is a critical piece of our code, don't hesitate to discuss your progress and ask questions.  We don't want to have to iterate a dozen times through the GCI interface (especially on whitespace/style issues!) if we don't have to.


 

Skriptkidon December 14 2012 16:20 UTCbu_parallel

Where do I call bu_set_affinity in bu_parallel? At the beginning of the function?

Sean on December 14 2012 17:52 UTCread through what bu_parallel() does

You'll need to read through and try to understand what bu_parallel() does, at least for the pthreads case.  Pay particular attention to the pthread_create() function and what every parameter means (man pthread_create).