BRL-CAD
Parallel Processing

Cross-platform API for parallel processing, handling issues like threads and semaphores.

Modules

 Multithreading
 
 Single Instruction Multiple Data
 Single Instruction Multiple Data support.
 

Macros

#define MAX_PSW   1024
 
#define BU_SEMAPHORE_DEFINE(x)   x = bu_semaphore_register(CPP_STR(x))
 
#define BU_SETJUMP   setjmp((bu_setjmp_valid[bu_parallel_id()]=1, bu_jmpbuf[bu_parallel_id()]))
 
#define BU_UNSETJUMP   (bu_setjmp_valid[bu_parallel_id()]=0)
 

Functions

DEPRECATED int bu_is_parallel (void)
 subroutine to determine if we are multi-threaded
 
int bu_parallel_id (void)
 
void bu_nice_set (int newnice)
 process management routines
 
size_t bu_avail_cpus (void)
 
void bu_parallel (void(*func)(int func_cpu_id, void *func_data), size_t ncpu, void *data)
 
int bu_semaphore_register (const char *name)
 semaphore implementation
 
void bu_semaphore_init (unsigned int nsemaphores)
 
void bu_semaphore_free (void)
 
void bu_semaphore_acquire (unsigned int i)
 
void bu_semaphore_release (unsigned int i)
 

Variables

int BU_SEM_GENERAL
 
int BU_SEM_SYSCALL
 
int BU_SEM_MAPPEDFILE
 
int bu_setjmp_valid [MAX_PSW]
 
jmp_buf bu_jmpbuf [MAX_PSW]
 

Detailed Description

Cross platform API for parallel processing, handling issues like threads and semaphores.

Thread based parallelism routines.

Macro Definition Documentation

◆ MAX_PSW

#define MAX_PSW   1024

MAX_PSW - The maximum number of processors that can be expected on this hardware. It is used to allocate application-specific per-processor tables at compile time and represents a hard limit on the number of processors/threads that may be spawned. The actual number of available processors is found at runtime by calling bu_avail_cpus().
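
As an illustration of the compile-time table and runtime limit described above, here is a minimal sketch that does not link against BRL-CAD. Every sketch_ name and SKETCH_MAX_PSW are hypothetical stand-ins, sysconf(_SC_NPROCESSORS_ONLN) is assumed to be available as a substitute for bu_avail_cpus(), and treating a request of 0 as "use everything available" is a convention of this sketch only.

```c
#include <stddef.h>
#include <unistd.h>

#define SKETCH_MAX_PSW 1024   /* stand-in for MAX_PSW, value taken from the text above */

/* Per-thread table sized at compile time: one slot per possible thread. */
static double sketch_partial_sums[SKETCH_MAX_PSW];

/* Hypothetical stand-in for bu_avail_cpus(): ask the OS for online CPUs. */
static size_t sketch_avail_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n < 1) ? 1 : (size_t)n;
}

/* Clamp a requested thread count to the runtime CPU count and to the
 * compile-time hard limit. A request of 0 means "all available" here. */
size_t sketch_thread_count(size_t requested)
{
    size_t n = sketch_avail_cpus();
    if (requested != 0 && requested < n)
        n = requested;
    if (n > SKETCH_MAX_PSW)
        n = SKETCH_MAX_PSW;
    return n;
}

/* Return the table slot for a thread id, or NULL if the id is out of range. */
double *sketch_slot(size_t thread_id)
{
    if (thread_id >= SKETCH_MAX_PSW)
        return NULL;
    return &sketch_partial_sums[thread_id];
}
```

Sizing the table by the hard limit rather than by the runtime count wastes some memory but lets the table be a plain static array with no allocation step.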

Definition at line 45 of file parallel.h.

◆ BU_SEMAPHORE_DEFINE

#define BU_SEMAPHORE_DEFINE (   x)    x = bu_semaphore_register(CPP_STR(x))

Semaphores available for both library and application use.

Definition at line 200 of file parallel.h.
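
To show how a macro of this shape turns a variable name into a registered identifier, the following sketch reimplements the pattern with hypothetical names (SKETCH_STR, SKETCH_SEMAPHORE_DEFINE, sketch_semaphore_register). The registry is a toy, and its behavior (same name yields the same small integer, -1 when full) is an assumption made for this example rather than the documented behavior of bu_semaphore_register().

```c
#include <string.h>

#define SKETCH_STR(x) #x   /* stand-in for CPP_STR: stringify the token */
#define SKETCH_SEMAPHORE_DEFINE(x) x = sketch_semaphore_register(SKETCH_STR(x))

#define SKETCH_MAX_NAMES 16
static const char *sketch_names[SKETCH_MAX_NAMES];
static int sketch_count = 0;

/* Toy registry: map a name to a small integer. Names must outlive the
 * registry (string literals do). Returns -1 when the table is full. */
int sketch_semaphore_register(const char *name)
{
    int i;
    for (i = 0; i < sketch_count; i++) {
        if (strcmp(sketch_names[i], name) == 0)
            return i;              /* already registered */
    }
    if (sketch_count == SKETCH_MAX_NAMES)
        return -1;
    sketch_names[sketch_count] = name;
    return sketch_count++;
}

/* Application-defined semaphore identifiers (hypothetical). */
int SKETCH_SEM_CACHE;
int SKETCH_SEM_STATS;

/* Register both identifiers and return how many names are now known. */
int sketch_register_all(void)
{
    SKETCH_SEMAPHORE_DEFINE(SKETCH_SEM_CACHE);   /* expands to an assignment */
    SKETCH_SEMAPHORE_DEFINE(SKETCH_SEM_STATS);
    return sketch_count;
}
```

Because the macro expands to an assignment, it has to be used inside a function body, not at file scope.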

◆ BU_SETJUMP

#define BU_SETJUMP   setjmp((bu_setjmp_valid[bu_parallel_id()]=1, bu_jmpbuf[bu_parallel_id()]))

Definition at line 230 of file parallel.h.

◆ BU_UNSETJUMP

#define BU_UNSETJUMP   (bu_setjmp_valid[bu_parallel_id()]=0)

Definition at line 231 of file parallel.h.

Function Documentation

◆ bu_is_parallel()

DEPRECATED int bu_is_parallel ( void  )

subroutine to determine if we are multi-threaded

This subroutine is separated off from parallel.c so that bu_bomb() and others can call it, without causing either parallel.c or semaphore.c to get referenced and thus causing the loader to drag in all the parallel processing stuff from the vendor library. This routine is DEPRECATED, do not use it. If you need a means to determine when an application is running bu_parallel(), please report this to our developers.

Previously, this was a library-stateful way for bu_bomb() to tell if a parallel application is running. This routine now simply returns zero all the time, which permits BU_SETJUMP() error handling during bu_bomb().
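
The recovery pattern that BU_SETJUMP and BU_UNSETJUMP enable can be sketched with plain setjmp()/longjmp(). This is an illustration under assumptions, not the library's code: the sketch_ arrays mirror bu_setjmp_valid and bu_jmpbuf, sketch_bomb() is a hypothetical stand-in for a fatal-error routine, and the thread id is passed in explicitly instead of being read from bu_parallel_id().

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAX_PSW 8   /* small stand-in for MAX_PSW */

static int     sketch_valid[SKETCH_MAX_PSW];    /* mirrors bu_setjmp_valid */
static jmp_buf sketch_jmpbuf[SKETCH_MAX_PSW];   /* mirrors bu_jmpbuf */

/* Hypothetical fatal-error routine: jump back to the recovery point if
 * one is armed for this thread, otherwise terminate. */
static void sketch_bomb(int id, const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    if (sketch_valid[id])
        longjmp(sketch_jmpbuf[id], 1);
    abort();
}

/* Run a piece of work under a recovery point.
 * Returns 0 if the work finished, 1 if it was abandoned via longjmp. */
int sketch_guarded_work(int id, int should_fail)
{
    sketch_valid[id] = 1;                 /* arm, as BU_SETJUMP does */
    if (setjmp(sketch_jmpbuf[id]) != 0) {
        sketch_valid[id] = 0;             /* disarm, as BU_UNSETJUMP does */
        return 1;                         /* arrived here through longjmp */
    }
    if (should_fail)
        sketch_bomb(id, "simulated failure");
    sketch_valid[id] = 0;                 /* disarm on the normal path too */
    return 0;
}
```

Keeping one flag and one jump buffer per thread slot means a failure in one thread can only unwind to that thread's own recovery point.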

◆ bu_parallel_id()

int bu_parallel_id ( void  )

Returns the CPU number of the current bu_parallel() invoked thread.

◆ bu_nice_set()

void bu_nice_set ( int  newnice)

process management routines

routines for parallel processing

Machine-specific routines for portable parallel processing. Without knowing what the current UNIX "nice" value is, change to a new absolute "nice" value. (The system routine makes only a relative change.)
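
One way to turn the relative system call into an absolute setting is to read the current value first and apply the difference. The sketch below assumes a POSIX system with getpriority() and nice(); sketch_nice_set() is a hypothetical name, and this is not claimed to be how bu_nice_set() is implemented.

```c
#include <errno.h>
#include <sys/resource.h>
#include <unistd.h>

/* Set the process "nice" value to an absolute target.
 * Returns 0 on success, -1 on failure (for example, when lowering the
 * value without sufficient privilege). */
int sketch_nice_set(int newnice)
{
    int current;

    /* getpriority() can legitimately return -1, so errno must be checked. */
    errno = 0;
    current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -1;

    /* nice() takes an increment, so pass the difference to the target. */
    errno = 0;
    if (nice(newnice - current) == -1 && errno != 0)
        return -1;

    return 0;
}
```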

◆ bu_avail_cpus()

size_t bu_avail_cpus ( void  )

Return the maximum number of physical CPUs that are considered to be available to this process now.

◆ bu_parallel()

void bu_parallel ( void(*)(int func_cpu_id, void *func_data)  func,
size_t  ncpu,
void *  data 
)

Create parallel threads of execution.

This function creates (at most) 'ncpu' copies of function 'func' all running in parallel, passing 'data' to each invocation. Setting ncpu=0 requests automatic parallelization, invoking parallel threads as cores become available. This is particularly useful during recursive invocations where the ncpu core count is limited by the parent context.

Locking and work dispatching are handled by 'func' using a "self-dispatching" paradigm. This means you must manually protect shared data structures, e.g., via bu_semaphore_acquire(). Lock-free execution is often possible by creating data containers with MAX_PSW elements, as bu_parallel() will never execute more than that many threads of execution.
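
The two ideas in this paragraph, self-dispatching and per-thread containers, can be sketched without BRL-CAD by using POSIX threads directly. Everything here is an assumption made for illustration: the sketch_ names are hypothetical, a pthread mutex stands in for a semaphore, and a fixed SKETCH_NTHREADS replaces MAX_PSW. Each worker takes the next item from a shared counter under the lock, then adds its result to its own slot without locking.

```c
#include <pthread.h>

#define SKETCH_NTHREADS 4
#define SKETCH_NITEMS   100

struct sketch_work {
    pthread_mutex_t lock;                   /* protects next_item only */
    int  next_item;                         /* shared dispatch counter */
    long per_thread_sum[SKETCH_NTHREADS];   /* one slot per thread, no lock */
};

struct sketch_arg { int id; struct sketch_work *work; };

/* Same shape as a bu_parallel() callback: thread number plus shared data. */
static void sketch_worker(int id, void *data)
{
    struct sketch_work *w = (struct sketch_work *)data;
    for (;;) {
        int item;
        pthread_mutex_lock(&w->lock);       /* self-dispatch: claim an item */
        item = w->next_item++;
        pthread_mutex_unlock(&w->lock);
        if (item >= SKETCH_NITEMS)
            break;                          /* no work left */
        w->per_thread_sum[id] += item;      /* private slot, lock-free */
    }
}

static void *sketch_trampoline(void *p)
{
    struct sketch_arg *a = (struct sketch_arg *)p;
    sketch_worker(a->id, a->work);
    return NULL;
}

/* Sum 0..SKETCH_NITEMS-1 across threads; returns the combined total. */
long sketch_run(void)
{
    struct sketch_work w;
    struct sketch_arg args[SKETCH_NTHREADS];
    pthread_t tids[SKETCH_NTHREADS];
    int started[SKETCH_NTHREADS];
    long total = 0;
    int i;

    pthread_mutex_init(&w.lock, NULL);
    w.next_item = 0;
    for (i = 0; i < SKETCH_NTHREADS; i++)
        w.per_thread_sum[i] = 0;

    for (i = 0; i < SKETCH_NTHREADS; i++) {
        args[i].id = i;
        args[i].work = &w;
        started[i] = (pthread_create(&tids[i], NULL, sketch_trampoline, &args[i]) == 0);
        if (!started[i])
            sketch_worker(i, &w);           /* fall back to the calling thread */
    }
    for (i = 0; i < SKETCH_NTHREADS; i++) {
        if (started[i])
            pthread_join(tids[i], NULL);    /* wait for every worker */
    }
    pthread_mutex_destroy(&w.lock);

    for (i = 0; i < SKETCH_NTHREADS; i++)
        total += w.per_thread_sum[i];
    return total;
}
```

How the items are split between threads varies from run to run, but the combined total does not, because every item is claimed exactly once.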

All invocations of the specified 'func' callback function are passed two parameters: 1) its assigned thread number and 2) a shared 'data' pointer for application use. Threads are assigned increasing numbers, starting with zero. Threads may also call bu_parallel_id() to obtain their thread number.

Threads created with bu_parallel() may use affinity locking to keep each thread on a given physical CPU core. This behavior can be enabled at runtime by setting the environment variable LIBBU_AFFINITY=1. Note that this option may either increase or decrease performance, particularly on platforms with advanced scheduling, so testing is recommended.

This function will not return control until all invocations of the subroutine are finished.

The following is a working stand-alone example demonstrating how to call the bu_parallel() interface.

    #include <stdio.h>
    #include "bu/parallel.h"  /* header location may differ between BRL-CAD versions */

    /* Visit every cell from the calling thread. */
    void shoot_cells_in_series(int width, int height) {
        int i, j;
        for (i=0; i<height; i++) {
            for (j=0; j<width; j++) {
                printf("Shooting cell (%d, %d) on CPU %d\n", i, j, bu_parallel_id());
            }
        }
    }

    /* bu_parallel() callback: each thread handles the row matching its number. */
    void shoot_row_per_thread(int cpu, void *mydata) {
        int i, width;
        width = *(int *)mydata;
        for (i=0; i<width; i++) {
            printf("Shooting cell (%d, %d) on CPU %d\n", i, cpu, bu_parallel_id());
        }
    }

    void shoot_cells_in_parallel(int width, int height) {
        bu_parallel(shoot_row_per_thread, height, &width);
        // we don't reach here until all threads complete
    }

    int main(int ac, char *av[]) {
        int width = 4, height = 4;
        printf("\nShooting cells one at a time, 4x4 grid:\n");
        shoot_cells_in_series(width, height);
        printf("\nShooting cells in parallel with 4 threads, one per row:\n");
        shoot_cells_in_parallel(width, height);
        return 0;
    }

◆ bu_semaphore_register()

int bu_semaphore_register ( const char *  name)

semaphore implementation

Machine-specific routines for parallel processing. Primarily for handling semaphores to protect critical sections of code.

The new paradigm: semaphores are referred to, not by a pointer, but by a small integer. This module is now responsible for obtaining whatever storage is needed to implement each semaphore.

Note that these routines can't use bu_log() for error logging, because bu_log() acquires semaphore #0 (BU_SEM_SYSCALL).

◆ bu_semaphore_init()

void bu_semaphore_init ( unsigned int  nsemaphores)

Prepare 'nsemaphores' independent critical section semaphores. Die on error.

Takes the place of 'n' separate calls to old RES_INIT(). Start by allocating array of "struct bu_semaphores", which has been arranged to contain whatever this system needs.
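
A minimal sketch of the "small integer" paradigm follows: the module owns an array of locks and callers refer to them by index only. It assumes POSIX mutexes as the underlying primitive, and all sketch_ names are hypothetical. Unlike the routines documented here, which die on error and return void, this sketch returns status codes so that the behavior can be checked.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t *sketch_sems = NULL;   /* storage owned by the module */
static unsigned int sketch_nsems = 0;

/* Allocate and initialize n independent locks. Returns 0 on success. */
int sketch_semaphore_init(unsigned int n)
{
    unsigned int i;
    if (sketch_sems != NULL)
        return 0;                  /* already initialized */
    if (n == 0)
        return -1;
    sketch_sems = malloc(n * sizeof *sketch_sems);
    if (sketch_sems == NULL)
        return -1;
    for (i = 0; i < n; i++)
        pthread_mutex_init(&sketch_sems[i], NULL);
    sketch_nsems = n;
    return 0;
}

/* Enter critical section i. Returns -1 for an index that was never set up. */
int sketch_semaphore_acquire(unsigned int i)
{
    if (i >= sketch_nsems)
        return -1;
    return pthread_mutex_lock(&sketch_sems[i]) == 0 ? 0 : -1;
}

/* Leave critical section i. */
int sketch_semaphore_release(unsigned int i)
{
    if (i >= sketch_nsems)
        return -1;
    return pthread_mutex_unlock(&sketch_sems[i]) == 0 ? 0 : -1;
}

/* Destroy all locks and release the array. */
int sketch_semaphore_free(void)
{
    unsigned int i;
    for (i = 0; i < sketch_nsems; i++)
        pthread_mutex_destroy(&sketch_sems[i]);
    free(sketch_sems);
    sketch_sems = NULL;
    sketch_nsems = 0;
    return 0;
}
```

Hiding the storage behind an index keeps platform-specific lock types out of caller code, at the cost of a bounds check on each call.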

◆ bu_semaphore_free()

void bu_semaphore_free ( void  )

Release all initialized semaphores and any associated memory.

FIXME: per hacking, rename to bu_semaphore_clear()

◆ bu_semaphore_acquire()

void bu_semaphore_acquire ( unsigned int  i)

◆ bu_semaphore_release()

void bu_semaphore_release ( unsigned int  i)

Variable Documentation

◆ BU_SEM_GENERAL

int BU_SEM_GENERAL
extern

This semaphore is intended for short-lived protection.

It is provided for both library and application use, for protecting code that doesn't call into a BRL-CAD library.

◆ BU_SEM_SYSCALL

int BU_SEM_SYSCALL
extern

This semaphore is intended to protect general system calls.

It is provided for both library and application use, for protecting code that doesn't call into a BRL-CAD library.

◆ BU_SEM_MAPPEDFILE

int BU_SEM_MAPPEDFILE
extern

FIXME: this one shouldn't need to be global.

◆ bu_setjmp_valid

int bu_setjmp_valid[MAX_PSW]
extern

◆ bu_jmpbuf

jmp_buf bu_jmpbuf[MAX_PSW]
extern