Research offset macro consolidation

Research offset macro consolidationBRL-CAD

Status: ClosedTime to complete: 72 hrs Mentors: SeanTags: research, offsetof, consolidation

BRL-CAD's utility library provides two offset wrapper functions bu_offsetof() and bu_offsetofarray() that determine the offset into a structure. Investigate what is required to remove the latter array-variant. This should be doable by either consolidating all calls into bu_offsetof() and changing the callers (e.g., offsetof+sizeof offset), OR by consolidating both macros into a new interface.

This is a portability minefield, so try to find a proof-of-concept approach first. Submit a demo .c file that just includes bu.h and shows that the behavior is preserved.

Uploaded Work

File name/URL	File size	Date submitted
clang_make.log	15.0 KB	December 30 2012 16:41 UTC

Comments

Cezaron December 30 2012 10:35 UTCTask Claimed

I would like to work on this task.

Harmanpreet Singh on December 30 2012 10:59 UTCTask Assigned

This task has been assigned to Cezar. You have 72 hours to complete this task, good luck!

Cezaron December 30 2012 17:46 UTCclang_make.log

clang_make.log contains a summary of the complaints clang generates, but I still don't understand the offsetofarray issue. I tried something like "struct t { char a[20]; } print(offsetof(struct t, a)) print(offsetof(struct t, a[0])), and I got the same values. When are they supposed to be different?

And on the previous task, you said "you don't need a function to get the size of an array element. sizeof(struct t) would do it", but I don't really understand this. Instead of sizeof(a[0]), I should write sizeof(int) (if I have an array of ints)?

Sean on December 31 2012 05:25 UTCmaybe OBE

Frankly, with non kr compilation, there may no longer be a real distinction between getting the offset of the first element and the name of the array. If that's the case for a variety of compilers, then the array macro may simply get replaced with the non-array version. That said, this could be more complex than it seems on the surface. Try creating a struct with bitfields and non-byte-aligned values so that internal padding may result. Then see if the offsets are still the same.

As for your second question, yes. sizeof() is meant to be sizeof(type) where type is inferred if you provide a variable but it's still based around the type, not the variable/address/whatever.

Cezaron December 31 2012 13:07 UTCCode

I've written a small program to test offsetof with bitfields. I tried different numbers, placing the char array in different places, and compiling with both gcc 4.2 and llvm 3.1, and the numbers were always equal http://slexy.org/view/s20ovFFvuX. As for non-byte-aligned values, I tried adding a `#pragma pack` before the struct and also got the same two numbers. And I read this http://c-faq.com/struct/align.esr.html. As far as I understand, it says that newer compilers don't need to worry about this. Is it relevant? Seems a bit old.

Sean on December 31 2012 16:12 UTCthat's actually the concern

That link you found is exactly a good summary of the various issues. Yep, it's a bit old but then so is our code and that offset code predates even C standardization ... :)

What I don't know is whether the C89 (or C99 or POSIX.1 or some other standard) say anything about the actual alignment offsets for odd hardware (e.g., 36bit hardware). That is to say whether compilers just happen to all behave sanely or whether a standard requires that they behave.

That said, your investigation is enough compelling motivation to move forward. Try an array of non-char (e.g., int, long, and float), and if you get the same behavior, we can close this out (try 32-bit and 64-bit compilation if you can). We can create a follow-on task to remove the bu_offsetofarray if that comes up clean.

Cezaron December 31 2012 16:40 UTCTitle

I found another discussion here http://stackoverflow.com/questions/713963/why-does-this-c-code-work/714206#714206 As far as I understand, the compiler is free to define offsetof however it wants, so `#define offsetof(t, m) 4` would be perfectly valid, although of not much help.

I tested my code with int, long, long long, float, double and long double plus their unsigned variants and got satisfying results on both os x (64-bit, clang and gcc) and linux (mint, 32-bit, gcc 4.7). But if we replace calls to bu_offsetofarray with calls to bu_offsetof and something breaks, shouldn't we see it in `make test` or `make regress`?

Sean on January 1 2013 22:35 UTCnot quite

Conveniently, Leffler in that reply thread provides the exact language from one of the standards (presumably C89, but possibly C99). Your example would only be valid if that's the offset to the structure member. The standard does say what, just not how.

All that means is that our code can finally start to make some assumptions (if it's C89). Our implementation far predates standardization and all of the issues I'm aware of were on pre-C89 KR compilers that did some really wonky things.

Yes, we should see breakage -- hard runtime crashes -- for a compiler that doesn't actually provide a sane offset value. All the more reason to close out this research and take steps towards simplifying code.

Cezaron January 1 2013 22:36 UTCReady for review

The work on this task is ready to be reviewed.

Sean on January 1 2013 23:39 UTCTask Closed

Congratulations, this task has been completed successfully.