__LOCATION__, __LOCATION__, __LOCATION__

I'm working with an open source project right now that uses a __location__ macro to identify source files and lines for debugging.

If you happen to not be familiar with this concept... the __location__ macro is a #define that takes the standard C preprocessor macros __FILE__ (the name of the current source file) and __LINE__ (the current line number in that file) and uses some C macro trickery to turn them into a single string literal, with the file name (foobar.c) and the line number (255) jammed together (e.g., "foobar.c:255").
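For the curious, the "macro trickery" usually looks something like the sketch below. This is my own minimal reconstruction, not the project's actual definition: the names LOC_STR_INNER, LOC_STR, MY_LOCATION, where_am_i and log_failure are made up for illustration. The two-level macro is needed because the # operator stringizes its argument as written, so __LINE__ has to be expanded to its number one level up first.

```c
#include <stdio.h>

/* '#' stringizes its argument without expanding it, so a second macro
 * layer is used to expand __LINE__ to its number before stringizing. */
#define LOC_STR_INNER(x) #x
#define LOC_STR(x) LOC_STR_INNER(x)

/* Hypothetical stand-in for the project's __location__ macro.
 * Adjacent string literals are concatenated by the compiler, so this
 * expands to one literal such as "foobar.c:255". */
#define MY_LOCATION __FILE__ ":" LOC_STR(__LINE__)

/* Returns the location literal for the line the return statement is on. */
static const char *where_am_i(void)
{
    return MY_LOCATION;
}

/* Typical use: tag a log message with the place it came from. */
static int log_failure(FILE *out)
{
    return fprintf(out, "something went wrong at %s\n", MY_LOCATION);
}
```

Each expansion of MY_LOCATION is a complete string literal baked in at compile time, which is exactly what makes it convenient and (as it turns out) expensive.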

This was apparently done to simplify logging.  You want to know where something went wrong? Well, then - just log the __location__ of your output, and now you know!  Wonderful! Why futz around with two macros (__FILE__ and __LINE__) when you can just use __location__ instead?

Except... each and every use of __location__ now creates a unique string.

And the compiler, when it goes to optimize read-only data, can no longer consolidate all those references to __FILE__ into a single string in rodata, because you have helpfully replaced all (or almost all) of those references to __FILE__ with references to __location__. Now, instead of one instance of the string "foobar.c" in your binary, you have strings for:
foobar.c:32
foobar.c:97
foobar.c:135
foobar.c:176
foobar.c:193
foobar.c:255 
... and so on, and so on.  One string for each __location__.
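To make that concrete, here is a small sketch with two call sites in the same file. The macro and function names are hypothetical, and the macro is my guess at the general shape of __location__, not the project's code. The two location strings have different contents, so no toolchain can fold them into one; the two __FILE__ strings have identical contents, so they are at least candidates for merging (whether that actually happens is up to the compiler and linker).

```c
#include <string.h>

#define LOC_STR_INNER(x) #x
#define LOC_STR(x) LOC_STR_INNER(x)
/* Hypothetical stand-in for __location__: "file:line" as one literal. */
#define MY_LOCATION __FILE__ ":" LOC_STR(__LINE__)

/* Two call sites on different lines of the same file. */
static const char *site_a_loc(void) { return MY_LOCATION; }
static const char *site_b_loc(void) { return MY_LOCATION; }

/* The same two call sites, using plain __FILE__ instead. */
static const char *site_a_file(void) { return __FILE__; }
static const char *site_b_file(void) { return __FILE__; }

/* 1 if the combined strings differ in content (they always will,
 * because the line numbers differ), so each needs its own storage. */
static int locations_differ(void)
{
    return strcmp(site_a_loc(), site_b_loc()) != 0;
}

/* 1 if the plain file-name strings have identical content, which is
 * the precondition for storing them only once. */
static int files_match(void)
{
    return strcmp(site_a_file(), site_b_file()) == 0;
}
```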

A quick calculation shows that, for this project, 10% of the size of the stripped binaries - over 15 megabytes of data - consists of these debugging strings.

Ouch.

My current pet project is to work on replacing the use of __location__ with the explicit use of the standard __FILE__ and __LINE__ macros, and (hopefully!) reclaim that space.
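Roughly, the replacement I have in mind looks like the sketch below. It is only an illustration under my own assumptions: log_at and LOG_HERE are hypothetical names, not the project's API, and the helper writes into a caller-supplied buffer just to keep the example self-contained. The point is that the file name is passed as a plain __FILE__ literal (identical at every call site in a file) and the line number as an integer constant, with the "file:line" text assembled only at run time, when something is actually logged.

```c
#include <stdio.h>

/* Hypothetical helper: formats "file:line: message" into buf.
 * Returns what snprintf returns (the length that was needed). */
static int log_at(char *buf, size_t n, const char *file, int line,
                  const char *msg)
{
    return snprintf(buf, n, "%s:%d: %s", file, line, msg);
}

/* Hypothetical wrapper: captures the call site with the two standard
 * macros instead of one pre-joined string per call site. */
#define LOG_HERE(buf, n, msg) \
    log_at((buf), (n), __FILE__, __LINE__, (msg))
```

A call such as LOG_HERE(buf, sizeof buf, "oops") produces the same text in the log as before; the difference is only in what ends up stored in the binary.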




