I'm working with an open source project right now that uses a __location__ macro to identify source files and lines for debugging.
If you happen to not be familiar with this concept... the __location__ macro is a #define that takes the standard C preprocessor macros __FILE__ (the name of the current source file) and __LINE__ (the current line number in that file) and uses some macro trickery to glue them into a single string literal: the file name (foobar.c) and the line number (255) jammed together with a colon (e.g., "foobar.c:255").
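For the curious, here is roughly what that trickery looks like. This is a minimal sketch, not the project's actual definition: the LOC_STRINGIFY helper names and the where_am_i() function are made up for illustration.

```c
/* Two-level stringification: the extra level forces __LINE__ to expand
 * to its number before '#' turns the argument into a string literal. */
#define LOC_STRINGIFY_(x) #x
#define LOC_STRINGIFY(x)  LOC_STRINGIFY_(x)

/* Adjacent string literals are concatenated at compile time, so this
 * becomes a single literal of the form "file:line". */
#define __location__ __FILE__ ":" LOC_STRINGIFY(__LINE__)

/* Hypothetical helper: returns the location string for this line. */
static const char *where_am_i(void)
{
    return __location__;
}
```

The key point is that everything happens at compile time, so each use of the macro bakes its own complete string literal into the binary.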
This was apparently done to simplify logging. You want to know where something went wrong? Well, then - just log the __location__ of your output, and now you know! Wonderful! Why futz around with two macros (__FILE__ and __LINE__) when you can just use __location__ instead?
Except... each and every use of __location__ now creates a unique string.
And the compiler, when it goes to optimize read-only data, can no longer consolidate all those references to __FILE__ into a single string in rodata, because you have helpfully replaced all (or almost all) of those references to __FILE__ with a reference to __location__. Now, instead of one instance of the string "foobar.c" in your binary, you have strings for:
foobar.c:32
foobar.c:97
foobar.c:135
foobar.c:176
foobar.c:193
foobar.c:255
... and so on, and so on. One string for each __location__.
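To make that concrete, here's a small sketch, again using a made-up definition of __location__ and hypothetical function names. Two call sites in the same file produce two different location strings, while the plain __FILE__ strings have identical contents, which is what gives the toolchain a chance to store them only once.

```c
#include <string.h>

#define LOC_STRINGIFY_(x) #x
#define LOC_STRINGIFY(x)  LOC_STRINGIFY_(x)
#define __location__ __FILE__ ":" LOC_STRINGIFY(__LINE__)

/* Two hypothetical call sites on different lines: each one gets its
 * own distinct "file:line" literal. */
static const char *first_site(void)  { return __location__; }
static const char *second_site(void) { return __location__; }

/* The same two call sites using plain __FILE__: the literals have the
 * same contents, so they are candidates for merging. */
static const char *first_file(void)  { return __FILE__; }
static const char *second_file(void) { return __FILE__; }

/* Returns 1 if the two strings have different contents, 0 otherwise. */
static int differ(const char *a, const char *b)
{
    return strcmp(a, b) != 0;
}
```

Whether identical literals actually get merged depends on the compiler and its settings, so the sketch only compares contents, not addresses.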
A quick calculation shows that, for this project, 10% of the size of the stripped binaries - over 15 megabytes of data - consists of these debugging strings.
Ouch.
My current pet project is to work on replacing the use of __location__ with the explicit use of the standard __FILE__ and __LINE__ macros, and (hopefully!) reclaim that space.
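Here's a rough sketch of the direction I have in mind. The names format_log() and LOG_HERE() are hypothetical, not anything from the project. The file name and line number are passed separately and only combined at run time, so the only string literal left at each call site is __FILE__ itself.

```c
#include <stdio.h>

/* Hypothetical logging helper: writes "file:line: message" into buf and
 * returns what snprintf returns (the length the full text would need). */
static int format_log(char *buf, size_t len,
                      const char *file, int line, const char *msg)
{
    return snprintf(buf, len, "%s:%d: %s", file, line, msg);
}

/* Call sites pass __FILE__ and __LINE__ separately. The line number is
 * a plain integer constant, not part of a string literal. */
#define LOG_HERE(buf, len, msg) \
    format_log((buf), (len), __FILE__, __LINE__, (msg))
```

The trade-off is a little formatting work at run time in exchange for (hopefully!) far fewer unique strings. How much space this really saves still depends on the toolchain merging the identical __FILE__ literals.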