Thursday, December 17, 2009

Relocations for the lose

For those not familiar with relocations, they are basically values that need to be fixed up later, once the real value is known. In an object file, for example, there are relocations for the addresses of global variables and functions that have not been linked in yet. Building an ELF file also involves a number of relocations in the core file structure itself, such as the positions of header information in the file. Since you can't fill in this information until the entire file is laid out, you must essentially do relocations on the file to patch up the addresses once they are known.
There are two common models for solving this issue: the fixup approach and the two-pass approach. In the fixup approach, you record all of the locations needing fixups and calculate them after placing all of the objects. In the two-pass approach, you iterate a second time over all of the data and have each piece fill in its missing data now that everything has been placed.
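To make the fixup approach concrete, here is a minimal sketch in Python. It is not the code from my ELF writer: the FixupWriter class, its method names, and the 4-byte little-endian offset field are all made up for illustration. The writer leaves a placeholder wherever an offset is needed, remembers where the placeholder lives, and patches it once every label has been placed.

```python
import struct


class FixupWriter:
    """Hypothetical writer illustrating the fixup approach."""

    def __init__(self):
        self.buf = bytearray()
        self.fixups = []  # (position of placeholder in buf, label name)
        self.labels = {}  # label name -> offset in buf

    def emit(self, data):
        # Append raw bytes to the output.
        self.buf += data

    def emit_offset_of(self, label):
        # Reserve 4 bytes now and record that they must be patched later.
        self.fixups.append((len(self.buf), label))
        self.buf += b"\x00" * 4

    def define(self, label):
        # The label refers to whatever is emitted next.
        self.labels[label] = len(self.buf)

    def resolve(self):
        # Everything is placed, so every recorded location can be patched.
        for pos, label in self.fixups:
            struct.pack_into("<I", self.buf, pos, self.labels[label])
        return bytes(self.buf)


w = FixupWriter()
w.emit(b"HDR!")
w.emit_offset_of("payload")  # offset not known yet
w.define("payload")
w.emit(b"data")
image = w.resolve()
```

Note that resolve() can only run once every label it needs has been defined, so anything that wants its value earlier has to wait.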
I decided somewhat arbitrarily to try the first approach in constructing the ELF object output binaries. For each file metadata location needing fixups, there is a powerful (i.e. wordy and slow) interface for fixing up various locations in the file. I made some bad naming decisions, among other things, that caused some issues. The main killer was that construction order became an issue: a relocation couldn't be issued until the data needed to compute its value had been filled in (for example, a section that had not been assigned yet), which complicated the workings.
On the code cleanup list is replacing this old style with the alternative I was considering, the two-pass style. This would first create all of the objects and link the data structures together to make coherent objects. Then an updateValues() or similar would be called to have each object update all of the offsets and such that it needs in order to be written to a file. Finally, one would iterate over all of the parts of the ELF file and write it out section by section.
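Roughly, the two-pass style could look like the following Python sketch. Nothing here is the real implementation: the Section and Header classes, the 8-byte (offset, size) header entries, and the build() helper are assumptions for illustration, and update_values() stands in for the updateValues() mentioned above. Objects are created and linked first with their offsets unknown, a layout pass assigns the offsets, and a write pass serializes each part.

```python
import struct


class Section:
    """Hypothetical section: a name plus raw contents."""

    def __init__(self, name, data):
        self.name = name
        self.data = data
        self.offset = None  # unknown until update_values() runs

    def size(self):
        return len(self.data)

    def update_values(self, offset):
        self.offset = offset

    def write(self):
        return self.data


class Header:
    """Hypothetical header: one (offset, size) pair per section,
    each a 32-bit little-endian integer."""

    def __init__(self, sections):
        self.sections = sections  # references only; offsets come later
        self.offset = None

    def size(self):
        return 8 * len(self.sections)

    def update_values(self, offset):
        self.offset = offset

    def write(self):
        # Reads section offsets at write time, after layout is complete.
        return b"".join(
            struct.pack("<II", s.offset, s.size()) for s in self.sections
        )


def build(parts):
    # Pass 1: lay everything out so each part learns its offset.
    offset = 0
    for part in parts:
        part.update_values(offset)
        offset += part.size()
    # Pass 2: write the parts out in file order.
    return b"".join(part.write() for part in parts)


text = Section(".text", b"\x90\x90\x90")
data = Section(".data", b"hello")
header = Header([text, data])
image = build([header, text, data])
```

The header can be constructed before either section has an offset, since it only reads those offsets during the write pass.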
All in all, I'm sure the two-pass approach will bring up some issues of its own, and a hybrid approach may end up being used. Still, it will hopefully solve some of the headaches of the current implementation, reduce code, and speed up further improvements.