I'm trying to compile Ruby 1.9.3-p125 on AIX 5.3 using xlc_r. I want to use --enable-load-relative, but that depends on dladdr() to get the pathname of the Ruby shared library, and dladdr() isn't available on AIX. I found an AIX implementation of dladdr() on root.cern.ch that is based on a call to loadquery(L_GETINFO).
The loadquery(L_GETINFO) call returns a list of ld_info structures, one for each binary file that makes up the program. To implement dladdr(), I check whether the address of the function passed to dladdr() lies between ldinfo_textorg and ldinfo_textorg+ldinfo_textsize. But the address of the function isn't within this range for any of the structures returned. Maybe I'm not interpreting the ld_info structure correctly.
I've attached the code I'm testing. Any help would be much appreciated.
#include <sys/types.h>
#include <sys/ldr.h>
#include <stdlib.h>
#include <stdio.h>

typedef struct {
    char* dli_fname;
} Dl_info;

int
dladdr(void* s, Dl_info* i) {
    size_t bufSize = 40960;
    struct ld_info* ldi;
    char *buf;   /* char* so the pointer arithmetic below is standard C */
    int r;

    printf("sym at %lu\n", (ulong)s);
    buf = malloc(bufSize);
    if (!buf) {
        i->dli_fname = 0;
        return 0;
    }
    r = loadquery((int)L_GETINFO, buf, (int)bufSize);
    if (r == -1) {
        free(buf);
        i->dli_fname = 0;
        return 0;
    }
    /* Walk the ld_info records; ldinfo_next is the byte offset to the
       next record and is 0 on the last one. */
    do {
        ldi = (struct ld_info*)buf;
        printf("checking %s, text %lu - %lu\n", ldi->ldinfo_filename,
               (ulong)ldi->ldinfo_textorg,
               (ulong)((ulong)ldi->ldinfo_textorg + ldi->ldinfo_textsize));
        if (((char *)ldi->ldinfo_textorg <= (char *)s)
            && ((char *)s < (char *)ldi->ldinfo_textorg + ldi->ldinfo_textsize)) {
            /* dli_fname points into buf, so buf is not freed here */
            i->dli_fname = ldi->ldinfo_filename;
            return 1;
        }
        buf += ldi->ldinfo_next;
    } while (ldi->ldinfo_next);
    i->dli_fname = 0;
    return 0;
}

int
test_func() { return 1; }

int
main() {
    Dl_info dli;
    int rc = dladdr((void *)test_func, &dli);
    printf("rc = %d\n", rc);
    if (rc) {
        printf("dli.dli_fname = %s\n", dli.dli_fname);
    }
    return 0;
}
Update 30-Apr-2012: When I originally posted this question, I was also seeing loadquery(L_GETINFO) cause a segmentation violation when called from a 64-bit program. I can't live without 64-bit so this was a showstopper for me. This now looks like it's a problem with the installation of the compiler; other AIX machines can compile and run the code in 64-bit mode.
The stack back-trace looked like this:
.() at 0xf458
usl_getinfo_user(??, ??, ??, ??) at 0x9fffffff00096b8
uloadquery(??, ??, ??, ??, ??, ??, ??) at 0x9fffffff0009b40
loadquery(0x200000002, 0x1000b3f0, 0xa0000000a000, 0x0, 0x0, 0x0, 0x0, 0x0) at 0x900000000043874
dladdr(s = 0x0000000110000900, i = 0x0ffffffffffffa00), line 18 in "dladdr.c"
main(), line 11 in "test_dladdr.c"
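This comes down to function descriptors ("pointer glue") on AIX and PowerPC; see https://stackoverflow.com/a/1343437/89101. A function pointer there does not point at the code itself but at a small descriptor in the data segment whose first word holds the real entry address, so the value you pass to dladdr() never falls inside any module's text range. If you change the call to: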
int rc = dladdr((void*)*((ulong*)&test_func), &dli);
that will pass the actual code address of test_func, and you'll get the expected output: the address is found in the main program.
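To make the dereference above less magical, here is a minimal portable sketch of what it does. It models the descriptor as a three-word struct (entry address, TOC base, environment pointer); the names func_desc, entry_from_descriptor and in_text are made up for illustration and are not part of any AIX header. On a real AIX build you would pass the function pointer itself where the sketch passes the address of a hand-built descriptor.

```c
#include <stdint.h>

/* Hypothetical model of the function descriptor that a C function
   pointer refers to on AIX/PowerPC. Field names are illustrative only. */
struct func_desc {
    uintptr_t entry; /* address of the first instruction (text segment) */
    uintptr_t toc;   /* TOC base of the module */
    uintptr_t env;   /* environment pointer */
};

/* Given a pointer to a descriptor, return the code address stored in
   its first word. This is what *((ulong*)&test_func) reads. */
static uintptr_t entry_from_descriptor(const void *fp)
{
    return ((const struct func_desc *)fp)->entry;
}

/* Same test as the ldinfo_textorg / ldinfo_textsize check in dladdr(),
   written on integers so it cannot overflow past the end of the range. */
static int in_text(uintptr_t addr, uintptr_t textorg, uintptr_t textsize)
{
    return addr >= textorg && addr - textorg < textsize;
}
```

With made-up values, a descriptor whose entry field is 0x10000900 passes in_text() for a text range starting at 0x10000000 of size 0x2000, while an address such as 0x20000f00 (standing in for where the descriptor itself lives) does not. That is the mismatch the original test program ran into.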
I'm not sure why you get a crash with loadquery in 64-bit mode; it works on my machine. One thing that looks fishy is this frame:
loadquery(0x200000002, 0x1000b3f0, 0xa0000000a000, 0x0, 0x0, 0x0, 0x0, 0x0) at 0x900000000043874
The third argument, 0xa0000000a000, looks too large: 40960 in hex should be just 0xa000.
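For what it's worth, splitting that register value into 32-bit halves is easy to do by hand or with a couple of helpers. The sketch below only does the arithmetic; low32 and high32 are hypothetical helper names, and whether loadquery looks at just the low half of that argument is an assumption I can't confirm from the trace.

```c
#include <stdint.h>

/* Low 32 bits of a 64-bit register value: what a callee would see if it
   treated the argument as a 32-bit quantity (an assumption here). */
static uint32_t low32(uint64_t reg)
{
    return (uint32_t)(reg & 0xffffffffu);
}

/* High 32 bits of the same register value. */
static uint32_t high32(uint64_t reg)
{
    return (uint32_t)(reg >> 32);
}
```

Applied to 0xa0000000a000, both halves come out as 0xa000, which is 40960. The first argument in the same frame, 0x200000002, splits the same way into 2 and 2. So the intended value shows up in the low half with a copy of it in the high half, which may be a clue about how the arguments were passed rather than about the buffer size itself.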