Here's what's to be done for maintaining Boehm GC.
This one does need Hurd-specific configuration.
It is, for example, used by GCC (which has its own fork), so any changes committed upstream should very likely also be made there.
General information
Configuration
Last reviewed up to the 5f492b98dd131bdd6c67eb56c31024420c1e7dab (2012-06-08)
sources, and for libatomic_ops to the
6a0afde033f105c6320f1409162e3765a1395bfd (2012-05-15) sources.
configure.ac
    PARALLEL_MARK is not enabled; doesn't make sense so far.
    *-*-kfreebsd*-gnu defines USE_COMPILER_TLS. What's this, and why does no other configuration define it?
    TODO

        [ if test "$enable_gc_debug" = "yes"; then
            AC_MSG_WARN("Should define GC_DEBUG and use debug alloc. in clients.")
            AC_DEFINE([KEEP_BACK_PTRS], 1,
                      [Define to save back-pointers in debugging headers.])
            keep_back_ptrs=true
            AC_DEFINE([DBG_HDRS_ALL], 1,
                      [Define to force debug headers on all objects.])
            case $host in
              x86-*-linux* | i586-*-linux* | i686-*-linux* | x86_64-*-linux* )
                AC_DEFINE(MAKE_BACK_GRAPH)
                AC_MSG_WARN("Client must not use -fomit-frame-pointer.")
                AC_DEFINE(SAVE_CALL_COUNT, 8)
              ;;
        AM_CONDITIONAL([KEEP_BACK_PTRS], [test x"$keep_back_ptrs" = xtrue])
configure.host
    Nothing.
Makefile.am, include/include.am, cord/cord.am, doc/doc.am, tests/tests.am
    Nothing.
include/gc_config_macros.h
    Should be OK.
include/private/gcconfig.h
    Hairy. But should be OK. Search for HURD, compare to LINUX, I386 case.
    See doc/porting.html and doc/README.macros (and others) for documentation.
    LINUX has:
        #define LINUX_STACKBOTTOM
            Defined instead of STACKBOTTOM to have the value read from /proc/.
        #define HEAP_START (ptr_t)0x1000
            May want to define it for us, too?
        #ifdef USE_I686_PREFETCH, USE_3DNOW_PREFETCH --- [...]
            Apparently these are optimizations that we could also use. Have a look at LINUX for X86_64, which uses __builtin_prefetch (which Linux x86 could use, too?).
        TODO

            #if defined(LINUX) && defined(USE_MMAP)
                /* The kernel may do a somewhat better job merging mappings etc. */
                /* with anonymous mappings. */
            # define USE_MMAP_ANON
            #endif

        TODO

            #if defined(GC_LINUX_THREADS) && defined(REDIRECT_MALLOC)
                /* Nptl allocates thread stacks with mmap, which is fine. But it */
                /* keeps a cache of thread stacks. Thread stacks contain the */
                /* thread control blocks. These in turn contain a pointer to */
                /* (sizeof (void *) from the beginning of) the dtv for thread-local */
                /* storage, which is calloc allocated. If we don't scan the cached */
                /* thread stacks, we appear to lose the dtv. This tends to */
                /* result in something that looks like a bogus dtv count, which */
                /* tends to result in a memset call on a block that is way too */
                /* large. Sometimes we're lucky and the process just dies ... */
                /* There seems to be a similar issue with some other memory */
                /* allocated by the dynamic loader. */
                /* This should be avoidable by either: */
                /* - Defining USE_PROC_FOR_LIBRARIES here. */
                /*   That performs very poorly, precisely because we end up */
                /*   scanning cached stacks. */
                /* - Have calloc look at its callers. */
                /*   In spite of the fact that it is gross and disgusting. */
                /* In fact neither seems to suffice, probably in part because */
                /* even with USE_PROC_FOR_LIBRARIES, we don't scan parts of stack */
                /* segments that appear to be out of bounds. Thus we actually */
                /* do both, which seems to yield the best results. */
            # define USE_PROC_FOR_LIBRARIES
            #endif

        TODO

            # if defined(GC_LINUX_THREADS) && defined(REDIRECT_MALLOC) \
                 && !defined(INCLUDE_LINUX_THREAD_DESCR)
                /* Will not work, since libc and the dynamic loader use thread */
                /* locals, sometimes as the only reference. */
            # define INCLUDE_LINUX_THREAD_DESCR
            # endif

        TODO

            # if defined(UNIX_LIKE) && defined(THREADS) && !defined(NO_CANCEL_SAFE) \
                 && !defined(PLATFORM_ANDROID)
                /* Make the code cancellation-safe. This basically means that we */
                /* ensure that cancellation requests are ignored while we are in */
                /* the collector. This applies only to Posix deferred cancellation;*/
                /* we don't handle Posix asynchronous cancellation. */
                /* Note that this only works if pthread_setcancelstate is */
                /* async-signal-safe, at least in the absence of asynchronous */
                /* cancellation. This appears to be true for the glibc version, */
                /* though it is not documented. Without that assumption, there */
                /* seems to be no way to safely wait in a signal handler, which */
                /* we need to do for thread suspension. */
                /* Also note that little other code appears to be cancellation-safe.*/
                /* Hence it may make sense to turn this off for performance. */
            # define CANCEL_SAFE
            # endif

        CAN_SAVE_CALL_ARGS vs. -fomit-frame-pointer now being on by default for Linux x86 IIRC? (Which is an open issue gcc for not including us.)
        TODO

            # if defined(REDIRECT_MALLOC) && defined(THREADS) && !defined(LINUX)
            # error "REDIRECT_MALLOC with THREADS works at most on Linux."
            # endif
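The USE_MMAP_ANON excerpt in the LINUX list above is about how the collector gets heap pages from mmap. As a rough illustration only (this is not the collector's code), the following C sketch shows the two usual ways of obtaining zero-filled pages: an anonymous mapping where the flag is available, and a private mapping of /dev/zero otherwise. The helper name sketch_get_pages is made up for this example.

```c
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/* Some systems spell the flag MAP_ANON, others MAP_ANONYMOUS. */
#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
# define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper (not part of Boehm GC): return `bytes` of */
/* zero-filled, read/write memory, or NULL on failure.           */
static void *sketch_get_pages(size_t bytes)
{
    void *p;
#ifdef MAP_ANONYMOUS
    /* Anonymous mapping: no file descriptor is needed. */
    p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
#else
    /* Fallback: privately map /dev/zero. */
    int fd = open("/dev/zero", O_RDONLY);
    if (fd < 0)
        return NULL;
    p = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
#endif
    return p == MAP_FAILED ? NULL : p;
}
```

Whether the anonymous variant behaves better on GNU/Hurd than the /dev/zero one is exactly what the TODO above leaves open; the sketch only shows what the choice looks like at the call site.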
    HURD has:
        #define STACK_GROWS_DOWN
        #define HEURISTIC2
            Defined instead of STACKBOTTOM to have the value probed.
            Linux also has this:

                #if defined(LINUX_STACKBOTTOM) && defined(NO_PROC_STAT) \
                    && !defined(USE_LIBC_PRIVATES)
                    /* This combination will fail, since we have no way to get */
                    /* the stack base. Use HEURISTIC2 instead. */
                # undef LINUX_STACKBOTTOM
                # define HEURISTIC2
                    /* This may still fail on some architectures like IA64. */
                    /* We tried ... */
                #endif

            Being on glibc, we could perhaps do something similar to USE_LIBC_PRIVATES instead of HEURISTIC2. Pro: avoid SIGSEGV (and general fragility) during probing at startup (if I'm understanding this correctly). Con: rely on glibc internals. Or we instead add support to parse /proc/ (can even use the same as Linux?), or use some other interface.
        #define SIG_SUSPEND SIGUSR1, #define SIG_THR_RESTART SIGUSR2
        We don't #define MPROTECT_VDB (WIP comment); but neither does Linux.
        Where does our GETPAGESIZE come from? Should we #include <unistd.h> like it is done for LINUX?
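Regarding the SIG_SUSPEND / SIG_THR_RESTART definitions in the HURD list above: the IRC logs at the end of this page describe how using SIGUSR1/SIGUSR2 collides with applications that install their own handlers for those signals, and mention SIGRTMIN+5/6 as the choice made elsewhere. The following C sketch illustrates that selection logic under those assumptions. All SKETCH_* and sketch_* names are hypothetical, and this is not libgc's actual source.

```c
#include <signal.h>

#if defined(SIGRTMIN)
/* Realtime signals exist: pick two from that range (offsets as */
/* mentioned in the IRC discussion) and leave SIGUSR1/2 alone.  */
# define SKETCH_SIG_SUSPEND     (SIGRTMIN + 5)
# define SKETCH_SIG_THR_RESTART (SIGRTMIN + 6)
# define SKETCH_HAVE_RT_SIGNALS 1
#else
/* No realtime signals (the GNU/Hurd situation described here): */
/* fall back to the two user signals.                           */
# define SKETCH_SIG_SUSPEND     SIGUSR1
# define SKETCH_SIG_THR_RESTART SIGUSR2
# define SKETCH_HAVE_RT_SIGNALS 0
#endif

/* Returns nonzero if an application handler installed for `signo` */
/* would collide with the signals chosen above.                    */
static int sketch_conflicts_with_gc(int signo)
{
    return signo == SKETCH_SIG_SUSPEND || signo == SKETCH_SIG_THR_RESTART;
}
```

With the fallback branch, sketch_conflicts_with_gc(SIGUSR1) is nonzero, which is the ecl and mono situation from the logs; with realtime signals it is zero.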
include/gc_pthread_redirects.h
    TODO
    Cancellation stuff is Linux-only. In other places, too.
mach_dep.c
    #define NO_GETCONTEXT
        open issue glibc, but this is not a real problem here, because we can use the following GCC internal function without much overhead:
    GC_with_callee_saves_pushed
        The HAVE_BUILTIN_UNWIND_INIT case is ours.
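To make the HAVE_BUILTIN_UNWIND_INIT remark concrete, here is a small C sketch of the idea behind GC_with_callee_saves_pushed: force the callee-saved registers into the current stack frame with GCC's __builtin_unwind_init(), then call a function that receives an address near the top of the stack. It is a simplified illustration, not a copy of mach_dep.c (which has further cases); the sketch_* names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical callback type: gets an address near the current */
/* stack top plus a user-supplied argument.                     */
typedef void (*sketch_stack_fn)(void *approx_sp, void *arg);

/* Written to after the callback so the compiler has to keep the */
/* frame (and the spilled registers) alive across the call.      */
static void *volatile sketch_sink;

static void sketch_with_callee_saves_pushed(sketch_stack_fn fn, void *arg)
{
    volatile int dummy = 0;
#if defined(__GNUC__)
    /* GCC builtin: makes this function save all callee-saved */
    /* registers in its own stack frame.                      */
    __builtin_unwind_init();
#endif
    fn((void *)&dummy, arg);
    sketch_sink = (void *)&dummy;
}

/* Example callback: just record the approximate stack pointer. */
static void sketch_record_sp(void *approx_sp, void *arg)
{
    *(void **)arg = approx_sp;
}
```

A collector would scan the stack from approx_sp towards the stack bottom inside the callback; here the callback only records the address.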
os_dep.c
    read
        Sure that it doesn't internally (in glibc) use malloc. Probably only / mostly a problem for --enable-redirect-malloc configurations? Linux with threads uses readv.
    TODO.
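The os_dep.c note above mentions that Linux with threads uses readv. As a hedged illustration of what such a substitution can look like, the C sketch below reads into a single buffer through readv(2) and repeats until the buffer is full or EOF is reached. Whether readv avoids malloc inside glibc on GNU/Hurd is the open question and is not answered here; the sketch_* names are made up for this example.

```c
#include <stddef.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: fill one buffer via readv(2) rather than read(2). */
static ssize_t sketch_read_via_readv(int fd, void *buf, size_t nbytes)
{
    struct iovec iov;
    iov.iov_base = buf;
    iov.iov_len = nbytes;
    return readv(fd, &iov, 1);
}

/* Keep reading until `count` bytes have arrived, EOF is hit, or an */
/* error other than EINTR occurs. Returns the bytes read, or -1.    */
static ssize_t sketch_repeat_read(int fd, char *buf, size_t count)
{
    size_t done = 0;
    while (done < count) {
        ssize_t n = sketch_read_via_readv(fd, buf + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;          /* EOF */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```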
dyn_load.c
    For DYNAMIC_LOADING. TODO.
pthread_support.c, pthread_stop_world.c
    TODO.
TODO.
Other files also contain LINUX and other conditionals.
libatomic_ops/
    configure.ac
        Nothing.
    Makefile, src/Makefile, src/atomic_ops/Makefile, src/atomic_ops/sysdeps/Makefile, doc/Makefile, tests/Makefile
        Nothing.
    src/atomic_ops/sysdeps/gcc/x86.h
        Nothing.
b8b65e8a5c2c4896728cd00d008168a6293f55b1 configure.ac probably not all correct.
mmap, b64dd3bc1e5a23e677c96b478d55648a0730ab75
parallel mark, 07c2b8e455c9e70d1f173475bbf1196320812154, pass --disable-parallel-mark or enable for us, too?
HANDLE_FORK, e9b11b6655c45ad3ab3326707aa31567a767134b, 806d656802a1e3c2b55cd9e4530c6420340886c9, 1e882b98c2cf9479a9cd08a67439dab7f9622924
Check include/private/thread_local_alloc.h re USE_COMPILER_TLS/USE_PTHREAD_SPECIFIC.
Build
Here's a log of a Boehm GC build run; this is from the
5f492b98dd131bdd6c67eb56c31024420c1e7dab (2012-06-08) sources, and for
libatomic_ops from the 6a0afde033f105c6320f1409162e3765a1395bfd (2012-05-15)
sources, run on kepler.SCHWINGE and coulomb.SCHWINGE.
$ export LC_ALL=C
$ (cd ../master/ && ln -sfn ../libatomic_ops/master libatomic_ops)
$ (cd ../master/ && autoreconf -vfi)
$ ../master/configure --prefix="$PWD".install SHELL=/bin/bash CC=gcc-4.6 CXX=g++-4.6 --enable-cplusplus --enable-gc-debug --enable-gc-assertions --enable-assertions 2>&1 | tee log_build
[...]
$ make 2>&1 | tee log_build_
[...]
Different hosts may default to different shells and compiler versions; thus harmonized. Using bash instead of dash as otherwise libtool explodes.
This takes up around X MiB, and needs roughly X min on kepler.SCHWINGE and X min on coulomb.SCHWINGE.
Analysis
$ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_build* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_build
$ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_build* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_build
$ diff -wu <(sed -f toolchain/logs/boehm-gc/linux/log_build.sed < toolchain/logs/boehm-gc/linux/log_build) <(sed -f toolchain/logs/boehm-gc/hurd/log_build.sed < toolchain/logs/boehm-gc/hurd/log_build) > toolchain/logs/boehm-gc/log_build.diff
only GNU/Linux:
    configure: WARNING: "Explicit GC_INIT() calls may be required."
only GNU/Linux:
    configure: WARNING: "Client must not use -fomit-frame-pointer."
Install
$ make install 2>&1 | tee log_install
[...]
This takes up around X MiB, and needs roughly X min on kepler.SCHWINGE and X min on coulomb.SCHWINGE.
Analysis
$ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_install | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_install
$ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_install | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_install
$ diff -wu toolchain/logs/boehm-gc/linux/log_install toolchain/logs/boehm-gc/hurd/log_install > toolchain/logs/boehm-gc/log_install.diff
Testsuite
$ make -k check
[...]
$ (cd libatomic_ops/ && make -k check)
[...]
This needs roughly X min on kepler.SCHWINGE and X min on coulomb.SCHWINGE.
Analysis
$ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_check* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_check
$ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_check* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_check
$ diff -wu <(sed -f toolchain/logs/boehm-gc/linux/log_check.sed < toolchain/logs/boehm-gc/linux/log_check) <(sed -f toolchain/logs/boehm-gc/hurd/log_check.sed < toolchain/logs/boehm-gc/hurd/log_check) > toolchain/logs/boehm-gc/log_check.diff
There are different configurations possible, but in general, the testsuite results of GNU/Linux and GNU/Hurd look very similar.
GNU/Hurd is missing Call chain at allocation: [...] output.
    os_dep.c:GC_print_callers
TODO
What other applications could be used to test Boehm GC, especially in combination with libpthread and dynamic loading of shared libraries?
There are patches (apparently not committed) so that GCC itself can use it, too: http://gcc.gnu.org/wiki/Garbage_collection_tuning.
There has been some discussion about it on the GNU Guile mailing lists, and there are two Git branches (2010-12-15: last change 2009-09).
IRC, OFTC, #debian-hurd, 2012-02-05
<pinotree> youpi: i think i found out the possible cause of the ecl and
mono issuess
<pinotree> -s
<youpi> oh
<pinotree> basically, we don't have the realtime signals (so no
SIGRTMIN/SIGRTMAX defined), hence things use either SIGUSR1 or
SIGUSR2... which are used in libgc to resp. stop/resume threads when
"collecting"
<pinotree> i just patched ecl to use SIGINFO instead of SIGUSR1 (used when
no SIGRTMIN+2 is available), and it seems going on for a while
<youpi> uh, why would SIGINFO work better than SIGUSR1?
<pinotree> it was a test, i tried the first "not common" signal i saw
<pinotree> my test was, use any signal different than USR1/2
<youpi> ah, sorry, I hadn't understood
<youpi> you mean there's a conflict between ecl and mono using SIGUSR1, as
well as libgc?
<pinotree> yes
<pinotree> for example, in ecl sources see src/c/unixint.d,
install_process_interrupt_handler()
<youpi> SIGINFO seems a sane choice
<youpi> SIGPWR could have been a better choice if it was available :)
<pinotree> i would have chose an "unassigned" number, say SIGLOST (the
bigger one) + 10, but it would be greater than _NSIG (and thus discarded)
<youpi> not a good idea indeed
<pinotree> it seems that linux, beside the range for rt signals, has some
"free space"
<pinotree> i'll start now another ecl build, from scratch this time, with
s/SIGUSR1/SIGINFO/ (making sure ctags won't bother), and if it works i'll
update svante's bug
<pinotree> mmap(...PROT_NONE...) failed
<pinotree> hmm...
<pinotree> apparently enabling MMAP_ANON in mono's libgc copy was a good
step, let's see
IRC, OFTC, #debian-hurd, 2012-03-18
<pinotree> youpi: mono is afflicted by the SIGUSR1/2 conflict with libgc
<youpi> pinotree: didn't we have a solution for that?
<pinotree> well, it works just for one signal
<pinotree> the ideal solution would be having a range for RT signals, and
make libgc use RTMIN+5/6, like done on most of other OSes
<youpi> but we don't have RT signals, do we?
<pinotree> right :(
IRC, freenode, #hurd, 2012-03-21
<pinotree> civodul: given we have to realtime signals (so no range of
signals for them), libgc uses SIGUSR1/2 instead of using SIGRTMIN+5/6 for
its thread synchronization stuff
<pinotree> civodul: which means that if an application using libgc then
sets its own handlers for either of SIGUSR1/2, hell breaks
<civodul> pinotree: ok
<civodul> pinotree: is it a Debian-specific change, or included upstream?
<pinotree> libgc using SIGUSR1/2? upstream
<civodul> ok
