Some applications don't themselves link against libpthread, but then load plugins which do link against libpthread. This means internally switching from single-threading to multi-threading, which is only supported since libc0.3 2.19-16~2. Previously, it would result in errors such as:
./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: __pthread_mutex_timedlock_internal: Assertion `__pthread_threads' failed.
This is now fixed as of libc0.3 2.19-16~2.
<pinotree> youpi: hm, thought about the pthread/stubs issue w/ dlopen'ed libraries <pinotree> currently looks like libstdc++ on hurd links to pthread-stubs, we're the only one with such configuration <pinotree> i was looking at the gcc 4.4 patch hurd-pthread.diff, could it be it does not set THREADLIBS in the configure.ac switch case? <youpi> that's expected <youpi> on linux the libc provides hooks itself, on hurd-i386 it's pthread-stubs <pinotree> why not explicitly link to pthread though? <youpi> because there is no strict need to, for applications that don't need libpthread <youpi> the dlopen case is a tricky case that pthread-stubs had not thought about <pinotree> hm <pinotree> what if the pthread stubs would be moved in our glibc? <youpi> that's what we should do yes <youpi> (ideally) <youpi> but for this we need to build libpthread along glibc, to get it really working
<youpi> and that's the tricky part (Makefile & such) which hasn't been done yet <pinotree> why both (stubs + actual libpthread)? <youpi> because you need the stubs to be able to call the actual libpthread <youpi> as soon libpthread gets dlopened for instance <youpi> +as <pinotree> i see <youpi> (remember that nptl does this if you want to see how) <youpi> (it's the libc files in nptl/) <youpi> (and forward.c) <guillem> also if libpthreads gets integrated with glibc don't we need to switch the hurd from cthreads then? Which has been the blocker all this time AFAIR? <youpi> we don't _need_ to <guillem> ok
<youpi> there's one known issue with pthreads <youpi> you can't dlopen() it
... if the main application is not already linked against it.
<youpi> which also means you can't dlopen() a module which depends on it if the main application hasn't used -lpthread already <youpi> (so as to get libpthread initialized early, not at the dlopen() call) <lucas> I get this while building simgrid: <lucas> cd /home/lucas/simgrid-3.6.1/obj-i486-gnu/examples/gras/console && /usr/bin/cmake -E create_symlink /home/lucas/simgrid-3.6.1/obj-i486-gnu/lib/libsimgrid.so /home/lucas/simgrid-3.6.1/obj-i486-gnu/examples/gras/console/simgrid.so <lucas> cd /home/lucas/simgrid-3.6.1/obj-i486-gnu/examples/gras/console && lua /home/lucas/simgrid-3.6.1/examples/gras/console/ping_generator.lua <lucas> lua: /home/buildd/build/chroot-sid/home/buildd/byhand/hurd/./libpthread/sysdeps/generic/pt-mutex-timedlock.c:68: __pthread_mutex_timedlock_internal: Assertion `__pthread_threads' failed. <lucas> Aborted (core dumped) <youpi> that's it, yes <youpi> (or at least it has the same symptoms) <lucas> it would need fixing in lua, not in SG, then, right? <youpi> yes <lucas> ok, thanks
The fix thus being: link the main application with -lpthread.
IRC, freenode, #hurd, 2011-08-17
< youpi> i.e. openjade apparently dlopen()s modules which use pthreads, but openjade itself is not liked against libpthread < youpi> which means unexpectedly loading pthreads on the fly, which is not implemented < youpi> (and hard to implement of course) < youpi> gnu_srs: so simply tell openjade people to link it with -lpthread < gnu_srs> Shuoldn't missing linking with pthread create an error when building openjade then? < youpi> no < youpi> because it's just a module which needs pthread < youpi> and that module _is_ linked with -lpthread < youpi> and dlopen() loads libpthreads too due to that < youpi> but that's unexpected, for the libpthread initialization stuff < youpi> (and too late to fix initlaization) < gnu_srs> How come that other OSes build opensp w/o problems? < youpi> because there are stubs in the libc < gnu_srs> Sorry for the delay: What hinders stubs to be present also in the Hurd libc parts too, to cope with this problem? < youpi> doing it < youpi> which is hard because you need libpthread bits inside the libc < youpi> making it simpler would need building libpthread at the same time as libc
<gnu_srs> iceweasel: ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: __pthread_mutex_timedlock_internal: Assertion `__pthread_threads' failed. <pinotree> LD_PRELOAD libpthread <gnu_srs> why <pinotree> missing link to pthread? <pinotree> and yes, it's known already, just nobody worked on solving it
<gnu_srs> braunr: Is this fixed by your recent patches? test_dbi: ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: <gnu_srs> __pthread_mutex_timedlock_internal: Assertion `__pthread_threads' failed. <youpi> faq/libpthread_dlopen.mdwn: ./pthread/../sysdeps/generic/pt-mutex-timedlock.c:70: __pthread_mutex_time <gnu_srs> youpi: tks. A workaround seems to be available: LD_PRELOAD=/lib/i386-gnu/libpthread.so.0.3 <gnu_srs> Is that possible on a buildd? <youpi> it would be simpler to just make the package explicitly link libpthread <gnu_srs> Package is libdbi-drivers, providing libdbd-sqlite3 needed by gnucash
<braunr> hm ok, looks like iceweasel errors all have something to do with the libc dns resolver <braunr> http://darnassus.sceen.net/~rbraun/iceweasel_crash <braunr> apparently, it's simply because the memory chunk isn't page aligned .. <braunr> looks like not preloading libpthread tirggers lots of tricky issues <braunr> anyway, apparently, the malloc/free calls in libresolv don't use locks if libpthread isn't preloaded, which explains why the program state looked impossible to reach and why crashes look random <congzhang> debian linux does not have the pthread load problem. <braunr> congzhang: it had it <braunr> maybe not debian but i've found one such report for opensuse <braunr> ok the bug is simple <braunr> for some reason, our glibc still uses a global _res state for dns resolution instead of per thread ones <braunr> uh, apparently, it's libpthread's job to define a __res_state function for that :(
<braunr> usually when i say it, it crashes soon after, so let's try it : <braunr> i've been running iceweasel 27 fine for like 10 minutes with a patched libpthread <braunr> still no crash ;p <braunr> with luck this extremely lightweight patch will fix all multithreaded applications doing concurrent name resolution .... :) <teythoon> nice :) <braunr> let's try gnash .... <braunr> uh, segfault on termination <braunr> gnash works :) <teythoon> sweet :) <braunr> i'm very surprised we could live so long with that resolv bug
<braunr> youpi: the eglibc bug is about libresolv <braunr> it uses a global resolver state even in multithreaded applications <youpi> libresolv is a horrible part of glibc :) <braunr> which is obviously bad <braunr> yes .. :) <braunr> here is the patch : http://darnassus.sceen.net/~rbraun/0001-libpthread-per-thread-resolver-states.patch <braunr> it's very short, it basically allocates a resolver state per thread in the pthread struct, and sets the TLS variable __resp when the thread starts <braunr> should we make that hurd-specific ? <braunr> or enclose that assignment with #ifdef ENABLE_TLS ? <youpi> well, ENABLE_TLS is now always 1, iirc :) <braunr> for the hurd, yes <youpi> I'm surprised linux never had the issue <youpi> no, not for the hurd <braunr> ah <youpi> I *had* to implement TLS for hurd because it was always 1 for everybody :) <braunr> ok <braunr> so all those ifdefs could be removed and libpthread can assume tls is enabled <braunr> in which case my patch looks fine <youpi> ah, thats a libpthread patch, not glibc patch <braunr> yes <braunr> nptl obviously did that from the start . :) <braunr> linuxthreads had the problem a looong time ago <youpi> ok <braunr> i'm surprised we overlooked it for so long <braunr> but anyway, that's a good fix <youpi> indeed <youpi> it seems all good to me <braunr> well, __resp is a __thread variable <braunr> i could add #ifdef ENABLE_TLS, but then what of the case where TLS isn't enabled, and do we actually care ? <braunr> #error maybe ? <braunr> or #warning ? <youpi> I don't think we care about the non-TLS case any more <braunr> ok <braunr> topgit branch i suppose ? <youpi> well, not, hurd libpthread repo :) <braunr> oh right ... :)
The same symptom appears in an odd case, for instance:
buildd@hurd:~$ ldd /usr/bin/openjade libthreads.so.0.3 => /lib/libthreads.so.0.3 (0x0103d000) libosp.so.5 => /usr/lib/libosp.so.5 (0x01044000) libpthread.so.0.3 => /lib/libpthread.so.0.3 (0x01221000) libnsl.so.1 => /lib/i386-gnu/libnsl.so.1 (0x01232000) [...]
openjade links against both libthreads and libpthread. The result is that libc early-initializes libthreads only, and thus libpthread is not early-initialized, and later on raises assertions. The solution is to just get rid of libthreads, to have only one threading library.