Given an a.out executable that only does raise (SIGABRT), invoking that one...

  • ... against crash-dump-core will...

    • ... not overwrite existing core files.

      Is this reasonable? Linux does overwrite them, for example.

    • ... show big variances in running-time behavior:

      $ TIMEFORMAT='real %R user %U system %S'
      $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
      Aborted (core dumped)
      real 1.350 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 21:59 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
      Aborted (core dumped)
      real 22.771 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 21:59 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
      Aborted (core dumped)
      real 1.367 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:00 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
      Aborted (core dumped)
      real 5.789 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:00 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-dump-core ./a.out; ls -l core
      Aborted (core dumped)
      real 22.664 user 0.010 system 0.000
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:01 core
      
    • ... produce a huge core file:

      $ du -hs core 
      17M     core
      

      On Linux, the core file occupies 76 KiB of disk space, which seems much more reasonable. This is possibly related to the default 128 MiB heap preallocation.

    • ... not always produce a useful backtrace:

      abort();

      $ gdb test core
      warning: core file may not match specified executable file.
      [New Thread 86678]
      warning: Wrong size fpregset in core file.
      ...
      Core was generated by `./test'.
      Program terminated with signal 6, Aborted.
      warning: Wrong size fpregset in core file.
      (gdb) bt
      #0  0x00000000 in ?? ()
      #1  0x011f593f in __msg_sig_post (process=72, signal=6, sigcode=0, refport=1)
          at /build/buildd-eglibc_2.10.2-7-hurd-i386-iGL6op/eglibc-2.10.2/build-tree/hurd-i386-libc/hurd/RPC_msg_sig_post.c:144
      #2  0x0109a433 in kill_port (pid=<value optimized out>)
          at ../sysdeps/mach/hurd/kill.c:68
      #3  kill_pid (pid=<value optimized out>) at ../sysdeps/mach/hurd/kill.c:105
      #4  0x0109a69f in __kill (pid=21142, sig=6) at ../sysdeps/mach/hurd/kill.c:139
      #5  0x01099af6 in raise (sig=6) at ../sysdeps/posix/raise.c:27
      #6  0x0109de59 in abort () at abort.c:88
      #7  0x0804849f in main ()
      

      char *foo = 0; *foo = 1;

      $ gdb test core
      Program terminated with signal 11, Segmentation fault.
      warning: Wrong size fpregset in core file.
      #0  0x00000000 in ?? ()
      (gdb) bt
      #0  0x00000000 in ?? ()
      #1  0x0108565b in __libc_start_main (main=0x8048464 <main>, argc=1, ubp_av=0x1023e64, 
          init=0x8048490 <__libc_csu_init>, fini=0x8048480 <__libc_csu_fini>, rtld_fini=0xea20 <_dl_fini>, 
          stack_end=0x1023e5c) at libc-start.c:251
      #2  0x080483d1 in _start ()
      

      raise (SIGABRT);

      $ gdb a.out core
      warning: core file may not match specified executable file.
      [New Thread 76651]
      
      
      warning: Wrong size fpregset in core file.
      Reading symbols from /lib/libc.so.0.3...[...]
      Core was generated by `./a.out'.
      Program terminated with signal 6, Aborted.
      
      
      warning: Wrong size fpregset in core file.
      #0  0x00000000 in ?? ()
      (gdb) bt
      #0  0x00000000 in ?? ()
      Cannot access memory at address 0x17
      

      Probably GDB fails to unwind the stack properly here.

  • ... against crash-suspend will...

    • ... not work at all:

      $ CRASHSERVER=/servers/crash-suspend ./a.out
      $ [returns to the shell; the process is not suspended]
      
    • ... show big variances in running-time behavior:

      $ TIMEFORMAT='real %R user %U system %S'
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 1.381 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:04 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 1.332 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:04 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 21.228 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:04 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 1.323 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:05 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 22.279 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:05 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 1.362 user 0.000 system 0.000
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:08 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 21.110 user 0.000 system 0.000
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:08 core
      $ rm -f core; time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted (core dumped)
      real 1.350 user 0.000 system 0.020
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:08 core
      
    • ... reliably crash GNU Mach:

      This happens if a core file is already present (and won't get overwritten; see above). I reproduced this three times.

      $ TIMEFORMAT='real %R user %U system %S'
      $ time env CRASHSERVER=/servers/crash-suspend ./a.out; ls -l core
      Aborted
      real 2.856 user 0.000 system 0.010
      -rw------- 1 tschwinge tschwinge 17031168 Jul  7 22:08 core
      
      
      panic: zalloc: zone kalloc.8192 exhausted
      Kernel Breakpoint trap, eip 0x20020a77
      Stopped at  0x20020a76: int     $3
      db> trace
      0x20020a76(2006aba8,4d0f7e9c,200209b0,0,0)
      0x20020a4d(2006b094,2006ae40,2000,20016803,4a5f4114)
      0x2002bca5(49a03564,1,0,9,1000)
      0x20022f4c(2000,4a5f45d4,4a84879c,49a46564,4ac43e78)
      0x20021e65(4ac43e78,4a5f45d4,4a5f4114,0,0)
      0x2005309d(2106ba9c,3,38,28,1783)
      Bad frame pointer: 0x2106ba78
      
      
      $ addr2line -i -f -e /boot/gnumach-xen 0x20020a76 0x20020a4d 0x2002bca5 0x20022f4c 0x20021e65 0x2005309d
      Debugger
      /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:105
      panic
      /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/debug.c:148
      zalloc
      /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/zalloc.c:470
      kalloc
      /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/kalloc.c:185
      ipc_kobject_server
      /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/kern/ipc_kobject.c:76
      mach_msg_trap
      /home/tschwinge/tmp/gnumach/gnumach-1-branch-Xen-branch.build/../gnumach-1-branch-Xen-branch/ipc/mach_msg.c:1367
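
The timing runs above were repeated by hand. A small helper along the following lines could collect more samples and report the spread. This is only a sketch, assuming a POSIX shell, awk, and GNU date (for %N); the function names now_ns, time_runs and summarize are made up for this example.

```shell
#!/bin/sh
# Run a command several times and report the spread of wall-clock times.
# Assumptions: GNU date (for %N, nanoseconds) and awk are available.

# now_ns: current time in nanoseconds since the epoch.
now_ns() {
    date +%s%N
}

# time_runs N CMD...: run CMD N times and print one elapsed time
# (in seconds) per line.  The command's output and exit status are
# ignored, because a crashing program is the expected case.
time_runs() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        # An existing core file is not overwritten, so remove it first.
        rm -f core
        start=$(now_ns)
        "$@" >/dev/null 2>&1 || true
        end=$(now_ns)
        elapsed_ns=$((end - start))
        awk -v ns="$elapsed_ns" 'BEGIN { printf "%.3f\n", ns / 1e9 }'
        i=$((i + 1))
    done
}

# summarize: read one number per line; print count, minimum and maximum.
summarize() {
    awk '{ v = $1 + 0
           if (NR == 1 || v < min) min = v
           if (NR == 1 || v > max) max = v }
         END { if (NR > 0) printf "runs %d min %.3f max %.3f\n", NR, min, max }'
}

# Example (not run here):
#   time_runs 10 env CRASHSERVER=/servers/crash-dump-core ./a.out | summarize
```

Passing the command as arguments keeps the helper independent of the crash server in use, so the same loop can serve for crash-dump-core and crash-suspend.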
      

Anyone working in this area may also want to have a look at GDB's gcore, and to port http://code.google.com/p/google-coredumper/.
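
When comparing core files written by different dumpers or on different systems, it may help to look at both the apparent file size (what ls -l shows) and the space actually allocated (what du shows); a large gap between the two would indicate a sparse file. The following is a minimal sketch, assuming a POSIX shell with wc, du and awk. The function names are hypothetical.

```shell
#!/bin/sh
# Compare a file's apparent size with the disk space allocated for it.

# apparent_bytes FILE: length of the file in bytes, as ls -l reports it.
apparent_bytes() {
    echo $(( $(wc -c < "$1") ))
}

# allocated_kib FILE: disk space in use, in KiB (du -k is POSIX).
allocated_kib() {
    du -k "$1" | awk '{ print $1; exit }'
}

# size_report FILE: print both numbers on one line.
size_report() {
    printf '%s: apparent %d bytes, allocated %d KiB\n' \
        "$1" "$(apparent_bytes "$1")" "$(allocated_kib "$1")"
}

# Example (not run here):
#   size_report core
```

For the core file shown above, both numbers are close to 17 MB, so the size is not an artifact of how the file is measured.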