IRC, freenode, #hurd, 2011-07-20

<braunr> could we add gnumach forward map entry merging as an open issue ?
<braunr> probably hurting anything using bash extensively, like build most
  build systems
<braunr> mcsim: this map entry merging problem might interest you
<braunr> tschwinge: see vm/vm_map.c, line ~905
<braunr> "See whether we can avoid creating a new entry (and object) by
  extending one of our neighbors.  [So far, we only attempt to extend from
<braunr> and also vm_object_coalesce
<braunr> "NOTE:   Only works at the moment if the second object is NULL -
  if it's not, which object do we lock first?"
<braunr> although map entry merging should be enough
<braunr> this seems to be the cause for bash having between 400 and 1000+
  map entries
<braunr> thi makes allocations and faults slow, and forks even more
<braunr> but again, this should be checked before attempting anything
<braunr> (for example, this comment still exists in freebsd, although they
  solved the problem, so who knows)
<antrik> braunr: what exactly would you want to check?
<antrik> braunr: this rather sounds like something you would just have to
<braunr> antrik: that map merging is actually incomplete
<braunr> and that entries can actually be merged
<antrik> hm, I see...
<braunr> (i.e. they are adjacent and have compatible properties
<braunr> )
<braunr> antrik: i just want to avoid the "hey, splay trees mak fork slow,
  let's work on it for a month to see it wasn't the problem"
<antrik> so basically you need a dump of a task's map to check whether
  there are indeed entries that could/should be merged?
<antrik> hehe :-)
<braunr> well, vminfo should give that easily, i just didn't take the time
  to check it
<jkoenig> braunr, as you pointed out, "vminfo $$" seems to indicate that
  merging _is_ incomplete.
<braunr> this could actually have a noticeable impact on package builds
<braunr> hm
<braunr> the number of entries for instances of bash running scripts don't
  exceed 50-55 :/
<braunr> the issue seems to affect only certain instances (login shells,
  and su -)
<braunr> jkoenig: i guess dash is just much lighter than bash in many ways
<jkoenig> braunr, the number seems to increase with usage (100 here for a
  newly started interactive shell, vs. 150 in an old one)
<braunr> yes, merging is far from complete in the vm_map code
<braunr> it only handles null objects (private zeroed memory), and only
  tries to extend a previous entry (this isn't even a true merge)
<braunr> this works well for the kernel however, which is why there are so
  few as 25 entries
<braunr> but any request, be it e.g. mmap(), or mprotect(), can easily
  split entries
<braunr> making their number larger
<jkoenig> my ext2fs has ~6500 entries, but I guess this is related to
  mapping blocks from the filesystem, right?
<braunr> i think so
<braunr> hm not sure actually
<braunr> i'd say it's fragmentation due to copy on writes when client have
  mapped memory from it
<braunr> there aren't that many file mappings though :(
<braunr> jkoenig: this might just be the same problem as in bash
<braunr>  0x1308000[0x3000] (prot=RW, max_prot=RWX, mem_obj=584)
<braunr>  0x130b000[0x6000] (prot=RW, max_prot=RWX, mem_obj=585)
<braunr>  0x1311000[0x3000] (prot=RX, max_prot=RWX, mem_obj=586)
<braunr>  0x1314000[0x1000] (prot=RW, max_prot=RWX, mem_obj=586)
<braunr>  0x1315000[0x2000] (prot=RX, max_prot=RWX, mem_obj=587)
<braunr> the first two could be merged but not the others
<jkoenig> theoritically, those may correspond to memory objects backed by
  different portions of the disk, right?
<braunr> jkoenig: this looks very much like the same issue (many private
  mappings not merged)
<braunr> jkoenig: i'm not sure
<braunr> jkoenig: normally there is an offset when the object is valid
<braunr> but vminfo may simply not display it if 0
* jkoenig goes read about memory object
<braunr> ok, vminfo can't actually tell if the object is anonymous or
  file-backed memory
<jkoenig> (I'm perplexed about how the kernel can merge two memory objects
  if disctinct port names exist in the tasks' name space -- that's what
  mem_obj is, right?)
<braunr> i don't see why
<braunr> jkoenig: can you be more specific ?
<jkoenig> braunr, if, say, 584 and 585 above are port names which the task
  expects to be able to access and do stuff with, what will happen to them
  when the memory objects are merged?
<braunr> good question
<braunr> but hm
<braunr> no it's not really a problem
<braunr> memory objects aren't directly handled by the vm system
<braunr> vm_object and memory_object are different things
<braunr> vm_objects can be split and merged
<braunr> and shadow objects form chains ending on a final vm_object
<braunr> which references a memory object
<braunr> hm
<braunr> jkoenig: ok no solution, they can't be merged :)
<jkoenig> braunr, I'm confused :-)
<braunr> jkoenig: but at least, if two vm_objects are created but reference
  the same externel memory object, the vm should be able to merge them back
<braunr> external*
<braunr> are created as a result of a split
<braunr> say, you map a memory object, mprotect part of it (=split), then
  mprotect the reste of it (=merge), it should work
<braunr> jkoenig: does that clarify things a bit ?
<jkoenig> ok so if I get it right, the entries shown by vmstat are the
  vm_object, and the mem_obj listed is a send right to the memory object
  they're referencing ?
<braunr> yes
<braunr> i'm not sure about the type of the integer showed (port name or
  simply an index)
<braunr> jkoenig: another possibility explaining the high number of entries
  is how anonymous memory is implemented
<braunr> if every vm_allocate request implies the creation of a memory
  object from the default pager
<braunr> the vm has no way to merge them
<jkoenig> and a vm_object is not a capability, but just an internal kernel
  structure used to record the composition of the address space
<braunr> jkoenig: not exactly the address space, but close enough
<braunr> jkoenig: it's a container used to know what's in physical memory
  and what isn't
<jkoenig> braunr, ok I think I'm starting to get it, thanks.
<braunr> glad i could help
<braunr> i wonder when vm_map_enter() gets null objects though :/
<braunr> "If this port is MEMORY_OBJECT_NULL, then zero-filled memory is
  allocated instead"
<braunr> which means vm_allocate()
<jkoenig> braunr, when the task uses vm_allocate(), or maybe vm_map_enter()
  with MEMORY_OBJECT_NULL, there's an opportunity to extend an existing
  object though, is that what you referred to earlier ?
<braunr> jkoenig: yes, and that's what is done
<jkoenig> but how does that play out with the default pager? (I'm thinking
  aloud, as always feel free to ignore ;-)
<braunr> the default pager backs vm_objects providing zero filled memory
<braunr> hm, guess it wasn't your question
<braunr> well, swap isn't like a file, pages can be placed dynamically,
  which is why the offset is always 0 for this type of memory
<jkoenig> hmm I see, apparently a memory object does not have a size
<braunr> are you sure ?
<jkoenig> from what I can gather from,
  but I looked very quickly
<braunr> vm_objects have a size
<braunr> and each map entry recors the offset within the object where the
  mapping begins
<braunr> offset and sizes are used by the kernel when querying the memory
  object pager
<braunr> see memory_object_data_request for example
<jkoenig> right.
<braunr> but the default pager has another interface
<braunr> jkoenig: after some simple tests, i haven't seen a simple case
  where forward merging could be applied :(
<braunr> which means it's a lot harder than it first looked
<braunr> hm
<braunr> actually, there seems to be cases where this can be done
<braunr> all of them occurring after a first merge was done
<braunr> (which means a mapping request perfectly fits between two map

IRC, freenode, #hurd, 2011-07-21

<braunr> tschwinge: you may remove the forward map entry merging issue :/
<pinotree> what did you discover?
<braunr> tschwinge: it's actually much more complicated than i thought, and
  needs major changes in the vm, and about the way anonymous memory is
<braunr> from what i could see, part of the problem still exists in freebsd
<braunr> for the same reasons (shadow objects being one of them)

mach shadow objects.

GCC build time using bash vs. dash


  • Analyze.

  • Measure.

  • Fix.

  • Measure again.

  • Have Samuel measure on the buildd.