/!\ Obsolete /!\

This is no longer valid as a Google Summer of Code project.

The Hurd currently uses its ihash library (libihash) as a generic container for various objects. While it does its job, it has been reported to suffer from high collision rates. In addition, the "one size fits all" approach contributes to slow things down. One particular use case is looking up an object from a Mach port name, which basically translates to getting the file or socket associated with a file descriptor in traditional Unix systems. It's particular because there are actually two lookups for each object, the first being finding the Mach port from a client port name, which is done in the GNU Mach kernel, and the second being finding the server object from a server port name. The best strategy would probably be to directly associate the address of an object to the receive right of its port, eliminating the need to look up again, but this is quite an intrusive change in the code base. For the time being, optimizing lookups would already be an improvement.

The goal of this project is to increase system performance by speeding up object lookups, with a particular focus on name-to-object lookups. Note that there is little room for improvement in the kernel name-to-port lookups because of the various optimizations IPC has received in the past. Looking up server objects from port names could use an algorithm highly tuned for this task, perhaps with better locking (shared/exclusive instead of always mutually exclusive for example). Then, the libihash algorithm could be replaced with a better one, not necessarily a hash based one, to improve all the other users.

This task requires proper knowledge of data structure algorithms, taking into account machine properties such as processor caches, as well as the appropriate skills in C and assembly to check the generated code. Being able to perform accurate measurements in a system that lacks modern profiling tools would also be helpful.

Possible mentors: Richard Braun

IRC, freenode, #hurd, 2013-09-18

In context of id:"20130918081345.GA13789@dalaran.sceen.net".

<teythoon> braunr: (wrt the gnumach HACK) funny, I was thinking about doind
  the same for userspace servers, renaming ports to the address of the
  associated object, saving the need for the hash table...
<braunr> teythoon: see
<braunr> teythoon: my idea is to allow servers to set a label per port,
  obtained at mesage recv time
<braunr> because, yes, looking up an object twice is ridiculous
<braunr> you normally still want port names to be close to 0 because it
  allows some data structure optimizations
<teythoon> braunr: yes, I feared that ports should normally be smallish
  integers and contigious at best
<teythoon> braunr: interesting that you say there that libihash suffers
  from high collision rates
<teythoon> I've a theory to why that is, libihash doesn't do any hashing at
<pinotree> there are notes about that in the open_issues section of the
<teythoon> but I figured that this is probably ok for port names, as they
  are small and contigious
<neal> braunr: That's called protected payload.
<neal> braunr: The idea is that the kernel appends data to the message in

IRC, freenode, #hurd, 2013-10-24

<teythoon> and, with some effort, getting rid of the hash table lookup by
  letting the kernel provide the address of the object (iirc neil knew the
  proper term for that)
<braunr> teythoon: that is a big interface change
<teythoon> how so
<braunr> optimizing libihash and libpthread should already be a good start
<braunr> well how do you intend to add this information ?
<braunr> ok, "big" is overstatement, but still, it's a low level interface
  change that would probably break a lot of things
<teythoon> store a pointer in the port structure in gnumach, make that
  accessible somehow
<braunr> yes but how ?
<teythoon> interesting question indeed
<braunr> my plan for x15 is to make this "label" part of received messages
<braunr> which means you need to change the format of messages
<braunr> that is what i call a big change
<teythoon> ok, so we need to provide an update path
<teythoon> but once done, the change to hurd will be minimal, patching
  libports should cover most of that
<braunr> normally yes
<teythoon> so this amounts to messing with gnumach and mig and designing a
  clever way to make the update process safe

<braunr> libihash is known to show high collision rates
<teythoon> right, libihash
<teythoon> it could use an integer hash function on the keys to distribute
  them better
<braunr> i think that's already what it tries to do
<braunr> so merely using a better hash algorithm such as murmur should do
  the job
<braunr> or use another data structure altogether
<teythoon> no, it does no hashing of its own on the keys
<braunr> are you sure ?
<teythoon> well, it uses only prime numbers as sizes, and computes key %
<braunr> well that's hashing .. :)
<teythoon> but this is not really a good hash
<braunr> yes
<braunr> isn't that what i said ?
<teythoon> right
<teythoon> ok, I didn't get that ;)
<teythoon> also, the sizes start quite small, 3, 7, 19...
<teythoon> and each time the hash table is grown, all items will have to be
<braunr> which is why we could consider another data structure
<teythoon> or, for starters, to thin out that list of sizes
<braunr> my personal preference being radix trees
<teythoon> I assume you have an implementation handy?
<braunr> yes
<teythoon> cool :D
<braunr> but good hashing is excellent too
<braunr> radix trees have their own issues
<teythoon> braunr: http://burtleburtle.net/bob/hash/integer.html
<braunr> i use thomas wang's hashing function in x15
<braunr> or rather, my own personal c utility library, since x15 doesn't
  hash anything currently
<braunr> but murmur is better
<braunr> we prefer distribution over hashing performances
<braunr> https://131002.net/siphash/

IRC, freenode, #hurd, 2013-11-21

<teythoon> btw, about protected payloads in mach
<teythoon> I'm thinking about adding a flag to indicate that mach_msg
  should return the protected payload pointer instead of the local port
  field in the message header
<braunr> when would you set it ?
<braunr> i mean, how is it set ?
<teythoon> we don't really need the port name, right? and when we do, we
  look it up in the referenced data structure
<teythoon> using a new rpc perhaps
<braunr> ok
<teythoon> what do you think?
<braunr> a new rpc on ports themselves, like mach_port_mod_refs i assume ?
<braunr> i think it's a good solution
<teythoon> the field is a natural_t, as far as i can see, pointers should
  fit into it, right?
<teythoon> yes
<braunr> the big problem is backward compatibility
<teythoon> why?
<braunr> and your solution solves that
<teythoon> yes
<braunr> hum
<braunr> natural_t was originally intended to be as large as the machine
<braunr> but it may no longer stay true
<braunr> i think youpi decided to keep it an int and not a long in his
  x86_64 branch
<braunr> mach uses a trick for in-kernel port rights
<braunr> where the right is the port address
<teythoon> yes, I've seen that
<braunr> but i remember hearing about problems with this strategy in
<braunr> or maybe compat problems in mig interfaces
<braunr> i don't remember exactly
<braunr> so youpi looked at how macosx mach deals with the problem
<teythoon> well, but so far there is no 64 bit mach, so we do not need to
  worry about compatibility there, no ?
<braunr> and if i'm right, they forced the ports on 32-bits
<braunr> no you're right
<braunr> we can simply force the field to 64-bits, whatever it contains
<teythoon> or change the message format from the beginning to include both
  the name and the payload
<teythoon> then again, why bother
<braunr> ?
<braunr> have a 64-bits specific message format ?
<teythoon> well, it's worth discussing, no?
<braunr> maybe
<braunr> i personally don't like the idea
<teythoon> we could fix stuff
<braunr> forcing the field to 64-bits should be enough
<teythoon> right
<teythoon> do you think the idea is worth prototyping ?
<braunr> teythoon: yes
<teythoon> braunr: cool :)
<braunr> teythoon: definitely :p
<braunr> actually, doing that can remove a large part (if not all)
  contention from libports
<teythoon> indeed
<braunr> i still think we should work on libihash first


<braunr> converting libihash to murmur2/3 impacts more data structures
<braunr> it's also much easier
<teythoon> what exactly do you mean by that
<teythoon> ?
<braunr> libports uses libihash
<teythoon> yes
<braunr> but it's not the only user
<braunr> libihash is known to have high collision rates
<braunr> that should be fixed
<teythoon> right, but what do you mean by using murmur2/3
<braunr> that's a hashing algorithm name
<teythoon> using the integer finalizer used by murmur?
<braunr> hm
<braunr> i didn't dig the details
<braunr> and simply assumed it could be used for integer hashing as well
<teythoon> the way i see it, murmur can hash arbitrary ammounts
<braunr> if there are better integer hashing algorithms, let's just use
<teythoon> but that is not what we need
<braunr> yes
<teythoon> we have a fixed size integer
<braunr> but from what i remember, it's also very efficient for integer

IRC, freenode, #hurd, 2013-11-22

<teythoon> braunr: /test-pp: msgh_protected_payload is 0x12345678
<teythoon> :)
<braunr> :)
<teythoon> but currently I do another ipc_port_translate which is clearly
  not desireable
<teythoon> the msg handling in the kernel is... involved...
<teythoon> here is the thing... there are two (kernel) threads involved,
  the sender and the receiver
<teythoon> for the sender, kmsg->ikm_header.msgh_remote_port is a pointer
  (thanks to ipc_port_translate) to the destination's ipc_port_t
<teythoon> that's where the protected_payload is stored
<teythoon> but at the receiving thread, the pointer is gone, replaced by a
  port name
<teythoon> so currently I'm doing the lookup there again
<braunr> hum
<braunr> are you sure kmsg is the general structure for all messages ?
<braunr> or is it only for kernel messages ?
<braunr> i don't remember exactly
<teythoon> no, for all messages
<braunr> ok
<teythoon> I just need to get this pointer across cleanly
<braunr> i thought you wanted to replace that port name in the receiving
  thread with the payload
<teythoon> I do
<braunr> i don't see the problem then
<teythoon> well, only the sending thread has the pointer, the receiving
  thread only has the name
<braunr> i don't see what makes it hard to change it
<braunr> since that's what you want to do
<braunr> the sending thread doesn't have the pointer
<teythoon> yes it has
<braunr> well
<braunr> only for in kernel objects
<braunr> and it's an optimization
<braunr> and you shouldn't have to care much about it
<braunr> your work only involves changing how messages are received
<teythoon> let me push it somewhere, so I can show you the patches
<braunr> ok
<teythoon> braunr:
<braunr> teythoon: where should i look at ?
<teythoon> the last commit
<braunr> hm
<braunr> see what calls mach_msg_receive
<braunr> the payload flag must be handled before, when the message is
  actually transferred
<braunr> ipc_kmsg_copyin perhaps
<teythoon> well
<teythoon> but this is the tricky part
<braunr> i'm not sure which of the sender or receiver code actually
  performs these translations
<braunr> yes
<teythoon> b/c at this point it is not known whether the receiver has
  specified the MACH_RCV_PROTECTED_PAYLOAD flag
<teythoon> or my understanding of the whole process is still somewhat off,
  which might very well be...
<braunr> it's not something the receiver should set
<braunr> i.e. the flag shouldn't be set at mach_msg time
<braunr> because it's asynchronous
<braunr> it's a flag that should be set near port creation time
<teythoon> oh
<teythoon> right, I can see how that could work
<braunr> mach_reply_port(); mach_port_set_payload(); mach_msg();
<teythoon> braunr:
<teythoon> I think I found the right spot
<braunr> teythoon: looks better indeed :)
<teythoon> braunr: yes, thanks for the hint :)
<braunr> teythoon: keep in mind gnumach supports moving receive rights
  between tasks
<braunr> i don't think it's much of a burden but don't forget :)
<teythoon> right, if that happens, the protected payload field should
  probably be just reset to 0
<teythoon> that preserves the old default behavior
<braunr> teythoon: you shouldn't name the payload "address" though
<braunr> but really "payload" or "label"
<braunr> vm_offset_t isn't the appropriate type either
<braunr> i suggest unsigned long payload
<teythoon> braunr: noted
<braunr> what i mean is
<braunr> the payload isn't the userspace structure you want to use
<braunr> it's the value stored in that unsigned long
<braunr> whether it's used as a pointer or an array index or whatever
  should be at the application discretion
<teythoon> yes, I got that
<braunr> concerning vm_offset_t, it's misused a lot, mostly for historical
<braunr> vm_offset_t is actually the ancestor of off_t
<braunr> i.e. an offset inside a *memory object*
<braunr> the size of which may differ from the size of a pointer
<teythoon> ok
<braunr> historically, physical and virtual addresses, in addition to
  memory object sizes, were the same, hence the confusion
<braunr> today we might have 32-bits virtual addresses, 36-bits physical
  addresses, and 44- to 64-bits file offsets
<braunr> (e.g. PAE with large file support)

IRC, freenode, #hurd, 2013-11-25

<teythoon> braunr: the object lookup problem is worse than i assumed
<teythoon> the lookup is actually done twice...
<braunr> teythoon: isn't that usually the case :) ?
<braunr> inside gnumach ?
<teythoon> no
<teythoon> once in libports, once in the intrans function
<braunr> which intrans function ?
<braunr> can you point at an example ?
<teythoon> right
<teythoon> routine file_get_fs_options ( file: file_t;
<teythoon> file_t is special
<teythoon> mig magic
<teythoon> type file_t = mach_port_copy_send_t
<teythoon> #ifdef FILE_INTRAN
<teythoon> intran: FILE_INTRAN
<braunr> trivfs_begin_using_protid ?
<teythoon> for example, yes
<braunr> ugh
<teythoon> however, I believe that can be fixed cleanly
<teythoon> I revised my gnumach changes
<teythoon> it works surprisingly well
<braunr> gnumach is largely clean code
<teythoon> i patched libports to use the new falicilty
<teythoon> all the fs translators i tested work fine
<braunr> nice
<teythoon> tmpfs, ext2fs, nfs, hello*
<teythoon> so does exec
<braunr> very nice
<teythoon> howevcer, the bootstrap fails
<braunr> a lot more straightforward than i expected
<teythoon> i believe proc crashes
<teythoon> yes
<braunr> you can use mach_print to manually trace the bootstrap process

mach print.

<teythoon> i did that
<braunr> ok
<teythoon> it's nice
<braunr> bare knives are :)

<teythoon> uh oh, this lookup fix requires some mig changes
<braunr> teythoon: have you built some packages on it ?
<teythoon> braunr: some clang test builds
<braunr> nice
<braunr> where is mig getting in the way ?
<teythoon> yes, and i compiled lots of stuff
<braunr> any debian package ?
<teythoon> let me just push my changes somewhere...
<teythoon> no, no deb
<braunr> ok
<teythoon> braunr:
<teythoon> braunr: in particular,

IRC, freenode, #hurd, 2013-11-27

<teythoon> btw, my protected payload work is progressing nicely
<teythoon> the system actually boots now :)
<braunr> that's great
<braunr> looking forward to seeing it in action
<teythoon> I'd love to quickly discuss my mig changes if you've got a
<braunr> sure
<teythoon> ok
<teythoon> first, please look at this
<teythoon> in line 165, the msgh_local_port is restored
<teythoon> b/c later some intrans function might use this for the object
<braunr> yes ok
<braunr> makes sense
<teythoon> this makes mig payload aware
<teythoon> we'd specify another intrans function that takes a label and
  returns an object
<braunr> let me remind
<braunr> you said there were 3 lookups actually
<braunr> the mach one
<braunr> the libports one
<braunr> and is the intran one the last, right ?
<teythoon> yes
<teythoon> so now i optimized away the second one, the one in libports
<braunr> ok so you need intran aware functions to replace that lookup
<braunr> well
<braunr> payload aware intran functions
<teythoon> yes
<braunr> ok
<teythoon> mostly cast the label, ports_port_ref the object
<braunr> i assume they'd be pretty straightforward
<teythoon> yes
<braunr> and easy to add for all existing intran functions
<teythoon> most likely
<braunr> the proposed change looks very appropriate
<teythoon> :)
<braunr> i'd never thought about intran functions because i didn't want
  that in my clone ;p
<braunr> they do add a bit of complexity
<braunr> but this upgrade path looks right
<teythoon> yes
<teythoon> I think so too
<braunr> nothing more to say :)
<braunr> it's so simple i actually don't understand how i could miss it
  last time i looked
<braunr> i guess i was exhausted heh
<teythoon> thanks for the review :)
<braunr> thanks for your work
<braunr> it's been a long time since we had someone spend that much time on
  the hurd

IRC, freenode, #hurd, 2013-11-29

<teythoon> I came to believe that there is actually a lot of room for
  improvement in our rpc path

IRC, freenode, #hurd, 2013-12-19

<braunr> teythoon_: how is protected payload branch now ?
<braunr> ready for review ?
<teythoon_> the kernel and mig patch are
<teythoon_> patches
<braunr> so pending for approval rathr
<braunr> rather
<teythoon_> documentation is still missing for those ofc
<braunr> the last parts are the mig mutations iirc
<teythoon_> err, you lost me
<teythoon_> i haven't continued to work on the hurd patch series
<teythoon_> the patch series for gnu mach and mig are feature complete from
  my point of view
<braunr> i mean the changes needed to remove the third lookup in the
  mutation functions
<teythoon_> to do that in hurd, we need a patched mig
<braunr> i was just trying to remember correctly
<teythoon_> those patches need to be reviewed
<teythoon_> the hurd patch series is not yet working, but you can see the
  approach i've taken
<braunr> yes
<braunr> ok
<teythoon_> the next thing i'd do in this regard is to fix all object
<braunr> so it didn't change from last time i looked
<teythoon_> no
<teythoon_> some code, notoriously the control port handling in the *fs
  libs, uses mach_port_t for the receiver and do the lookup themself. i'd
  fix that next.

IRC, freenode, #hurd, 2014-01-20

<teythoon> i've tied up enough loose ends, that i can start working on the
  protected payload stuff again
<teythoon> the next step is fixing the receiver lookups everywhere
<braunr> good :)
<teythoon> if everyone uses mig magic for that, the switch will be easy
<teythoon> undoing the hack in mach-defpager too

IRC, freenode, #hurd, 2014-01-24

<braunr> teythoon: what are you currently working on ? protected payload ?
<teythoon> braunr: yes, i'm working with coccinelle to fix all object
  lookups in the hurd
<teythoon> i figured it is easier and cleaner to just fix them instead of
  converting pointers back to port names for those functions that want port

IRC, freenode, #hurd, 2014-02-17

<teythoon> braunr: do you think it's okay to make the 0 payload special ?
<braunr> teythoon: for our needs, sure
<braunr> it's similar to NULL or MACH_PORT_NULL
<teythoon> yes
<teythoon> maybe i should add a symbolic name for that
<teythoon> for consistency
<braunr> but is it wise to let mach_port_set_protected_payload reset the
  behaviour on a zero payload ?
<braunr> i don't think a symbolic name is needed
<braunr> or maybe
<braunr> as you want
<teythoon> what else should it do then ?
<braunr> 00:25 < braunr> but is it wise to let
  mach_port_set_protected_payload reset the behaviour on a zero payload ?
<braunr> it could return invalid argument instead
<braunr> and the documentation would clearly state 0 is invalid
<braunr> but that would also prevent reverting the mode
<teythoon> yes, i consider that not really useful, but i'd be okay with the
  current behavior
<teythoon> but yes, the documentation should make that clear

IRC, freenode, #hurd, 2014-02-22

<teythoon> braunr: once the pp patch set is in gnumach, i'll make
  mach-defpager use it
<teythoon> it's a good target, as it does not use libports
<teythoon> and it's currently abusing the port rename procedure for the
  lookup, making the rights spill into the splay tree
<teythoon> braunr: the wiki mentioned that you once considered to remove
  the ability to rename ports altogether
<braunr> teythoon: ah right
<braunr> i actually intend to keep it for x15, but only because i want port
  names to blend as file descriptors
<teythoon> right, for dup and friends
<braunr> and the radix tree is a data structure that can cope decently with
  not too sparsed distributions

IRC, freenode, #hurd, 2014-02-27

<braunr> teythoon: ah, just saw the commit that will make our network
  faster :)
<teythoon> network ?
<braunr> eh no, it's about ioctls actually
<braunr> :)
<braunr> i read a bit too quickly
<teythoon> one more small step towards fixing all receiver lookups in the
<teythoon> i did not anticipate how much the hurd has to be changed first
  in order to make use of the protected payloads
<braunr> that was my main reason not to do it actually :/
<braunr> but you're almost finished with it, aren't you ?
<teythoon> not sure tbh
<teythoon> i believe the fsys stuff was the largest chunk

IRC, freenode, #hurd, 2014-03-02

<teythoon> youpi: i cleaned up most of the receiver lookups in hurd by now
<teythoon> but there are some tricky cases left
<teythoon> 1/ the pager stuff
<teythoon> the mig declarations are in gnumach, and do not have the
  necessary intran declarations that we can mutate
<teythoon> 2/ some uses of mach_port_t instead of some_type_t in the hurd
  rpc definitions
<teythoon> (e.g. fsys_forward)
<teythoon> for 1/, i'd like to extend the definitions in gnumach
<teythoon> i'm not so sure what to do for the second case
<teythoon> we could introduce some types for each case
<teythoon> or, we do not touch the definitions
<teythoon> my protected payload prototype allows us to map payloads back to
  port names for the functions that want a name
<teythoon> i did this by redefining the mach_port_t type for mig that uses
  the payload to port-name intran function
<teythoon> mig allows type redefinitions, but emits a warning message
<teythoon> though i consider that a useful feature, it allows one to refine
  a type

IRC, freenode, #hurd, 2014-03-04

<teythoon> braunr: i fixed the object lookups in libpager yesterday, a
  pretty mechanic change
<braunr> teythoon: can't be bad :)
<teythoon> amusingly, the resulting packages boot about half way through
<braunr> teythoon: ?
<teythoon> it hangs while cleaning left-over files from /tmp
<braunr> ugh, libpager ..
<teythoon> yes
<teythoon> tricky pager stuff is tricky ?
<braunr> tricky buggy pager stuff is tricky and buggy
<teythoon> ^^
<braunr> let's assume you made things faster, increasing the likelihood of
  deadlocks ..
<braunr> we had a patch once for a libpager deadlock
<teythoon> well, i'm not yet at the point where things might get faster
<braunr> see 901c61a1d25e7c8963e51012760a82730eda1910
<braunr> the same problem exists elsewhere i think, you might have hit it
<teythoon> i'm still just moving the object lookups from the server
  functions to the mig translation functions
<braunr> hm
<teythoon> but yes, i might have influenced the timing, not sure
<braunr> shouldn't cost too much to add some prints
<braunr> iirc, the other potential deadlock is in libpager/pager-attr.c
<braunr> when memory_object_change_attributes is called
<braunr> (which loops back into libpager when the kernel sends data back
<braunr> )
<braunr> tricky ..
<teythoon> i'll try that when i get home

<braunr> aren't you almost done ?
<teythoon> not sure tbh
<braunr> :(
<braunr> althouhg libpager would be really great
<teythoon> and mach-defpager
<braunr> since this is actually one of the biggest points of contention
<teythoon> i'll do that next, and return to libpager later
<braunr> ok
<teythoon> for both i needed to change some rpc type definitions in gnu
<braunr> skipping lookups in libpager would make it harder to suffer
  writeback thread storms
<teythoon> so i want to make sure that these changes are fine so that i can
  propose them
<braunr> ok