IRC, freenode, #hurd, 2011-11-18

<nocturnal> I'm learning about GNU Hurd and was speculating with a friend
  who is also a computer enthusiast. I would like to know if Hurd's
  microkernel can recover services should they crash? and if it can, does
  that recovery code exist in multiple services or just one core kernel
  service? 
<braunr> nocturnal: you should read about passive translators
<braunr> basically, there is no dedicated service to restore crashed
  servers
<etenil> Hi everyone!
<braunr> services can crash and be restarted, but persistence support is
  limited, and rather per service
<braunr> actually persistence is more a side effect than a designed thing
<braunr> etenil: hello
<etenil> braunr: translators can also be spawned on an ad-hoc basis, for
  instance when accessing a particular file, no?
<braunr> that's what being passive, for a translator, means
<etenil> ah yeah I thought so :)
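
The passive translator behaviour braunr and etenil describe (the translator is started on access, and can be started again after a crash) can be illustrated with a toy model. This is only a sketch in Python, not Hurd code: the `Node` class, its methods and the instance counter are hypothetical, and `/hurd/hello` is used purely as an example command line.

```python
class Node:
    """Toy model of a filesystem node with a passive translator record."""

    def __init__(self, passive_command):
        # The passive setting is only a stored command line, not a process.
        self.passive_command = passive_command
        self.active = None   # stands in for a running translator
        self.starts = 0      # how many times a translator was spawned

    def access(self):
        # Start the translator on demand if none is currently attached.
        if self.active is None:
            self.starts += 1
            self.active = f"{self.passive_command} (instance {self.starts})"
        return self.active

    def crash(self):
        # The running translator dies; the passive record survives.
        self.active = None


node = Node("/hurd/hello")   # example command line, chosen for illustration
first = node.access()        # first access spawns the translator
node.crash()                 # translator dies, its in-memory state is lost
second = node.access()       # next access spawns a fresh instance
print(first)
print(second)
```

Note that the second instance starts from scratch, which matches braunr's remark that restarting is available but persistence is limited and handled per service.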

IRC, freenode, #hurd, 2011-11-19

<chromaticwt> will hurd ever have the equivalent of an rs server? is that
  even possible with hurd?
<youpi> chromaticwt: what is an rs server ?
<chromaticwt> a reincarnation server
<youpi> ah, like minix. Well, the main ground issue is restoring existing
  information, such as pids of processes, etc.
<youpi> I don't know how minix manages it
<antrik> chromaticwt: I have a vision of a session manager that could also
  take care of reincarnation... but then, knowing myself, I'll probably
  never implement it
<youpi> we do get proc crashes from time to time
<youpi> it'd be cool to see the system heal itself :)
<braunr> i need a better description of reincarnation
<braunr> i didn't think it would make core servers like proc able to get
  resurrected in a safe way
<antrik> depends on how it is implemented
<antrik> I don't know much about Minix, but I suspect they can recover most
  core servers
<antrik> essentially, the condition is to make all precious state be
  constantly serialised, and held by some third party, so the reincarnated
  server could restore it
<braunr> should it work across reboots ?
<antrik> I haven't thought about the details of implementing it for each
  core server; but proc should be doable I guess... it's not necessary for
  the system to operate, just for various UNIX mechanisms
<antrik> well, I'm not aware of the Minix implementation working across
  reboots. the one I have in mind based on a generic session management
  infrastructure should though :-)
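
The condition antrik states (precious state constantly serialised and held by a third party, so that a reincarnated server can restore it) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not how Minix or the Hurd implement anything: `StateStore`, `ProcLikeServer` and the pid table are hypothetical names, and the store is an in-memory dictionary standing in for a separate long-lived process.

```python
import json


class StateStore:
    """Hypothetical third party that outlives the servers it keeps state for."""

    def __init__(self):
        self._blobs = {}

    def save(self, name, state):
        self._blobs[name] = json.dumps(state)

    def load(self, name):
        blob = self._blobs.get(name)
        return json.loads(blob) if blob is not None else None


class ProcLikeServer:
    """Toy server whose precious state is a pid -> task name table."""

    def __init__(self, store):
        self.store = store
        # Restore the last checkpoint if there is one (JSON keys are strings).
        saved = store.load("proc")
        self.table = {int(pid): task for pid, task in saved.items()} if saved else {}

    def register(self, pid, task):
        self.table[pid] = task
        # Serialise after every change so a crash loses nothing committed.
        self.store.save("proc", self.table)


store = StateStore()
server = ProcLikeServer(store)
server.register(1, "init")
server.register(42, "shell")
del server                        # simulate the server crashing
reborn = ProcLikeServer(store)    # the reincarnated server restores its table
print(reborn.table)
```

Saving on every change is the simplest policy, and it also shows where the cost comes from: each update pays for a serialisation and a message to the store.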

IRC, freenode, #hurd, 2012-12-06

<Tekk_> out of curiosity, would it be possible to strap on a resurrection
  server to hurd?
<Tekk_> in the future, that is
<braunr> sure
<Tekk_> cool :)
<braunr> but this requires things like persistence
<spiderweb> like a reincarnation server?
<braunr> it's a lot of work, with non-negligible overhead
<Tekk_> spiderweb: yes, exactly. I didn't remember Tanenbaum's wording on
  that
<braunr> i'm pretty sure most people would be against that
<spiderweb> braunr: why so?
<Tekk_> it was actually the feature that convinced me that ukernels were a
  good idea
<Tekk_> spiderweb: because then you need a process that keeps track of all
  the other servers
<Tekk_> and they have to be replying to "useless" pings to see if they're
  still alive
<braunr> spiderweb: the hurd community isn't looking for a system reliable
  in critical environments
<braunr> just a general purpose system
<braunr> and persistence requires regular data saves
<braunr> it's expensive
<Tekk_> as well as that
<braunr> we already have performance problems because of the nature of the
  system, adding more without really looking for the benefits is useless
<spiderweb> so you can't theoretically have both?
<braunr> persistence and performance ?
<braunr> it's hard
<Tekk_> spiderweb: you need to modify the other translators to be
  persistent
<braunr> only the ones you care about actually
<braunr> but it's just better to make the critical servers very stable
<Tekk_> so it's not just turning on and off the reincarnation
<braunr> (there isn't that much code there)
<braunr> and the other servers restartable
<mcsim> braunr: I think that if the aim is to make something like a
  resurrection server, then most servers will need to be rewritten to make
  them stateless, won't they?
<braunr> that's a lot easier and already works with non essential passive
  translators
<Tekk_> mcsim: pretty much
<braunr> mcsim: only those you care about
<braunr> mcsim: the proc auth exec servers for example, perhaps the file
  system servers that can act as root fs, but the others would simply be
  restarted by the passive translator mechanism
<spiderweb> what about restarting device drivers, that would be simple
  right?
<braunr> that's perfectly doable, yes
<spiderweb> (being an OS newbie) - it does seem to me that the whole
  reincarnation server concept could quite possibly be a band aid.
<braunr> spiderweb: no it really works
<braunr> many systems do that actually
<braunr> let me give you a link
<braunr>
  http://ftp.sceen.net/curios_improving_reliability_through_operating_system_structure.pdf
<braunr> it's a bit old, but there is a review of systems aiming at
  resilience and how they achieve part of it
<spiderweb> neat, thanks
<braunr> actually it's not that old at all
<braunr> around 2007
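
The monitoring scheme Tekk_ describes in this discussion (one process that keeps track of the other servers and pings them to see whether they are still alive) can be sketched as below. This is a toy Python model under stated assumptions, not an existing Hurd or Minix interface: `Server`, `Reincarnator`, `sweep` and the server names are all hypothetical.

```python
class Server:
    """Toy stand-in for a restartable server."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def ping(self):
        # A crashed server cannot answer.
        return self.alive


class Reincarnator:
    """Hypothetical monitor that pings servers and restarts dead ones."""

    def __init__(self, names):
        self.servers = {name: Server(name) for name in names}
        self.restarts = {name: 0 for name in names}

    def sweep(self):
        # One round of pings; each ping costs a message even when all is well.
        revived = []
        for name, server in list(self.servers.items()):
            if not server.ping():
                # Replace with a fresh instance; the old in-memory state is gone.
                self.servers[name] = Server(name)
                self.restarts[name] += 1
                revived.append(name)
        return revived


monitor = Reincarnator(["driver", "fs"])
monitor.servers["driver"].alive = False   # simulate a driver crash
first_sweep = monitor.sweep()             # finds and restarts the dead driver
second_sweep = monitor.sweep()            # everything answers, nothing to do
print(first_sweep, second_sweep)
```

The restart here brings back a server with no memory of its previous state, which is fine for the restartable servers braunr mentions but not for servers such as proc, auth or exec, where persistence would also be needed.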