IRC, freenode, #hurd, 2010

<slpz> humm... why does tmpfs try to use the default pager? that's a bad
  idea, and probably will never work correctly...
* slpz is thinking about old issues
<slpz> tmpfs should create its own pagers, just like ext2fs, storeio...
<slpz> slopez@slp-hurd:~$ settrans -a tmp /hurd/tmpfs 10M
<slpz> slopez@slp-hurd:~$ echo "foo" > tmp/bar
<slpz> slopez@slp-hurd:~$ cat tmp/bar
<slpz> foo
<slpz> slopez@slp-hurd:~$ 
<slpz> :-)
<pochu> slpz: woo you fixed it?
<slpz> pochu: well, it's WIP, but reading/writing works...
<slpz> I've replaced the use of the default pager with the standard pager
  creation mechanism
<antrik> slpz: err... how is it supposed to use swap space if not using the
  default pager?
<antrik> slpz: or do you mean that it should act as a proxy, just
  allocating anonymous memory (backed by the default pager) itself?
<youpi> antrik: the kernel uses the default pager if the application pager
  isn't responsive enough
<slpz> antrik: it will just create memory objects and provide zerofilled
  pages when requested by the kernel (after a page fault)
<antrik> youpi: that makes sense I guess... but how is that relevant to the
  question at hand?...
<slpz> antrik: memory objects will contain the data by themselves
<slpz> antrik: as youpi said, when memory is scarce, GNU Mach will start
  paging out data from memory objects to the default pager
<slpz> antrik: that's the way in which pages will get into swap space
<slpz> (if needed)
<youpi> the thing being that the tmpfs pager has a chance to select pages
  he doesn't care any more about
<antrik> slpz: well, the point is that instead of writing the pages to a
  backing store, tmpfs will just keep them in anonymous memory, and let the
  default pager write them out when there is pressure, right?
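
A toy model (not Mach or libpager code) may help picture what slpz describes: a pager that answers a first page fault with a zero-filled page and has to hold on to any page the kernel hands back, since there is no backing store. The class, the page size, and the method names `data_request` and `data_return` (named loosely after Mach's `memory_object_data_request` and `memory_object_data_return`) are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class ToyTmpfsPager:
    """Hypothetical stand-in for a tmpfs node acting as its own pager."""

    def __init__(self):
        # page index -> bytes handed back by the "kernel" on page-out
        self.returned_pages = {}

    def data_request(self, page_index):
        """Called on a page fault: supply the page contents."""
        # A page that was never returned to us was never written back,
        # so the answer is a zero-filled page.
        return self.returned_pages.get(page_index, bytes(PAGE_SIZE))

    def data_return(self, page_index, data):
        """Called when the "kernel" evicts a dirty page."""
        # With no backing store, the pager has to keep the data itself.
        self.returned_pages[page_index] = bytes(data)


pager = ToyTmpfsPager()
first_fault = pager.data_request(0)
pager.data_return(0, b"foo\n".ljust(PAGE_SIZE, b"\0"))
second_fault = pager.data_request(0)
```

Note that in this model `data_return` only moves the page into the pager's own memory, which is the situation antrik asks about further down.
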
<antrik> youpi: no idea what you are talking about. apparently I still
  don't really understand this stuff :-(
<youpi> ah, but tmpfs doesn't have pages he doesn't care about, does it?
<slpz> antrik: yes, but the term "anonymous memory" could be a bit
<slpz> antrik: in GNU Mach, anonymous memory is backed by a memory object
  without a pager. In tmpfs, nodes will be allocated in memory objects, and
  the pager for those memory objects will be tmpfs itself
<antrik> slpz: hm... I thought anonymous memory is backed by memory objects
  created from the default pager?
<antrik> yes, I understand that tmpfs is supposed to be the pager for the
  objects it provides. they are obviously not anonymous -- they have
  inodes in the tmpfs name space
<antrik> but my understanding so far was that when Mach returns pages to
  the pager, they end up in anonymous memory allocated to the pager
  process; and then this pager is responsible for writing them back to the
  actual backing store
<antrik> am I totally off there?...
<antrik> (i.e. in my understanding the returned pages do not reside in the
  actual memory object the pager provides, but in an anonymous memory
<slpz> antrik: you're right. The trick here is, when does Mach return the
<slpz> antrik: if we set the attribute "can_persist" in a memory object,
  Mach will keep it until object cache is full or memory is scarce
<slpz> or we change the attributes so it can no longer persist, of course
<slpz> without a backing store, if Mach starts sending us pages to be
  written, we're in trouble
<slpz> so we must do something about it. One option, could be creating
  another pager and copying the contents between objects.
<antrik> another pager? not sure what you mean
<antrik> BTW, you didn't really say why we can't use the default pager for
  tmpfs objects :-)
<slpz> well, there're two problems when using the default pager as backing
  store for translators
<slpz> 1) Mach relies on it to do swapping tasks, so meddling with it is
  not a good idea
<slpz> 2) There're problems with seqnos when trying to work with the
  default pager from tasks other than the kernel itself
<slpz> (probably, the latter could be fixed)
<slpz> antrik: pager's terminology is a bit confusing. One can also say
  creating another memory object (though the function in libpager is
<antrik> not sure why "meddling" with it would be a problem...
<antrik> and yeah, I was vaguely aware that there is some seqno problem
  with tmpfs... though so far I didn't really understand what it was about
<antrik> makes sense now
<antrik> anyways, AIUI now you are trying to come up with a mechanism where
  the default pager is not used for tmpfs objects directly, but without
  making it inefficient?
<antrik> slpz: still don't understand what you mean by creating another
  memory object/pager...
<antrik> (and yeah, the terminology is pretty mixed up even in Mach itself)
<slpz> antrik: I meant creating another pager, in terms of calling again to
  libpager's pager_create
<antrik> slpz: well, I understand what "create another pager" means... I
  just don't understand what this other pager would be, when you would
  create it, and what for...
<slpz> antrik: oh, ok, sorry
<slpz> antrik: creating another pager is just a trick to avoid losing
  information when Mach's objects cache is full, and it decides to purge
  one of our objects
<slpz> anyway, IMHO object caching mechanism is obsolete and should be
<slpz> I'm writing a comment to bug #28730 which says something about this
<slpz> antrik: just one more thing :-) 
<slpz> if you look at the code, for most time of their lives, anonymous
  memory objects don't have a pager
<slpz> not even the default one
<slpz> only the pageout thread, when the system is running really low on
  memory, gives them a reference to the default pager by calling
<slpz> this is not really important, but worth noting ;-)
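
The lazy attachment slpz mentions can be pictured with a small toy model. This is a sketch under stated assumptions, not GNU Mach code: `ToyDefaultPager`, `ToyAnonymousObject`, and `pageout` are hypothetical names, and the real kernel function is not named here.

```python
class ToyDefaultPager:
    """Hypothetical default pager: stores evicted pages in 'swap'."""

    def __init__(self):
        self.swap = {}  # (object id, page index) -> data

    def write(self, obj, page_index, data):
        self.swap[(id(obj), page_index)] = data


class ToyAnonymousObject:
    """Hypothetical anonymous memory object; starts life without a pager."""

    def __init__(self):
        self.pager = None
        self.resident = {}  # page index -> data currently in "RAM"

    def write(self, page_index, data):
        self.resident[page_index] = data


def pageout(obj, page_index, default_pager):
    """Toy pageout step: attach the default pager only when first needed."""
    if obj.pager is None:
        obj.pager = default_pager
    obj.pager.write(obj, page_index, obj.resident.pop(page_index))


default_pager = ToyDefaultPager()
obj = ToyAnonymousObject()
obj.write(0, b"data")
pager_before = obj.pager      # still None: no pager while memory is plentiful
pageout(obj, 0, default_pager)
```
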

IRC, freenode, #hurd, 2011-09-28

<slpz> mcsim: "Fix tmpfs" task should be called "Fix default pager" :-)
<slpz> mcsim: I've been thinking about modifying tmpfs to actually have
  its own storeio based backend, even if a tmpfs with storage sounds a bit
<slpz> mcsim: but I don't like the idea of having translators messing up
  with the default pager...
<antrik> slpz: messing up?...
<slpz> antrik: in the sense of creating a number of arbitrarily sized
<antrik> slpz: well, it doesn't really matter much whether a process
  indirectly eats up arbitrary amounts of swap through tmpfs, or directly
  through vm_allocate()...
<antrik> though admittedly it's harder to implement resource limits with
<slpz> antrik: but I've talked about having its own storeio device as
  backend. This way Mach can pageout memory to tmpfs if it's needed.
<mcsim> Do I understand correctly that the goal of tmpfs task is to create
  tmpfs in RAM?
<slpz> mcsim: It is. But it also needs some kind of backend, just in case
  it's ordered to page out data to free some system's memory.
<slpz> mcsim: Nowadays, this backend is another translator that acts as
  default pager for the whole system
<antrik> slpz: pageout memory to tmpfs? not sure what you mean
<slpz> antrik: I mean tmpfs acting as its own pager
<antrik> slpz: you mean tmpfs not using the swap partition, but some other
  backing store?
<slpz> antrik: Yes.

See also: pagers.

<antrik> slpz: I don't think an extra backing store for tmpfs is a good
  idea. the whole point of tmpfs is not having a backing store... TBH, I'd
  even like to see a single backing store for anonymous memory and named
<slpz> antrik: But you need a backing store, even if it's the default pager
<slpz> antrik: The question is, Should users share the same backing store
  (swap space) or provide their own?
<antrik> slpz: not sure what you mean by "users" in this context :-)
<slpz> antrik: Real users with the ability of setting tmpfs translators
<antrik> essentially, I'd like to have a single partition that contains
  both swap space and the main filesystem (at least /tmp, but probably also
  all of /run, and possibly even /home...)
<antrik> but that's a bit off-topic :-)
<antrik> well, ideally all storage should be accounted to a user,
  regardless whether it's swapped out anonymous storage, temporary named
  files, or permanent files
<slpz> antrik: you could use a file as backend for tmpfs
<antrik> slpz: what's the point of using tmpfs then? :-)
<pinotree> (and then store the file in another tmpfs)
<slpz> antrik: mach-defpager could be modified to use storeio instead of
  Mach's device_* operations, but by the way things work right now, that
  could be dangerous, IMHO
<antrik> pinotree: hehe
<pinotree> .. recursive tmpfs'es ;)
<antrik> slpz: hm, sounds interesting
<slpz> antrik: tmpfs would try to keep data in memory whenever possible
  (not calling m_o_lock_request would do the trick), but if memory is
  scarce and Mach starts paging out, it would write it to that
<antrik> ideally, all storage used by system tasks for swapped out
  anonymous memory as well as temporary named files would end up on the
  /run partition; while all storage used by users would end up in /home/*
<antrik> if users share a partition, some explicit storage accounting would
  be useful too...
<antrik> slpz: is that any different from what "normal" filesystems do?...
<antrik> (and *should* it be different?...)
<slpz> antrik: Yes, as most FS try to synchronize to disk at a reasonable
  rate, to prevent data losses.
<slpz> antrik: tmpfs would be a FS that wouldn't synchronize until it's
  forced to do that (which, by the way, it's what's currently happening
  with everyone that uses the default pager).
<antrik> slpz: hm, good point...
<slpz> antrik: Also, metadata is never written to disk, only kept in memory
  (which saves a lot of I/O, too).
<slpz> antrik: In fact, we would be doing the same as every other kernel
  does, but doing it explicitly :-)
<antrik> I see the use in separating precious data (in permanent named
  files) from temporary state (anonymous memory and temporary named files)
  -- but I'm not sure whether having a completely separate FS for the
  temporary data is the right approach for that...
<slpz> antrik: And giving the user the option to specify its own storage,
  so we don't limit him to the size established for swap by the super-user.
<antrik> either way, that would be a rather radical change... still would
  be good to fix tmpfs as it is first if possible
<antrik> as for limited swap, that's precisely why I'd prefer not to have
  an extra swap partition at all...
<slpz> antrik: It's not much of a change, it's how it works right now, with
  the exception of replacing the default pager with its own.
<slpz> antrik: I think it's just a matter of 10-20 hours, as
  much. Including testing.
<slpz> antrik: It could be forked with another name, though :-)
<antrik> slpz: I don't mean radical change in the implementation... but a
  radical change in the way it would be used
<slpz> antrik: I suggest "almosttmpfs" as the name for the forked one :-P
<antrik> hehe
<antrik> how about lazyfs?
<slpz> antrik: That sounds good to me, but probably we should use a more
  descriptive name :-)


<tschwinge> slpz, antrik: There is a defpager in the Hurd code.  It is not
  currently being used, and likely incomplete.  It is backed by libstore.
  I have never looked at it.

mach-defpager vs defpager.

IRC, freenode, #hurd, 2011-11-08

<mcsim> who else uses defpager besides tmpfs and kernel?
<braunr> normally, nothing directly
<mcsim> then why should tmpfs use defpager?
<braunr> it's its backend
<braunr> backing store rather
<braunr> the backing store of most file systems are partitions
<braunr> tmpfs has none, it uses the swap space
<mcsim> if we allocate memory for tmpfs using vm_allocate, will it be able
  to use swap partition?
<braunr> it should
<braunr> vm_allocate just maps anonymous memory
<braunr> anonymous memory uses swap space as its backing store too
<braunr> but be aware that this part of the vm system is known to have
<braunr> which is why all mach based implementations have rewritten their
  default pager
<mcsim> what kind of deficiencies?
<braunr> bugs
<braunr> and design issues, making anonymous memory fragmentation horrible
<antrik> mcsim: vm_allocate doesn't return a memory object; so it can't be
  passed to clients for mmap()
<mcsim> antrik: I use vm_allocate in pager_read_page
<antrik> mcsim: well, that means that you have to actually implement a
  pager yourself
<antrik> also, when the kernel asks the pager to write back some pages, it
  expects the memory to become free. if you are "paging" to ordinary
  anonymous memory, this doesn't happen; so I expect it to have a very bad
  effect on system performance
<antrik> both can be avoided by just passing a real anonymous memory
  object, i.e. one provided by the defpager
<antrik> only problem is that the current defpager implementation can't
  really handle that...
<antrik> at least that's my understanding of the situation

IRC, freenode, #hurd, 2013-07-05

<teythoon> btw, why does the tmpfs translator have to talk to the pager?
<teythoon> to get more control about how the memory is paged out?
<teythoon> read lots of irc logs about tmpfs on the wiki, but I couldn't
  find the answer to that
<mcsim> teythoon: did you read this?
<teythoon> mcsim: I did
<mcsim> teythoon: The last discussion, I think, has a very good point.
<mcsim> To provide memory objects you should implement pager interface
<mcsim> And if you implement pager interface you are the one who is asked
  to write data to backing storage to evict them
<mcsim> But tmpfs doesn't do this
<teythoon> mmm, clients doing mmap...
<mcsim> teythoon: You don't have mmap
<mcsim> teythoon: mmap is implemented on top of mach interface
<mcsim> teythoon: I mean you don't have mmap at this level
<teythoon> mcsim: sure, but that's close enough for me at this point
<mcsim> teythoon: diskfs interface requires implementor to provide a memory
  object port (send right)
<mcsim> Guest8183: Why tmpfs requires defpager
<Guest8183> how did you get to talk about that ?
<mcsim> I was just asked
<teythoon> Guest8183: it's just so unsettling that tmpfs has to be started
  as root :/
<Guest8183> teythoon: why ?
*** Guest8183 is now known as braunr_
<teythoon> braunr_: b/c starting translators isn't a privileged operation,
  and starting a tmpfs translator that doesn't even access any device but
  "just" memory shouldn't require any special privileges as well imho
<teythoon> so why is tmpfs not based on say libnetfs? b/c it is used for
  d-i and someone (apt?) mmaps stuff?
<pinotree> being libdiskfs-based isn't much the issue, iirc
<kilobug> teythoon: AFAIK apt uses mmap, yes
<braunr_> teythoon: right
<braunr_> a ramfs is actually tricky to implement well
<mcsim> braunr_: What do you mean by "to implement well"?
<braunr_> as efficiently as possible
<braunr_> i.e. being as close as possible to the page cache for minimum
<mcsim> braunr: AFAIK ramfs should not use swap partition, so page cache
  shouldn't be relevant for it.
<braunr> i'm talking about a ramfs in general
<braunr> not the specific linux ramfs
<braunr> in linux, what they call ramfs is the tiny version of tmpfs that
  doesn't use swap
<braunr> i actually don't like "tmpfs" much
<braunr> memfs may be more appropriate
<braunr> anyway
<mcsim> braunr: I see. And do you consider defpager variant as "close as
  possible to the page cache"?
<braunr> not far at least
<braunr> if we were able to use it for memory objects, it would be nice
<braunr> but defpager only gets attached to objects when they're evicted
<braunr> before that, anonymous (or temporary, in mach terminology) objects
  have no backing store
<braunr> this was probably designed without having tmpfs in mind
<braunr> i wonder if it's possible to create a memory object without a
  backing store
<mcsim> what should happen to it if kernel decides to evict it?
<braunr> it sets the default pager as its backing store and pushes it out
<mcsim> that's how it works now, but you said "create a memory object
  without a backing store"
<braunr> mach can do that
<braunr> i was wondering if we could do that too from userspace
<mcsim> mach does not evict such objects, unless it bound a defpager to
<mcsim> but how can you handle this in userspace?
<braunr> i mean, create a memory object with a null control port
<braunr> mcsim: is that clearer ?
<mcsim> suppose you create such object, how kernel will evict it if kernel
  does not know who is responsible for eviction of this object?
<braunr> it does
<braunr> 16:41 < braunr> it sets the default pager as its backing store and
  pushes it out
<braunr> that's how i intend to do it on x15 at least
<braunr> but it's much simpler there because uvm provides better separation
  between anonymous and file memory
<braunr> whereas they're much too similar in mach vm
<mcsim> then what's the difference between the current situation, when you
  explicitly invoke defpager to create object and implicit method you
<braunr> you don't need a true defpager unless you actually have swap
<mcsim> ok
<mcsim> now I see
<braunr> it also saves the communication overhead when initializing the
<mcsim> thank you
<braunr> which may be important since we use ramfs for speed mostly
<mcsim> agree
<braunr> it should also simplify the defpager implementation, since it
  would only have a single client, the kernel
<braunr> which may also be important with regard to global design
<braunr> one thing which is in my opinion very wrong with mach is that it
  may be a client
<braunr> a well designed distributed system should normally not allow one
  component to act as both client and server toward another
<braunr> i.e. the kernel should only be a server, not a client
<braunr> and there should be a well designed server hierarchy to avoid
<braunr> (such as the one we had in libpager because of that)
<mcsim> And how about filesystem? It acts both as server and as client
<braunr> yes
<braunr> but not towards the same other component
<braunr> application -> file system -> kernel
<braunr> no "<->"
<braunr> the qnx documentation explains that quite well
<braunr> let me see if i can find the related description
<mcsim> Basically, I've got your point. And I would rather agree that
  kernel should not act as client
<braunr> mcsim:
<braunr> one way to implement that (and qnx does that too) is to make
  pagers act as client only
<braunr> they sleep in the kernel, waiting for a reply
<braunr> and when the kernel needs to evict something, a reply is sent
<braunr> (qnx doesn't actually do that for paging, but it's a general idea)
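
The "pagers act as client only" idea can be sketched with threads and a queue. This is a toy model, not QNX or Mach code: `ToyKernel`, `wait_for_eviction`, `need_to_evict`, and `pager_loop` are hypothetical names. The pager makes a blocking call into the "kernel", and the kernel's only action is to complete that call when something needs evicting.

```python
import queue
import threading


class ToyKernel:
    """Hypothetical kernel that only replies to calls, never initiates one."""

    def __init__(self):
        self._pending = queue.Queue()

    def wait_for_eviction(self):
        """Called by a pager; blocks until there is something to evict."""
        return self._pending.get()

    def need_to_evict(self, page_index):
        """Kernel-internal: complete one blocked wait_for_eviction() call."""
        self._pending.put(page_index)


def pager_loop(kernel, written):
    """Toy pager: sleeps in the kernel, handles each reply, then calls again."""
    while True:
        page_index = kernel.wait_for_eviction()
        if page_index is None:  # sentinel used only to stop this toy example
            break
        written.append(page_index)  # stands in for writing to a backing store


kernel = ToyKernel()
written = []
worker = threading.Thread(target=pager_loop, args=(kernel, written))
worker.start()
kernel.need_to_evict(7)
kernel.need_to_evict(3)
kernel.need_to_evict(None)
worker.join()
```
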
<mcsim> braunr: how hierarchy of senders is enforced?
<braunr> it's not
<braunr> developers must take care
<braunr> same as locking, be careful about it
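
Since nothing enforces the hierarchy, a small check over a hand-written list of "who calls whom" can at least flag the `<->` case braunr warns about. The following is only a sketch; the function name and the component names are assumptions for illustration.

```python
def mutual_pairs(calls):
    """Return component pairs that call each other in both directions."""
    edges = set(calls)
    return sorted(
        {tuple(sorted((a, b))) for a, b in edges if a != b and (b, a) in edges}
    )


# application -> file system -> kernel: strictly layered, no "<->"
layered = [("application", "file system"), ("file system", "kernel")]

# Same layout, plus the kernel calling back into the file system as a client
mach_style = layered + [("kernel", "file system")]

layered_result = mutual_pairs(layered)
mach_style_result = mutual_pairs(mach_style)
```

This only catches direct two-way relationships; longer cycles would need a proper cycle search over the same edge list.
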