This is a collection of resources concerning Berkeley Packet Filters.




IRC, freenode, #hurd, 2012-01-13

<braunr> hm, i think the bpf code needs a complete redesign :p
<braunr> unless it's actually a true hurdish way to do things
<braunr> antrik: i need your help :)
<braunr> antrik: I need advice on the bpf "architecture"
<braunr> the current implementation uses a translator installed at /dev/bpf
<braunr> which means packets from the kernel are copied to that translator
  and then to client applications
<braunr> does that seem ok to you ?
<braunr> couldn't the translator be used to set a direct link between the
  kernel and the client app ?
<braunr> which approach seems the more Hurdish to you ? (<= this is what I
  need your help on)
<pinotree> braunr: so there would be a roundtrip like kernel → bpf
  translator → pfinet?
<antrik> braunr: TBH, I don't see why we need a BPF translator at all...
<braunr> antrik: it handles the ioctls
<braunr> pinotree: pfinet isn't involved (it was merely modified to use the
  "new" filter format to specify it used the old packet filter, and not
<antrik> braunr: do we really need to emulate the ioctl()s? can't we assume
  that all packages using BPF will just use libpcap?
<antrik> (and even if we *do* want to emulate ioctl()s, why can't we handle
  this is libc?)
<braunr> antrik: that's what i'm wondering actually
<braunr> even if assuming all packages use libpcap, i'd like our bpf
  interface to be close to what bsds have, and most importantly, what
  libpcap expects from a bpf interface
<antrik> well, why? if we already have a library handling the abstraction,
  I don't see much point in complicating the design and use by adding
  another layer :-)
<braunr> so you would advise adapting libpcap to include a hurd specific
  module ?
<antrik> there are two reasons for adding translators: more robustness or
  more flexibility... so far I don't see how a BPF translator would add
<braunr> right
<antrik> yes
<braunr> so we'd end up with a bpf-like interface, the same instructions
  and format, with different control calls
<antrik> right
<antrik> note that I had more or less the same desicion to make for KGI
  (emulate Linux/BSD ioctl()s, or implement a backend in libggi for
  handling Hurd-specific RPC; and after much consideration, I decided on
  the latter)

IRC, freenode, #hurd, 2012-01-16

<braunr> antrik: is there an existing facility to easily give a send right
  to the device master port to a task ?
<braunr> another function of the bpf translator is to handle the /dev/bpf
  node, and most importantly its permissions
<braunr> so that users which have read/write access to the node have access
  to the packet filter
<braunr> i guess the translator could limit itself to that functionality
<braunr> and then provide a device port on which libpcap operates directly
  by means of device_{g,s}et_status/device_set_filter
<antrik> braunr: I don't see the point in seperating permissions for filter
  from permissions from general network device access...
<antrik> as for device master port, all root tasks can obtain it from proc
<braunr> antrik: yes, but how do we allow non-root users to access that
  facility ?
<braunr> on a unix like system, it's a matter of changing the permissions
  of /dev/bpf
<antrik> with devnode, non-root users can get access to specific device
  nodes, including network devices
<braunr> i can't imagine the hurd being less flexible for that
<braunr> ah devnode
<braunr> good
<antrik> so we can for example make /dev/eth0 accessible by users of some
<braunr> what's devnode exactly ?
<antrik> it's a very simple translator that implements an FS node that
  looks somewhat like a file, but the only operation it supports is
  obtaining a pseudo device master port, giving access to a specific Mach
<braunr> is it already part of the hurd ?
<braunr> or hurdextras maybe ?
<antrik> it's only in zhengda's branch
<braunr> ah
<antrik> needed for both eth-multipexer and DDE
<braunr> and bpf soon i guess
<antrik> indeed :-)
<braunr> "obtaining a pseudo device master port", i believe you meant a
  pseudo device port
<antrik> I must admit that I don't remember exactly whether devnode proxies
  device_open(), so clients direct get a port to the device in question, or
  whether it implements a pseudo device master port...
<antrik> but definitely not a pseudo device port :-)
<braunr> i'm almost positive it gives the target device port, otherwise i
  don't see the point
<braunr> i don't understand the user of the "pseudo" word here either
<braunr> s/user/use/
<braunr> aiui, devnode should be started as root (or in any way which gives
  it the device master port)
<antrik> the point is that the client doesn't need to know the Mach device
  name, and also is not bound to actual kernel devices
<braunr> and when started, implement the required permissions before giving
  clients a device port to the specific device it was installed for
<braunr> right
<braunr> but it mustn't be a proxy
<antrik> yes, devnode needs access to either the real master device port
  (for kernel devices), or one provided by eth-multiplexer or the DDE
  network driver
<braunr> well, a very simple proxy for deviceopen
<braunr> ok
<braunr> that seems exactly what i wanted to do
<braunr> we now need to see if we can integrate it separately
<braunr> create a separate branch that works for the current gnumach code,
  and merge dde/other specific code later on
<antrik> you mean independent of eth-multiplexer or DDE? yes, it was
  generally agreed that devnode is a good idea in any case. I have no idea
  why there are no device nodes for network devices on other UNIX
<braunr> i've been wondering that for years too :)
<antrik> zhengda's branch has a pfinet modified to a) use devnode, and b)
  use BPF
<braunr> why bpf ?
<braunr> for more specific filters maybe ?
<antrik> hm... don't remember whether there was any technical reason for
  going with BPF; I guess it just seemed more reasonable to invest new work
  in BPF rather than obsolete Mach-specific NPF...
<braunr> cspf could be removed altogether, i agree
<antrik> another plus side of his modified pfinet is that it actually sets
  an appropriate filter for TCP/IP and the IP in use, rather than just
  setting a dummy filter catching app packets (including those irrelevant
  to the specific pfinet instance)
<antrik> err... catching all packets
<braunr> that's what i meant by "for more specific filters maybe ?"
<braunr> he was probably more comfortable with the bpf interface to write
  his filter rules
<antrik> well, it would probably be doable with NPF too :-) so by itself
  it's not a reason for switching to BPF...
<antrik> it's rather the other way around: as it was necessary to implement
  filters in eth-multiplexer, and implementing BPF seemed more reasoable,
  pfinet had to be changed to use BPF...
<braunr> antrik: where is zhengda's branch btw ?
<antrik> (I guess using proper filters with eth-multiplexer is not strictly
  necessary; but it would be a major performance hit not to)
<antrik> it's in incubator.git
<antrik> but it's very messy
<braunr> ok
<antrik> at some point I asked him to provide cleaned up branches, and I'm
  pretty sure he said he did, but I totally fail to remember where he
  published them :-(
<braunr> hm, i don't like how devnode is "architectured" :/
<braunr> but it makes things a little more easy to get working i guess
<LarstiQ> antrik: any idea what to grep the logs on for that?
<braunr> ok never mind, devnode is fine
<braunr> exactly what i need
<braunr> i wonder however if it shouldn't be improved to better handle
<braunr> ok, never mind either, permission handling is fine
<braunr> so what are we waiting for ? :)
<antrik> I remember that there were some issues with permission handling,
  but I don't remember whether all were fixed :-(
<antrik> LarstiQ: hm... good question...
<braunr> ah ?
<braunr> hm actually, there could be issues for packet filters, yes
<braunr> i guess we want to allow e.g. read-only opens for capture only
<antrik> braunr: that would have to be handled by the actual BPF
  implementation I'd say
<braunr> it should already be the case
<antrik> what's the problem then?
<braunr> but when the actual device_open() is performed, the appropriate
  permissions must be provided
<braunr> and checking those is the responsibility of the proxy, devnode in
  this case
<antrik> and it doesn't do that?
<braunr> apparently not
<braunr> the only check is against the device name
<braunr> i'll begin playing with that first
<antrik> I vaguely remember that there has been discussion about the
  relation of underlying device open mode and devnode open mode... but I
  don't remember the outcome. in fact it was probably one of the
  discussions I never got around to follow up on... :-(
<antrik> before you begin playing, take a look at the relevant messages in
  the ML archive :-)
<antrik> must have been around two years ago
<braunr> ok
<antrik> some thread with me and scolobb (Sergiu Ivanov +- spelling) and
  probably zhengda
<antrik> there might also be some outstanding patch(es) from scolobb, not

IRC, freenode, #hurd, 2012-01-17

<braunr> antrik: i think i found the thread you mentioned about devnode
<braunr> neither sergiu nor zhengda considered the use of a read-only
  device for packet filtering
<braunr> leading to assumptions such as "only receiving packets
<braunr> is not terribly useful, in view of the fact that you have to at
<braunr> request them, which implies *sending* packets :-)
<braunr> "
<braunr> IMO, devnode should definitely check its node permissions to build
  the device open flags
<braunr> good news is that it doesn't depend on anything specific to other
  incubator projects
<braunr> making it almost readily mergeable in the hurd
<braunr> i'm not sure devnode is an appropriate name though
<braunr> maybe something like device, or devproxy
<braunr> proxy-devopen maybe
<antrik> braunr: well, I don't remember the details of the disucssion; but
  as I mentioned in some mail, I did actually want to write a followup,
  just didn't get around to it... so I was definitely not in agreement with
  some of the statements made by others. I just don't remember on which
  point :-)
<antrik> which thread was it?
<antrik> anyways, this should in no way be specific to network
  devices... the idea is simply that if the client has only read
  permissions on the device node, it should only get to open the underlying
  device for read. it's up to the kernel to handle the read-only status for
  the device once it's opened
<antrik> as for the naming, the idea is that devnode simply makes Mach
  devices accessible through FS nodes... so the name seemed appropriate
<antrik> you may be right though that just "device" might be more
  straightforward... I don't agree on the other variants
<braunr> antrik:
<braunr> antrik: i agree with the general idea behind permission handling,
  i was just referring to their thoughts about it, which probably led to
  the hard coded READ | WRITE flags
<antrik> braunr: unfortunately, I don't remember the context of the
  discussion... would take me a while to get into this again :-(
<antrik> the discussion seems to be about eth-multiplexer as much as about
  devnode (if not more), and I don't remember the exact interaction

IRC, freenode, #hurd, 2012-01-18

<braunr> so, does anyone have an objection to getting devnode into the hurd
  and calling it something else like e.g. device ?
<youpi> braunr: it's Zhengda's work, right?
<braunr> yes
<youpi> I'm completely for it, it just perhaps needs some cleanup
<braunr> i have a few changes to add to what already exists
<braunr> ok
<braunr> well i'm assigning myself to the task
<antrik> braunr: I'm still not convinced just "device" is preferable
<antrik> perhaps machdevice ;-)
<antrik> but otherwise, I'd LOVE to see it in :-)
<braunr> i don't know .. what if the device is actually eth-multiplexer or
  a dde one ?
<braunr> it's not really "mach", is it ?
<braunr> or do we only refer to the interface ?
<youpi> that translator is only for mach devices
<braunr> so you consider dde devices as being mach devices too ?
<braunr> it's a simple proxy for device_open really
<youpi> will these devices use that translator?
<youpi> ah
<youpi> I thought it was using a mach-specific RPC
<braunr> so we can consider whatever we want
<antrik> braunr: yes, the translator is for Mach device interface only. it
  might be provided by other servers, but it's still Mach devices
<youpi> then drop the mach, yes
<braunr> i'd tend to agree with antrik
<youpi> antrik: I'd say the device interface is part of the hur dinterfaces
<braunr> then machdev :p
<braunr> no, it's really part of the mach interface
<youpi> it's part of the mach interface, yes
<youpi> but also of the Hurd, no?
<antrik> DDE network servers also use the Mach device interface
<braunr> no
<youpi> can't we say it's part of it?
<youpi> I mean
<youpi> even if we change the kernel
<braunr> dde is the only thing that implements it besides the kernel that i
  know of
<youpi> we will probably want to keep the same interface
<braunr> yes but that's a mach thing
<youpi> what we have now is not necessarily a reason
<antrik> as for other DDE drivers, I for my part believe they should export
  proper Hurd (UNIX) device nodes directly... but for some reason zhengda
  insisted on implementing it as Mach devices too :-(
<braunr> antrik: i agree with you on that too
<braunr> i was a bit surprised to see the same interface was reused
<braunr> youpi: we can, we just have to agree on what we'll do
<braunr> what do you mean by "even if we change the kernel" ?
<antrik> the problem with "machdev" is that it might suggest the translator
  actually implements the device... not sure whether this would cause
  serious confusion
<antrik> "devopen" might be another option
<antrik> or "machdevopen" to be entirely verbose ;-)
<braunr> an option i suggested earlier which you disagreed on :p
<braunr> but devopen is the one i'd choose
<antrik> youpi: as I already mentioned in the libburn thread, I don't
  actually think the Mach device interface is very nice; IMHO we should get
  rid of it as soon as we can, rather than port it to other
<antrik> but even *if* we decided to reuse it after all, it would still be
  the Mach device interface :-)
<braunr> actually, zheng da already suggested that name a long time ago
<braunr> no actually antrik did eh
<braunr> ok let's use devopen
<antrik> braunr: you suggested proxy-devopen, which I didn't like because
  of the "proxy" part :-)
<braunr> not only, but i don't have the logs any more :p
<antrik> oh, I already suggested devopen once? didn't expect myself to be
  that consistent... ;-)
<antrik> braunr: you suggested device, devproxy or proxy-devopen
<braunr> ah, ok
<braunr> devopen is better
<antrik> I wonder whether it's more important for clarity to have "mach" in
  there or "open"... or whether it's really too unweildy to have both

IRC, freenode, #hurd, 2012-01-21

<braunr> oh btw, i made devopen run today, it shouldn't be hard getting it
  in properly
<braunr> patching libpcap will be somewhat trickier
<braunr> i don't even really need it, but it allows having user access to
  mach devices, which is nice for the libpcap patch and tcpdump tests
<braunr> permission checking is actually its only purpose
<braunr> well, no, not really, it also allows opening devices implemented
  by user space servers transparently

IRC, freenode, #hurd, 2012-01-27

<braunr> hmm, bpf needs more work :(
<braunr> or we can use the userspace bpf filter in libpcap, so that it
  works with both gnumach and dde drivers
<antrik> braunr: there is a userspace BPF implementation in libpcap? I'm
  surprised that zhengda didn't notice it, and ported the one from gnumach
<antrik> what is missing in the kernel implementation?
<braunr> antrik: filling the bpf header
<braunr> frankly, i'm not sure we want to bother with the kernel
<braunr> i'd like it to work with both gnumach and dde drivers
<braunr> and in the long run, we'll be using userspace drivers anyway
<braunr> the bpf header was one of the things the defunct translator did
<braunr> which involved ugly memcpy()s :p
<antrik> braunr: well, if you want to get rid of the kernel implementation,
  basically you would have to take up eth-multiplexer and get it into
<antrik> (and make sure it's used by default in Debian)
<antrik> I frankly believe it's the better design anyways... but quite a
  major change :-)
<braunr> not that major to me
<braunr> in the meantime i'll use the libpcap embedded implementation
<braunr> we'll have something useful faster, with minimum work when
  eth-multiplexer is available
<antrik> eth-multiplexer is ready for use, it just needs to go upstream
<antrik> though it's probably desirable to switch it to the BPF
  implementation from libpcap
<braunr> using the libpcap implementation in libpcap and in eth-multiplexer
  are two different things
<braunr> the latter is preferrable
<braunr> (and yes, by available, i meant upstream ofc)
<antrik> eth-mulitplexer is already using libpcap anyways (for compiling
  the filters); I'm sure zhengda just didn't realize it has an actual BPF
  implementation too...
<braunr> we want the filter implementation as close to the packet source as
<antrik> I have been using eth-multiplexer for at least two years now
<braunr> hm, there is a "snoop" source type, using raw sockets
<braunr> too far from the packet source, but i'll try it anyway
<braunr> hm wrong, snoop was the solaris packet filter fyi

IRC, freenode, #hurd, 2012-01-28

<braunr> nice, i have tcpdump working :)
<braunr> let's see if it's as simple with wireshark
<pinotree> \o/
<braunr> pinotree: it was actually very simple
<pinotree> heh, POV ;)
<braunr> yep, wireshark works too
<braunr> promiscuous mode is harder to test :/
<braunr> but that's a start

IRC, freenode, #hurd, 2012-01-30

<braunr> ok so next step: get tcpreplay working
<antrik> braunr: BTW, when you checked the status of the kernel BPF code,
  did you take zhengda's enhancements/fixes into account?...
<braunr> no
<braunr> when did i check it ?
<antrik> braunr: well, you said the kernel BPF code has serious
  shortcomings. did you take zhengda's changes into account?
<braunr> antrik: ah, when i mention the issues, i considered the userspace
  translator only
<braunr> antrik: and stuff like non blocking io, exporting a selectable
  file descriptor
<braunr> antrik: deb experimental/
<braunr> antrik: this is my easy to use repository with a patched
<braunr> and a small and unoptimized pcap-hurd.c module
<braunr> it doesn't use devopen yet
<braunr> i thought it would be better to have packet filtering working
  first as a debian patch, then get the new translator+final patch upstream
<jkoenig> braunr, tcpdump works great here (awesome!). I'm probably using
  exactly the same setup and "hardware" as you do, though :-P

IRC, freenode, #hurd, 2012-01-31

<braunr> antrik: i tend to think we need a bpf translator, or anything
  between the kernel and libpcap to provide selectable file descriptors
<braunr> jkoenig: do you happen to know how mach_msg (as called in a
  hello.c file without special macros or options) deals with signals ?
<braunr> i mean, is it wrapped by the libc in a version that sets errno ?
<jkoenig> braunr: no idea.
<pinotree> braunr: what's up with it? (not that i have an idea about your
  actual question, just curious)
<braunr> pinotree: i'm improving signal handling in my pcap-hurd module
<braunr> i guess checking for MACH_RCV_INTERRUPTED will dio
<braunr> -INFO is correctly handled :)
<braunr> ok new patch seems fine
<antrik> braunr: selectable file descriptors?
<braunr> antrik: see pcap_fileno() for example
<braunr> it returns a file descriptor matching the underlying object
  (usually a socket) that can be multiplexed in a select/poll call
<braunr> obviously a mach port alone can't do the job
<braunr> i've upgraded the libpcap0.8 package with improved signal handling
  for tests
<antrik> braunr: no idea what you are talking about :-(

IRC, freenode, #hurd, 2012-02-01

<braunr> antrik: you do know about select/poll
<braunr> antrik: you know they work with multiple selectable/pollable file
<braunr> on most unix systems, packet capture sources are socket
<braunr> they're selectable/pollable
<antrik> braunr: what are packet capture sources?
<braunr> antrik: objects that provide applications with packets :)
<braunr> antrik: a PF_PACKET socket on Linux for example, or a Mach device,
  or a BPF file descriptor on BSD
<antrik> for a single network device? or all of them?
<antrik> AIUI the userspace BPF implementation in libpcap opens this
  device, waits for packets, and if any arrive, decides depending on the
  rules whether to pass them to the main program?
<braunr> antrik: that's it, but it's not the point
<braunr> antrik: the point is that, if programs need to include packet
  sources in select/poll calls, they need file descriptors
<braunr> without a translator, i can't provide that
<braunr> so we either decide to stick with the libpcap patch only, and keep
  this limitation, or we write a translator that enables this feature
<pinotree> braunr: are the two options exclusive?
<braunr> pinotree: unless we implement a complete bpf translator like i did
  years ago, we'll need a patch in libpcap
<braunr> pinotree: the problem with my early translator implementation is
  that it's buggy :(
<braunr> pinotree: and it's also slower, as packets are small enough to be
  passed through raw copies
<antrik> braunr: I'm not sure what you mean when talking about "programs
  including packet sources". programs only interact with packet sources
  through libpcap, right?
<antrik> braunr: or are you saying that programs somehow include file
  descriptors for packet sources (how do they obtain them?) in their main
  loop, and explicitly pass control to libpcap once something arrives on
  the respecitive descriptors?
<braunr> antrik: that's the idea, yes
<antrik> braunr: what is the idea?
<braunr> 20:38 < antrik> braunr: or are you saying that programs somehow
  include file descriptors for packet sources (how do they obtain them?) in
  their main loop, and explicitly pass control to libpcap once something
  arrives on the respecitive descriptors?
<antrik> braunr: you didn't answer my question though :-)
<antrik> braunr: how do programs obtain these FDs?
<braunr> antrik: using pcap_fileno() for example

IRC, freenode, #hurd, 2012-02-02

<antrik> braunr: oh right, you already mentioned that one...
<antrik> braunr: so you want some entity that exposes the device as
  something more POSIXy, so it can be used in standard FS calls, unlike the
  Mach devices used for pfinet
<antrik> this is probably a good sentiment in general... but I'm not in
  favour of a special solution only for BPF. rather I'd take this as an
  indication that we probably should expose network interfaces as something
  file-like in general after all, and adapt pfinet, eth-multiplexer, and
  DDE accordingly
<braunr> antrik: i agree
<braunr> antrik: eth-multiplexer would be the right place

IRC, freenode, #hurd, 2012-04-24

<gnu_srs> braunr: Is  BPF fully supported by now? Can it be used for
<braunr> gnu_srs: bpf isn't supported at all
<braunr> gnu_srs: instead of emulating it, i added a hurd-specific module
  in libpcap
<braunr> if isc-dhcp can use libpcap, then fine
<braunr> (otherwise we could create a hurd-specific patch for dhcp that
  uses the in-kernel bpf filter implementation)
<braunr> gnu_srs: can't it use a raw socket ?
<youpi> it can
<youpi> it's just that the shape of the patch to do so wasn't exactly how
  they needed it
<youpi> so they have to rework it a bit
<youpi> and that takes time
<braunr> ok
<braunr> antrik: for now, we prefer encapsulating the system specific code
  in libpcap, and let users of that library benefit from it
<braunr> instead of implementing the low level bpf interface, which
  nonetheless has some system-specific variants ..

IRC, freenode, #hurd, 2012-08-03

In context of the select issue.

<braunr> i understand now why my bpf translator was so buggy
<braunr> the condition_timedwait i wrote at the time was .. incomplete :)

IRC, freenode, #hurd, 2014-02-04

<teythoon> btw, why is there a bpf filter in gnumach ?
<teythoon> braunr: didn't you put it there ?
<braunr> teythoon: ah yes i did
<braunr> teythoon: i completed the work of a friend
<braunr> teythoon: the original filters in mach were netf filters
<braunr> teythoon: we added bpf so that libpcap could directly upload them
  to the kernel
<braunr> in order to apply filters as close as possible to the packet
  source and save copies
<teythoon> so they were used with the in-kernel network drivers ?
<braunr> only by experimental code and pfinet which sets a
  receive-all-inet4/6 filter
<braunr> i also have a pcap-hurd.c file for libpcap but integration is a
  bit tricky because of netdde
<braunr> maybe i could work on it again some day
<braunr> it should be easy to get into the debian package at least
<teythoon> so they can still be used with a netdde-based driver ?
<braunr> i'm not sure
<braunr> the pcap-hurd.c file i wrote uses the libpcap bpf filter
<teythoon> oh, ok, i misinterpreted what you said wrt netdde
<braunr> the problem caused by netdde is about where to get packets from,
  but devnode should take care of that
<teythoon> did you mean that the integration is tricky b/c when netdde is
  used, a different approach is necessary and that would have to be
  detected at runtime ?
<braunr> something like that
<teythoon> right
<braunr> i didn't want to detect anything
<teythoon> right
<braunr> i was waiting for things to settle but netdde is still debian only
<braunr> but that's ok, this oculd be a debian only patch for now
<teythoon> so is eth-filter the netdde equivalent or am i getting a wrong
  picture here ?
<braunr> i don't know
<teythoon> it seems to implement bpf filters as well
<braunr> it could very well be
<braunr> whatever the driver, pfinet must be able to install a filter
<braunr> even if it's almost a catch-all
<teythoon> i guess it could start a eth-filter and use this, why not
<braunr> sure

IRC, freenode, #hurd, 2014-02-06

<antrik> teythoon: the BPF filter in Mach can also be used by
  eth-multiplexer or eth-filter when running on in-kernel network
  drivers... in fact the implementation was finished by the guy who created
  eth-multiplexer; it was not fully working before
<antrik> it's not useful at all when using netdde I believe
<antrik> teythoon: IIRC eth-filted both relies on BPF being implemented by
  the layer below it (whatever it is) to do the actual filtering, as well
  as implements BPF itself so any layer on top of it can in turn use BPF
<antrik> netdde should provide BPF filters too I'd say... but don't
  remember for sure