Performance analysis (Wikipedia article) deals with analyzing how computing resources are used for completing a specified task.
Profiling is one relevant tool.
In a multi-server system, it is non-trivial to implement a high-performance I/O system.
When providing POSIX compatibility (and similar interfaces) in an
environment that doesn't natively implement these interfaces, there may be a
severe performance degradation. For example, in this
Unit testing can be used for tracking performance regressions.
IRC, freenode, #hurd, 2012-07-05
<braunr> the more i study the code, the more i think a lot of time is wasted on cpu, unlike the common belief of the lack of performance being only due to I/O
IRC, freenode, #hurd, 2012-07-23
<braunr> there are several kinds of scalability issues
<braunr> iirc, i found some big locks in core libraries like libpager and libdiskfs
<braunr> but anyway we can live with those
<braunr> in the case i observed, ext2fs, relying on libdiskfs and libpager, scans the entire file list to ask for writebacks, as it can't know if the pages are dirty or not
<braunr> the mistake here is moving part of the pageout policy out of the kernel
<braunr> so it would require the kernel to handle periodic synces of the page cache
<antrik> braunr: as for big locks: considering that we don't have any SMP so far, does it really matter?...
<braunr> antrik: yes
<braunr> we have multithreading
<braunr> there is no reason to block many threads while if most of them could continue
<braunr> -while
<antrik> so that's more about latency than throughput?
<braunr> considering sleeping/waking is expensive, it's also about throughput
<braunr> currently, everything that deals with sleepable locks (both gnumach and the hurd) just wake every thread waiting for an event when the event occurs (there are a few exceptions, but not many)
<antrik> ouch
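The last point in the log above, waking every waiter when an event occurs, is commonly known as the thundering herd problem. The following is a minimal user-space sketch of the effect using Python's `threading.Condition`; it is an analogy only and is not code from gnumach or the Hurd. The function name `count_wakeups` and the `state` dictionary are made up for this illustration. One token is made available to a set of sleeping threads, and the function counts how many of them were woken to compete for it.

```python
import threading


def count_wakeups(n_waiters, wake_all):
    """Offer one token to n_waiters sleeping threads; return how many woke up."""
    lock = threading.Lock()
    event = threading.Condition(lock)  # workers sleep here
    idle = threading.Condition(lock)   # the main thread sleeps here
    state = {"asleep": 0, "tokens": 0, "wakeups": 0, "shutdown": False}

    def worker():
        with lock:
            while True:
                state["asleep"] += 1
                idle.notify()          # tell the main thread we are about to sleep
                event.wait()           # releases the lock while sleeping
                if state["shutdown"]:
                    return
                state["wakeups"] += 1  # this wakeup cost a context switch
                if state["tokens"] > 0:
                    state["tokens"] -= 1  # only one thread finds work to do

    threads = [threading.Thread(target=worker) for _ in range(n_waiters)]
    for t in threads:
        t.start()

    with lock:
        # Wait until every worker is asleep on the event.
        idle.wait_for(lambda: state["asleep"] == n_waiters)
        state["tokens"] = 1
        if wake_all:
            state["asleep"] = 0
            event.notify_all()         # wake every waiter for a single token
        else:
            state["asleep"] -= 1
            event.notify()             # wake exactly one waiter
        # Wait until every woken worker has gone back to sleep.
        idle.wait_for(lambda: state["asleep"] == n_waiters)
        woken = state["wakeups"]
        state["shutdown"] = True
        event.notify_all()
    for t in threads:
        t.join()
    return woken


if __name__ == "__main__":
    print("wake all:", count_wakeups(8, wake_all=True))
    print("wake one:", count_wakeups(8, wake_all=False))
```

With the wake-all strategy, every waiter is scheduled even though only one of them can make progress; the others go straight back to sleep. This wasted sleeping and waking is the throughput cost braunr refers to.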
IRC, freenode, #hurd, 2012-12-04
<damo22> why do some people think hurd is slow? i find it works well even under heavy load inside a virtual machine
<braunr> damo22: the virtual machine actually assists the hurd a lot :p
<braunr> but even with that, the hurd is a slow system
<damo22> i would have thought it would have the potential to be very fast, considering the model of the kernel
<braunr> the design implies by definition more overhead, but the true cause is more than 15 years without optimization on the core components
<braunr> how so ?
<damo22> since there are less layers of code between the hardware bare metal and the application that users run
<braunr> how so ? :)
<braunr> it's the contrary actually
<damo22> VFS -> IPC -> scheduler -> device drivers -> hardware
<damo22> that is monolithic
<braunr> well, it's not really meaningful
<braunr> and i'd say the same applies for a microkernel system
<damo22> if the application can talk directly to hardware through the kernel its almost like plugging directly into the hardware
<braunr> you never talk directly to hardware
<braunr> you talk to servers instead of the kernel
<damo22> ah
<braunr> consider monolithic kernel systems like systems with one big server
<braunr> the kernel
<braunr> whereas a multiserver system is a kernel and many servers
<braunr> you still need the VFS to identify your service (and thus your server)
<braunr> you need much more IPC, since system calls are "replaced" with RPC
<braunr> the scheduler is basically the same
<damo22> okay
<braunr> device drivers are similar too, except they run in thread context (which is usually a bit heavier)
<damo22> but you can do cool things like report when an interrupt line is blocked
<braunr> and there are many context switches between all that
<braunr> you can do all that in a monolithic kernel too, and faster
<braunr> but it's far more elegant, and (when well done) easy to do on a microkernel based system
<damo22> yes
<damo22> i like elegant, makes coding easier if you know the basics
<braunr> there are only two major differences between a monolilthic kernel and a multiserver microkernel system
* damo22 listens
<braunr> 1/ independence of location (your resources could be anywhere)
<braunr> 2/ separation of address spaces (your servers have their own addresses)
<damo22> wow
<braunr> these both imply additional layers of indirection, making the system as a whole slower
<damo22> but it would be far more secure though i suspect
<braunr> yes
<braunr> and reliable
<braunr> that's why systems like qnx were usually adopted for critical tasks
<damo22> security and reliability are very important, i would switch to the hurd if it supported all the hardware i use
<braunr> so would i :)
<braunr> but performance matters too
<damo22> not to me
<braunr> it should :p
<braunr> it really does matter a lot in practice
<damo22> i mean, a 2x slowdown compared to linux would not affect me
<damo22> if it had all the benefits we mentioned above
<braunr> but the hurd is really slow for other reasons than its additional layers of indrection unfortunately
<damo22> is it because of lack of optimisation in the core code?
<braunr> we're working on these issues, but it's not easy and takes a lot of time :p
<damo22> like you said
<braunr> yes
<braunr> and also because of some fundamental design choices related to the microkernel back in the 80s
<damo22> what about the darwin system
<damo22> it uses a mach kernel?
<braunr> yes
<damo22> what is stopping someone taking the MIT code from darwin and creating a monster free OS
<braunr> what for ?
<damo22> because it already has hardware support
<damo22> and a mach kernel
<braunr> in kernel drivers ?
<damo22> it has kernel extensions
<damo22> you can do things like kextload module
<braunr> first, being a mach kernel doesn't make it compatible or even easily usable with the hurd, the interfaces have evolved independantly
<braunr> and second, we really do want more stuff out of the kernel
<braunr> drivers in particular
<damo22> may i ask why you are very keen to have drivers out of kernel?
<braunr> for the same reason we want other system services out of the kernel
<braunr> security, reliability, etc..
<braunr> ease of debugging
<braunr> the ability to restart drivers separately, without restarting the kernel
<damo22> i see
IRC, freenode, #hurd, 2012-09-13
Test Driving GNU Hurd, With Benchmarks Against Linux, Phoronix, Michael Larabel, 2011-07-18.
<braunr> the phoronix benchmarks don't actually test the operating system ..
<hroi_> braunr: well, at least it tests its ability to run programs for those particular tasks
<braunr> exactly, it tests how programs that don't make much use of the operating system run
<braunr> well yes, we can run programs :)
<pinotree> those are just cpu-taking tasks
<hroi_> ok
<pinotree> if you do a benchmark with also i/o, you can see how it is (quite) slower on hurd
<hroi_> perhaps they should have run 10 of those programs in parallel, that would test the kernel multitasking I suppose
<braunr> not even I/O, simply system calls
<braunr> no, multitasking is ok on the hurd
<braunr> and it's very similar to what is done on other systems, which hasn't changed much for a long time
<braunr> (except for multiprocessor)
<braunr> true OS benchmarks measure system calls
<hroi_> ok, so Im sensing the view that the actual OS kernel architecture dont really make that much difference, good software does
<braunr> not at all
<braunr> i'm only saying that the phoronix benchmark results are useless
<braunr> because they didn't measure the right thing
<hroi_> ok
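As a rough illustration of what "measuring system calls" can look like, here is a minimal sketch that times calls which enter the operating system against a call that does not. It is not a benchmark used by the Hurd developers; `ns_per_call` and `noop` are names invented here, and the choice of `os.getppid` and `os.stat` as cheap OS-bound calls is an assumption. Python's interpreter overhead is included in every figure, so only the difference from the baseline is meaningful.

```python
import os
import time


def ns_per_call(func, iterations=100_000):
    """Average wall-clock nanoseconds per call of func()."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        func()
    return (time.perf_counter_ns() - start) / iterations


def noop():
    """Baseline: a Python call that never leaves the process."""
    return None


if __name__ == "__main__":
    print(f"no-op call:   {ns_per_call(noop):.0f} ns")
    print(f"os.getppid:   {ns_per_call(os.getppid):.0f} ns")
    print(f"os.stat('.'): {ns_per_call(lambda: os.stat('.')):.0f} ns")
```

Running the same script on two systems and comparing the gap between the baseline and the OS-bound calls gives a crude view of per-call operating system overhead, which CPU-bound workloads do not exercise.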
Optimizing Data Structure Layout
IRC, freenode, #hurd, 2014-01-02
<braunr> teythoon_: wow, digging into the vm code :)
<teythoon_> i discovered pahole and gnumach was a tempting target :)
<braunr> never heard of pahole :/
<teythoon_> it's nice
<teythoon_> braunr: try pahole -C kmem_cache /boot/gnumach
<teythoon_> on linux that is. ...
<braunr> ok
<teythoon_> braunr: http://paste.debian.net/73864/
<braunr> very nice
IRC, freenode, #hurd, 2014-01-03
<braunr> teythoon: pahole is a very handy tool :)
<teythoon> yes
<teythoon> i especially like how general it is
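pahole reports the padding ("holes") that the compiler inserts between structure members to satisfy alignment. The effect can be sketched without the tool using Python's `ctypes`, which lays out structures according to the platform's C ABI. The `Padded` and `Reordered` structures below are hypothetical and are not taken from gnumach's `kmem_cache`; they only show how member order changes the total size.

```python
import ctypes


class Padded(ctypes.Structure):
    # Hypothetical layout: a large member between two small ones.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("counter", ctypes.c_uint64),
        ("state", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Same members, largest first, so the small ones sit next to each other.
    _fields_ = [
        ("counter", ctypes.c_uint64),
        ("flag", ctypes.c_uint8),
        ("state", ctypes.c_uint8),
    ]


def padding_bytes(struct_type):
    """Total structure size minus the sum of its member sizes."""
    used = sum(getattr(struct_type, name).size for name, _ in struct_type._fields_)
    return ctypes.sizeof(struct_type) - used


def describe(struct_type):
    """Print each member's offset and size, then the size and padding totals."""
    for name, _ in struct_type._fields_:
        field = getattr(struct_type, name)
        print(f"{struct_type.__name__}.{name}: offset {field.offset}, size {field.size}")
    print(f"{struct_type.__name__}: size {ctypes.sizeof(struct_type)}, "
          f"padding {padding_bytes(struct_type)}")


if __name__ == "__main__":
    describe(Padded)
    describe(Reordered)
```

The exact numbers depend on the platform's alignment rules, but placing the widest member first should leave less padding than surrounding it with single-byte members.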
IRC, freenode, #hurd, 2014-02-27
<braunr> tschwinge: about your concern with regard to performance measurements, you could run kvm with hugetlbfs and cpuset
<braunr> on a machine that provides nested page tables, this makes the virtualization overhead as small as it could be considering the implementatoin
<braunr> hugetlbs reduces the overhead of page faults, and also implies locked memory while cpuset isolates the vm from global scheduling
<braunr> hugetlbfs*
<tschwinge> Thanks, will look into that.