The most frequently encountered cause of Hurd crashes currently seems to be lockups in the ext2fs server. One of these was traced recently, and turned out to be caused by a lock inside libdiskfs that was taken but not released in some cases. There is reason to believe that more faulty code paths are causing such lockups.

The task is to systematically check the libdiskfs code for this kind of locking issue. To achieve this, some kind of test harness has to be implemented: for example, the code could be instrumented to check locking correctness constantly at runtime, or a unit testing framework could be implemented that explicitly checks locking in various code paths. (The latter could serve as a template for implementing unit tests in other parts of the Hurd codebase...)
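To make the instrumentation idea more concrete, here is a minimal sketch of what a runtime check might look like. It is only an illustration under stated assumptions: `struct checked_lock`, the `checked_lock_*` functions, `MAX_HELD` and `faulty_operation` are all hypothetical names invented for this example and do not exist in libdiskfs, and plain POSIX mutexes stand in for whatever locking primitives the real code uses. Each thread records the locks it holds, so a check at the point where an operation returns can report any lock that was taken but not released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical wrapper around a mutex; none of these names exist in
   libdiskfs.  */
struct checked_lock
{
  pthread_mutex_t mutex;
  const char *name;		/* used only in diagnostics */
};

#define MAX_HELD 16

/* Per-thread record of the checked locks currently held.  */
static _Thread_local struct checked_lock *held[MAX_HELD];
static _Thread_local int nheld;

static void
checked_lock_init (struct checked_lock *l, const char *name)
{
  pthread_mutex_init (&l->mutex, NULL);
  l->name = name;
}

static void
checked_lock_acquire (struct checked_lock *l)
{
  pthread_mutex_lock (&l->mutex);
  assert (nheld < MAX_HELD);
  held[nheld++] = l;
}

static void
checked_lock_release (struct checked_lock *l)
{
  int i;

  /* Find the lock in this thread's record; releasing a lock that this
     thread does not hold is a bug.  */
  for (i = 0; i < nheld && held[i] != l; i++)
    ;
  assert (i < nheld);
  held[i] = held[--nheld];
  pthread_mutex_unlock (&l->mutex);
}

/* Report every lock the calling thread still holds and return how many
   there are.  Meant to be called where an operation returns.  */
static int
checked_lock_leaks (const char *where)
{
  int i;

  for (i = 0; i < nheld; i++)
    fprintf (stderr, "%s: lock '%s' still held\n", where, held[i]->name);
  return nheld;
}

/* Made-up operation with the kind of bug described above: the error
   path returns without releasing the lock.  */
static int
faulty_operation (struct checked_lock *l, int fail)
{
  checked_lock_acquire (l);
  if (fail)
    return -1;			/* BUG: lock is not released */
  checked_lock_release (l);
  return 0;
}
```

A unit test in the same spirit would call an operation such as `faulty_operation` on both its success and its error path, and then assert that `checked_lock_leaks` returns 0 each time; in this sketch the error path would fail that assertion. A real harness would additionally have to deal with locks that are legitimately passed between functions or held across calls, which this sketch does not attempt.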

(A systematic code review would probably suffice to find the existing locking issues; but it wouldn't document the work in terms of actual code produced, and thus it's not suitable for a GSoC project...)

This task requires experience with debugging locking issues in multithreaded applications.

Tools have been written for automated code analysis; these can help to locate and fix such errors.

Possible mentors: Samuel Thibault (youpi)

Exercise: If you could actually track down and fix one of the existing locking errors before the end of the application process, that would be excellent. This might be rather tough though, so you will probably need to talk to us about an alternative exercise task...