Summer of Code projects for GNU

This page has the project suggestions for GNU's participation in Google Summer of Code 2015. (Project proposals for 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, and 2014 are archived.)

STUDENTS - BEFORE YOU SUBMIT YOUR PROJECT PROPOSAL:

Please read the GNU Project's guidelines for Summer of Code projects.

Most importantly, please make sure you include all the information requested. If you have questions, please ask summer-of-code@gnu.org (list info here).

Please note that you are not bound to use these ideas; you can propose a new project. It is a good idea to find a mentor and discuss the idea before submitting it.


Project suggestions

GNU is a large and complex project, and thus is subdivided into packages, which are relatively independent projects. In Summer of Code, GNU acts as an umbrella organization for its packages. The ideas here are grouped by package. Many packages have more than one suggestion, or even their own ideas page.

GNU Classpath| GNU dmd| Gettext| gnucap| GNUnet| GNUstep| GNU Guix| Hurd| LibreDWG| Mediagoblin| Kawa| Octave| Wget| GNU XaoS| GNU Zile

GNU Classpath

GNU Classpath maintains their list of ideas for GSOC in an external webpage: http://icedtea.classpath.org/wiki/GSoC2015.

GNU dmd

GNU dmd is the init system used by GuixSD, GNU's advanced distribution.

Syntax and semantics of systemd units in GNU dmd

GNU dmd has a Scheme interface to define services, most importantly their dependencies and the actions associated with them. The goal of this project is twofold. The first part consists of matching the semantics of a subset of systemd's .service unit files (for instance, the DBus-related part may be omitted). As part of this work, dmd should be extended with cgroup support on systems that support it. dmd should also be extended with a procedure to load a .service file and return a <service> object.

The second part will consist in implementing other types of units, in particular .device.
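As a rough illustration of the "load a .service file and return a service object" step, here is a minimal sketch in Python rather than in dmd's Scheme. The Service class and the load_service function are hypothetical names invented for this example, not part of dmd, and only two unit keys (Requires and ExecStart) are handled.

```python
import configparser
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical stand-in for dmd's <service> object."""
    name: str
    start: str
    requires: list = field(default_factory=list)


def load_service(name, text):
    """Parse the text of a .service unit and return a Service."""
    # interpolation=None keeps '%' specifiers literal; strict=False
    # tolerates a key that appears more than once (the last one wins,
    # which is a simplification made by this sketch).
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the case of keys such as ExecStart
    parser.read_string(text)
    requires = parser.get("Unit", "Requires", fallback="").split()
    start = parser.get("Service", "ExecStart", fallback="")
    return Service(name=name, start=start, requires=requires)


UNIT = """\
[Unit]
Description=Example daemon
Requires=network.target syslog.service

[Service]
ExecStart=/usr/bin/exampled --foreground
"""

service = load_service("exampled", UNIT)
print(service)
```

A real implementation would also need to cover the remaining keys of the chosen subset and map them onto dmd's own notions of dependencies and actions.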

Mentor: Ludovic Courtès

Gettext

Extensible XML support through Internationalization Tag Set (ITS)

The xgettext command supports string extraction from several XML-based file formats (for GTK+, GLib, etc). The scanners are currently implemented in C using the SAX interface provided by Expat, and are not flexible enough to support emerging file formats.

The project aims to let consumer packages supply their own string extraction rules, which will be loaded at run-time. This is analogous to local macros used by aclocal, but it could adopt the Internationalization Tag Set (ITS) standard instead of Autoconf macros.

The first step would be to create a C library that parses and evaluates ITS rules, possibly using libxml2. It should be small enough to be bundled into gettext, while extensible enough to allow a new data category to be easily implemented.
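To make the idea of evaluating ITS rules concrete, here is a small sketch in Python (the project itself targets C). It handles only the Translate data category through its:translateRule elements. The sample rules and document are invented for this example, and the selectors are restricted to the relative path subset that Python's ElementTree understands, whereas real ITS rules use full XPath expressions such as //label.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Invented sample rules: labels are translatable, ids are not.
RULES = """\
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <its:translateRule selector=".//label" translate="yes"/>
  <its:translateRule selector=".//id" translate="no"/>
</its:rules>
"""

# Invented sample document in a made-up UI description format.
DOCUMENT = """\
<dialog>
  <button><id>ok-button</id><label>OK</label></button>
  <button><id>cancel-button</id><label>Cancel</label></button>
</dialog>
"""


def extract_strings(rules_xml, document_xml):
    """Return the text of every element marked translatable by the rules."""
    rules = ET.fromstring(rules_xml)
    root = ET.fromstring(document_xml)
    translatable = {}
    for rule in rules.findall(f"{{{ITS_NS}}}translateRule"):
        flag = rule.get("translate") == "yes"
        for element in root.findall(rule.get("selector")):
            translatable[element] = flag  # a later rule overrides an earlier one
    # Walk the document in order so that strings keep their source order.
    return [el.text for el in root.iter() if translatable.get(el)]


print(extract_strings(RULES, DOCUMENT))
```

A new data category would then amount to another rule element and another per-element annotation, which is the kind of extensibility the library is meant to offer.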

The next step would be integrating the library into the gettext tools, particularly xgettext. Optionally, it would also be good to extend msgfmt so that it can merge translations back into the original XML file.

Contact: bug-gettext@gnu.org

GNUnet

Implementation of additional transports

Implementation of additional transports to make GNUnet communication more robust in the presence of problematic networks: GNUnet-over-SMTP, GNUnet-over-DNS

Mentors: Matthias Wachs

Implementation of ALG-based NAT traversal methods

Implementation of ALG-based NAT traversal methods (FTP/SIP-based hole punching, better STUN support)

Mentors: Matthias Wachs

Integration of the GNU Name System with GnuPG

Mentors: Matthias Wachs, Christian Grothoff, Jeff Burdges

libaboss improvements

Improving libaboss to make computation on shared secrets (including repeated multiplication) based on Ben-Or et al. if possible. This in particular means moving libaboss to bignums (gcry_mpi).

Mentors: Krista Grothoff, Jeff Burdges

Implementation of a replacement for PANDA

Implementation of a replacement for PANDA (see Pond) with better security, and maybe integration with the GNU Name System for key exchange.

Mentors: Jeff Burdges

Supporting GNU Guix's package distribution

Please refer to the description for this project listed under GNU Guix project ideas.

Gnucap

Gnucap maintains their list of ideas for GSOC in an external webpage: http://gnucap.org/dokuwiki/doku.php?id=gnucap:projects.

GNUstep

GNUstep is a framework for developing console, desktop and web applications in Objective-C. It is based on the OPENSTEP specification and today aims for compatibility with Apple's Cocoa set of frameworks. GNUstep consists of gnustep-base (classes for strings, arrays, dictionaries, timers, sockets, etc.), gnustep-gui (classes for windows, buttons, text boxes, etc.), gnustep-make (a build system), as well as an assortment of development utilities and bonus libraries.

Improve Core Animation implementation and integrate it into AppKit

During Summer of Code 2012, Core Animation was implemented for GNUstep. During Summer of Code 2013, a Core Graphics backend was implemented for GNUstep using our library Opal. In order to improve compatibility with Cocoa, as well as make it easier to implement modern-looking applications for GNUstep, a student should integrate CALayer with NSView and improve Core Animation where required.

This would also make it possible to use Chameleon, an implementation of UIKit, with GNUstep.

Contact: discuss-gnustep@gnu.org, gnustep-dev@gnu.org

Improve Core Animation implementation and implement UIKit

During Summer of Code 2012, Core Animation was implemented for GNUstep. During Summer of Code 2013, a Core Graphics backend was implemented for GNUstep using our library Opal. In order to attract more developers to free platforms, as well as expand the availability of touch-enabled applications, a student should create a UIKit-compatible user interface library and improve the Core Animation implementation where necessary.

Contact: discuss-gnustep@gnu.org, gnustep-dev@gnu.org

Note that the GNUstep project is open to other ideas from students. Please contact them if you have any.

GNU Guix

GNU Guix is a purely functional package manager for the GNU system. The Guix System Distribution is GNU's advanced distribution.

Porting Guix to GNU/Hurd

GNU Guix currently supports building packages for GNU/Linux only. The goal of this project would be to allow it to cross-build and build packages for GNU/Hurd, and to provide a virtual machine image that boots into such a system.

This would involve packaging Mach/Hurd/MiG/libc, allowing cross-compilation to GNU/Hurd, cross-compiling the “bootstrap binaries” for GNU/Hurd, and then working towards support for GNU/Hurd in the (gnu system) Guix modules. This last point would allow a VM image of the complete system to be built.

Mentor: Ludovic Courtès

Implementing a DHCP client in Guile Scheme

The goal of this project is to write a DHCP client library in Guile Scheme. The library will then be usable as a dmd service, thereby providing better integration.

Mentor: Ludovic Courtès

Linux container support

GNU Guix currently supports the installation of GuixSD on virtual machines and physical hosts through its guix system command. The goal of this project would be to add another installation target: containers. A container is an environment that is similar to a virtual machine but without the overhead that comes with running a separate kernel and simulating hardware. Containers are isolated on a host system through Linux's control groups and kernel namespaces.

Mentor: David Thompson

Supporting binary package distribution through GNUnet

GNU Guix provides a transparent binary/source deployment model. A server can claim: “hey, I have the binary for /gnu/store/v9zic07iar8w90zcy398r745w78a7lqs-emacs-24.4!”, where the base32 string uniquely identifies a build process. If you trust that server to provide genuine binaries, then you can grab them instead of building Emacs locally. This mechanism is called substitution.
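The lookup described above can be sketched in a few lines of Python. This is only an illustration: parse_store_path and find_substitute are hypothetical helpers, the servers mapping stands in for whatever a real substituter would query, and the pattern simply assumes a 32-character lowercase alphanumeric hash as in the example path.

```python
import re

# Assumed shape of a store path: /gnu/store/<32-char hash>-<name>
STORE_PATH = re.compile(r"^/gnu/store/(?P<hash>[0-9a-z]{32})-(?P<name>.+)$")


def parse_store_path(path):
    """Split a store path into its hash part and its package name part."""
    match = STORE_PATH.match(path)
    if match is None:
        raise ValueError(f"not a store path: {path}")
    return match.group("hash"), match.group("name")


def find_substitute(path, servers):
    """Return the first server claiming a binary for path, or None.

    servers is a hypothetical mapping from a server name to the set of
    hashes that server claims to have binaries for.
    """
    wanted, _ = parse_store_path(path)
    for server, hashes in servers.items():
        if wanted in hashes:
            return server
    return None


EMACS = "/gnu/store/v9zic07iar8w90zcy398r745w78a7lqs-emacs-24.4"
SERVERS = {"build-farm.example.org": {"v9zic07iar8w90zcy398r745w78a7lqs"}}

print(parse_store_path(EMACS))
print(find_substitute(EMACS, SERVERS))
```

Whether to trust the server that answers is a separate question, and it is the one this project changes by replacing a single build farm with many peers.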

The “traditional model” has been to have a build farm build and serve binary packages over HTTP. In that model, users trust the build farm to provide authentic binaries.

The project aims to provide a practical decentralized distribution mechanism for binary packages, using GNUnet’s networking layers. In that model, users would be able to automatically share binaries they have built locally, and to install binaries built by other users. This is part of a broader goal of disintermediation among users, and between users and upstream software developers.

Deliverables include a substituter that uses GNUnet as its back-end, and a tool to publish build results.

Mentors: Sree Harsha Totakura, Bart Polot

GNU Hurd

The GNU Hurd is the GNU project's replacement for the Unix kernel. It is a collection of servers that run on the Mach microkernel to implement file systems, network protocols, file access control, and other features that are implemented by the Unix kernel or similar kernels (such as Linux).

Contact: bug-hurd@gnu.org

The Hurd project maintains its GSoC ideas on a separate page.

LibreDWG

GNU LibreDWG is a free C library to handle DWG files. It aims to be a free replacement for the OpenDWG libraries. DWG is the native file format of AutoCAD.

Automated test suite

Build an automated test suite for LibreDWG. The test suite should have alive tests and also unit tests (testing read and write for each object), compare test outputs with expected values, etc.

Contact: libredwg@gnu.org

3D Solid decoding support

Currently LibreDWG is only able to decode 3D solids partially. The solids in DWG are encoded in the SAT and SAB formats, used by the ACIS 3D modeling kernel. There are no free implementations of this kernel, which means that the SAT and SAB streams that we are able to extract are useless unless parsed. Once parsed, the solids must be converted to other openly documented formats, and properly rendered and worked with in free software tools. It is not yet clear whether the conversion belongs to the scope of this idea, since there aren't any known {sat,sab}2something free software converters. In any case, if your application sheds some light on this issue, whether or not it implements the converter, that would be a plus.

Contact: libredwg@gnu.org

DWG write support

LibreDWG currently supports DWG versions R13, R14, R2000 and R2004 (R2007 is on the way) but only for reading. Some write operations for entities and objects are already written, and there is a very basic write framework. However, the headers and the overall file structure are not written yet. Write support is almost evil, but still needed, since there is no well-established free CAD format, and we don't want people to leave free CAD applications because they can't send their work back to DWG-only-CAD users.

Contact: libredwg@gnu.org

LibreDWG Python Bindings Rewrite

LibreDWG exposes its internals through basic bindings generated with SWIG, which are incomplete and not suitable for use in development. The Python bindings need to be rewritten in some other way, and they should also give access to the LibreDWG API from Python.

Contact: libredwg@gnu.org

GNU Mediagoblin

GNU Mediagoblin maintains their list of ideas for GSOC in an external webpage: https://wiki.mediagoblin.org/GSOC_2015.

Kawa

Kawa is best known as a fast Scheme implementation for the Java platform. It compiles Scheme to optimized Java bytecodes. It is also a general framework for implementing dynamic languages, and includes a full implementation of XQuery 1.0 and incomplete implementations of Common Lisp and Emacs Lisp (JEmacs).

Contact: kawa@sourceware.org

Kawa maintains a list of ideas here: http://www.gnu.org/software/kawa/Ideas-and-tasks.html.

Octave

Octave maintains a list of ideas here: http://wiki.octave.org/Summer_of_Code_Project_Ideas.

Wget

GNU Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols. It is a non-interactive command-line tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc. More details on how to get started with wget and GSoC can be found here: https://github.com/darnir/wget/wiki/GSoC-2015

FTP Server for Test Suite

Brief Explanation: GNU Wget requires an FTP server implementation that implements all the relevant features from RFC 959 for testing Wget as an FTP client. While we have a couple of ideas on how this can be done, the student is also expected to come up with their own ideas.

Expected Results: At the end of the project, we expect a functioning FTP test suite that integrates with the already existing set of Python 3 based HTTP tests. We should be able to test Wget not only against standards-compliant responses but against erroneous responses as well.

Knowledge Prerequisites: This will largely depend on the specific path chosen by the student. However, we expect that a basic knowledge of Python and C will be required along with the ability to understand technical documentation.

Speed up Wget's Download Mechanism

Brief Explanation: This project requires the student to implement two different, but small features to GNU Wget which will eventually help in reducing the time taken for recursive downloads in Wget. The two features are:
1. If-Modified-Since headers: Currently, when a file already exists on disk, Wget first sends an HTTP HEAD request and, based on the response, sends a second HTTP GET request to the server. By sending an "If-Modified-Since" header with the GET request, this can be reduced to a single GET request. A good starting point for this is RFC 7232 section 3.3.
2. TCP Fast Open: RFC 7413 describes a mechanism to reduce the number of Round Trips required to open a TCP Connection. This has been implemented in the Linux Networking Stack and since a large number of web servers are hosted on Linux systems, Wget may be able to get better performance during small file transfers or on connections with a high Round Trip Time.
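The first feature can be sketched as follows, in Python rather than in Wget's C. The function names are hypothetical and not taken from Wget. The idea is that the client puts the timestamp of its local copy into the GET request, and the server replies 304 Not Modified with no body when the file has not changed since then.

```python
import os
import tempfile
from email.utils import formatdate


def conditional_headers(local_path):
    """Return extra request headers for a GET of a file we may already have."""
    headers = {}
    if os.path.exists(local_path):
        mtime = os.path.getmtime(local_path)
        # HTTP dates are always expressed in GMT.
        headers["If-Modified-Since"] = formatdate(mtime, usegmt=True)
    return headers


def needs_rewrite(status_code):
    """A 304 Not Modified reply means the local copy is still current."""
    return status_code != 304


# Demonstration with a local file whose timestamp is set to the Unix epoch.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
os.utime(demo_path, (0, 0))
demo_headers = conditional_headers(demo_path)
os.remove(demo_path)
print(demo_headers)
```

With this in place a single round trip is enough in both cases: either the server sends the new body, or it confirms that the existing file can be kept.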

Expected Results: At the end of the project, we should have support for the "if-modified-since" headers and TFO in Wget. Simultaneously, the relevant server side extensions need to be made to the test suite and tests written for the above features.

Knowledge Prerequisites: The student will have to be comfortable reading and understanding technical documentation and implementing what it describes in C. A rudimentary knowledge of Python is desired but not mandatory.

Improve Wget's Security

Brief Explanation: This project deals with improving Wget's security. It is composed of three smaller sub-projects which may act as milestones for the student:
1. HTTP Strict Transport Security (HSTS): This HTTP header extension is described by RFC 6797. It is a way for the server to instruct the client to use HTTPS for certain domains irrespective of what the user requested.
2. HTTP Secure Cookie Management: RFC 6265 states that a server may mark a cookie as "secure", in which case a User-Agent (UA) should send the cookie back to the server if and only if the connection to the server is over a secure transport. Currently, Wget ignores the secure cookie field and always sends all cookies back to the server.
3. FTPS: FTPS is an extension of the FTP protocol that runs over secure SSL/TLS connections. It is not to be confused with SFTP, which is an FTP-like protocol over SSH2. Wget already implements FTP and SSL/TLS separately. This project would require understanding how FTPS works and implementing the required changes in Wget. A good starting point is RFC 2228 and RFC 4217.
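The first two sub-projects come down to small decisions that can be sketched in Python (Wget itself is written in C). The function names and the tuple layout used for cookies are assumptions made for this example only. The header parsed here is Strict-Transport-Security, with its max-age and includeSubDomains directives.

```python
def parse_hsts(header_value):
    """Parse a Strict-Transport-Security value into (max_age, include_subdomains)."""
    max_age = None
    include_subdomains = False
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            # The value may be written as a quoted string.
            max_age = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            include_subdomains = True
    return max_age, include_subdomains


def cookies_to_send(cookies, scheme):
    """Filter (name, value, secure) tuples for a request over the given scheme.

    A cookie marked secure is only returned over a secure transport.
    """
    return [(name, value) for name, value, secure in cookies
            if not secure or scheme == "https"]


COOKIES = [("theme", "dark", False), ("session", "abc123", True)]

print(parse_hsts("max-age=31536000; includeSubDomains"))
print(cookies_to_send(COOKIES, "http"))
print(cookies_to_send(COOKIES, "https"))
```

A complete HSTS implementation would additionally store the parsed policy per host with its expiry time and rewrite http URLs for known hosts before connecting.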

Expected Results: By the end of the project, Wget should understand the HSTS requests and follow the directives of the local HSTS database. It should also obey the "secure" cookie parameter in HTTP responses. Finally, we should be able to use Wget to download a file via FTP over a secure connection. For HSTS and secure cookie management, the relevant test cases are also expected.

Knowledge Prerequisites: A good understanding of C will be required for this project since it deals with some of the deepest portions of Wget's source code. Apart from that, some amount of Python will be helpful for writing the test cases, but it is not mandatory.

Contact: bug-wget@gnu.org (to subscribe, see the list-info page).
Mentors: Giuseppe Scrivano, Darshit Shah, Tim Ruehsen

GNU XaoS

GNU XaoS maintains their list of ideas for GSOC in an external webpage: http://matek.hu/xaos/doku.php?id=development:main.

Zile

GNU Zile (short for "Zile Implements Lua Editors") is a toolkit for building editors. Zile has all of Emacs's basic editing features: it is 8-bit clean (though it currently lacks Unicode support), and the number of editing buffers and windows is only limited by available memory and screen space respectively. Syntax coloring, registers, minibuffer completion and auto-fill are available. Function and variable names are identical with Emacs's.

Zile currently comes with two editors: Zmacs, which as the name suggests emulates Emacs, and Zz, which does not support any form of Lisp, and is Emacs-inspired rather than a strict clone.

In approximate order of increasing difficulty, here are some projects that would help improve Zile:

Add Unicode support

For Zile to be useful for building serious editors, it needs Unicode support. Emacs's interface to encoding systems could be used as a model.  There are already a few Unicode helper libraries for Lua, which should serve as the basis for a Unicode support module for Zile.

BDD/TDD

Zile has a legacy test suite for Emacs lisp compatibility, and a new infrastructure for a more comprehensive Behaviour Driven Development suite.  The legacy tests need to be moved to the new infrastructure, and then the existing modules should be formally specified with new BDD examples.  This is a great opportunity to get some practice working with TDD tools in a real project.

Idiomatic Lua

Lua Zile is still mostly a line-by-line translation of the earlier C Zile implementation, and as such fails to make the most of what the Lua language offers to simplify and shorten high level code, while being too liberal with the global namespace and not keeping proper cohesion and coupling between the potentially self-contained parts of the code.  Refactoring is already well underway, but there's a lot of scope for re-engineering and refactoring the rest of the code base to support easier reuse.

Syntax Highlighting

An earlier fork of Lua Zile provides the proof of concept for a barely-fast-enough restartable syntax parser and highlighting engine based on regexps and a state-machine.  The new combined multi-editor tree needs to be refactored to allow merging the syntax highlighting module.  If time allows, the restartable regexp parser would benefit hugely from being rewritten using LPEG (Parsing Expression Grammars), and parsers for languages other than Lua and C could be added.

Nested Buffers

The syntax highlighting proof of concept is engineered to allow different parts of a buffer to be highlighted separately, for instance an HTML file with embedded PHP, CSS and JavaScript.  This concept should be expanded to allow different keymaps to be active for certain parts of a buffer, for instance a shell interaction buffer that recognizes compiler error messages and in that part of the buffer treats an ENTER keypress as jumping to the file and line of the error message.  If time allows, this feature could be improved to display parts of other buffers inside the current buffer, highlighted appropriately, in this case injecting the text of just the function surrounding the error message directly into the current buffer, highlighting it correctly according to the language it is written in, and then saving it back correctly to the appropriate file and removing the injected text from the current buffer.

Splay Ropes Buffers

Right now, Zile provides only a single buffer-gap based buffer implementation - as used by GNU Emacs.  A more comprehensive Editor Building Toolkit really needs some alternative data-structures for editor builders to select, such as a Splay Ropes data structure.  Other sensible data-structures as alternatives or additions to buffer-gap would also be welcome.
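For readers unfamiliar with the buffer-gap technique mentioned above, here is a minimal sketch of it in Python (Zile itself is written in Lua, and this is not Zile's code). The text is stored in one array with a movable hole at the editing position, so that inserting or deleting near the hole is cheap. A splay rope would offer the same insert and delete operations on top of a tree of text fragments instead.

```python
class GapBuffer:
    """Text stored as one array with a movable gap at the editing position."""

    def __init__(self, text="", gap_size=16):
        self.data = list(text) + [None] * gap_size
        self.gap_start = len(text)      # first free slot
        self.gap_end = len(self.data)   # first used slot after the gap

    def __len__(self):
        return len(self.data) - (self.gap_end - self.gap_start)

    def _move_gap(self, position):
        # Shift characters across the gap until it starts at position.
        while self.gap_start > position:
            self.gap_start -= 1
            self.gap_end -= 1
            self.data[self.gap_end] = self.data[self.gap_start]
        while self.gap_start < position:
            self.data[self.gap_start] = self.data[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, position, char):
        if not 0 <= position <= len(self):
            raise IndexError("insert position out of range")
        if self.gap_start == self.gap_end:
            # The gap is full: open a fresh one where the old gap was.
            self.data[self.gap_start:self.gap_start] = [None] * 16
            self.gap_end += 16
        self._move_gap(position)
        self.data[self.gap_start] = char
        self.gap_start += 1

    def delete(self, position):
        if not 0 <= position < len(self):
            raise IndexError("delete position out of range")
        self._move_gap(position)
        self.gap_end += 1  # the deleted character is absorbed by the gap

    def text(self):
        return "".join(self.data[:self.gap_start] + self.data[self.gap_end:])


buffer = GapBuffer("helo")
buffer.insert(3, "l")
print(buffer.text())
```

The cost of an edit is proportional to how far the gap has to move, which is why alternative structures become attractive for very large buffers or for edits scattered across a file.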

Zile Lisp

The Lisp engine used to simulate a tiny subset of zmacs' elisp emulation has seen dramatic improvements in recent months, and already supports many features not available in the earlier C Zile project, such as macros, eval/apply, a REPL; as well as better elisp compatibility to provide minimal features for the BDD specs.  There is a huge amount of scope for improving the Lisp engine itself, adding a faster Cons/List implementation and the like, and also for adding better elisp compatibility to facilitate porting more of the zmacs implementation itself to lisp.

Zz

Zz, the other editor currently shipped with Zile, aims to leverage the already existing and well proven UNIX tools, as compared to Zmacs which wants to be a Lisp shell and IDE in the vein of Emacs.  The current implementation has only gone so far in that direction as to replace Lisp with Lua.  Much more work is required to replace remaining Emacs-like features with implementations that leverage the system tools.  Much inspiration can be gained from Sam and ACME on Plan 9; Wily on Linux; and TextMate on Mac OS X... all of which integrate with the operating system userspace rather than reimplement equivalent features in the editor itself.

Zi

Zile currently provides a Lua implementation of a micro-emacs, called Zmacs.  To improve the flexibility of the new Lua Zile frameworks, the components should be refactored and assembled into an alternative modal, vi-like editor -- which we'll call Zi.

Graphical Toolkit

Some care was taken to maintain separation of concerns between the user interface (keyboard input and buffer redisplay) and the internals of Zile-based editors, but with only a single curses user interface it's hard to be sure that the abstractions are sufficiently clean.  Adding support for an alternative GUI using GTK+, Qt or even the Mac or Windows native toolkits, along with whatever refactorings are required to support that, would make using Zile to implement future editors with different I/O requirements much easier.

More Editors

In addition to Zmacs and Zz in the current Zile release, plus Zi on the horizon, there is also a fork of the old single editor tree called Zee, a much more minimal editor, which should be merged back in as an alternative editor example in the new multi-editor tree.  Further, implementations of Nano, or WordStar, or some ancient DOS editor could be added.  Much of the work in adding an editor will be to refactor the code on the boundaries between the Zile objects and files and the existing editors to facilitate easily writing new editors, while keeping the existing ones working correctly.

If you're interested in portable software engineering and re-engineering, Emacs or other text editors, or traditional UNIX tools and their future incarnations, you should find a project to interest you.

