

7.2 x86_64-efi

Using GDB to debug GRUB2 for the x86_64-efi target has some similarities with the i386-pc target. Please read and familiarize yourself with the i386-pc section before reading this one. Extra care must be taken to run QEMU such that it boots a UEFI firmware. This usually involves either using the ‘-bios’ option with a UEFI firmware blob (e.g. OVMF.fd) or loading the firmware via pflash. This document will not go further into how to do this, as there are ample resources on the web.

Like all EFI implementations, on x86_64-efi the (U)EFI firmware that loads the GRUB2 EFI application determines at runtime where the application will be loaded. This means that we do not know where to tell GDB to load the symbols for the GRUB2 core until the (U)EFI firmware determines it. There are two good ways of figuring this out when running in QEMU: use a debug build of OVMF and check the debug log, or have GRUB2 say where it is loaded. Neither of these is ideal because both generally give the information after GRUB2 is already running, which makes debugging early boot infeasible. Technically, the first method does give the load address before GRUB2 is run, but without debugging the EFI firmware with symbols, the author currently does not know how to cause the OVMF firmware to pause at that point so that the load address can be used before GRUB2 is run.

Even after getting the application load address, the loading of core symbols is complicated by the fact that the debugging symbols for the kernel are in an ELF binary named kernel.exec, while what is in memory are the sections of the PE32+ EFI binary. When grub-mkimage creates the PE32+ binary, it condenses several segments from the ELF kernel binary into one .data section in the PE32+ binary. This must be taken into account to properly load the other non-text sections. Otherwise, GDB will work as expected when breaking on functions, but, for instance, global variables will point to the wrong address in memory and thus give incorrect values (which can be difficult to debug).

Calculating the correct offsets for sections is taken care of automatically when loading the kernel symbols via the user-defined GDB command dynamic_load_kernel_exec_symbols, which takes one argument: the address where the text section is loaded, as determined by one of the methods above. Alternatively, the command dynamic_load_symbols can be called with the text section address as an argument to load the kernel symbols and set up loading of the module symbols as the modules are loaded at runtime.
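To make the idea of per-section addresses concrete, here is a minimal Python sketch that builds a GDB add-symbol-file command with explicit ‘-s section address’ arguments. It is not the calculation that gdb_grub performs. The layout rule (sections packed in order, each rounded up to its alignment), the choice and order of merged sections, and every size and address are assumptions made purely for illustration.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def add_symbol_file_command(elf_path, text_addr, data_addr, merged_sections):
    """Build a GDB add-symbol-file command with explicit section addresses.

    merged_sections lists (name, size, alignment) tuples for the ELF
    sections that are assumed to be packed, in order, into the PE32+
    .data section starting at data_addr.
    """
    parts = ["add-symbol-file", elf_path, hex(text_addr)]
    cursor = data_addr
    for name, size, alignment in merged_sections:
        cursor = align_up(cursor, alignment)
        parts += ["-s", name, hex(cursor)]
        cursor += size
    return " ".join(parts)


# All values below are hypothetical and chosen only for illustration.
command = add_symbol_file_command(
    "kernel.exec",
    text_addr=0x6AEF000,
    data_addr=0x6B0A000,
    merged_sections=[
        (".rodata", 0x1234, 16),
        (".data", 0x800, 32),
        (".bss", 0x2000, 32),
    ],
)
print(command)
```

If only the text address were supplied, GDB would have no way of knowing where the merged sections ended up, which is the situation the user-defined commands above are meant to avoid.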

In the author’s experience, when debugging with QEMU and OVMF, the GRUB2 EFI application must be run via QEMU at least once beforehand to obtain the load address if debugging symbols are to be loaded at the start of GRUB2 execution. Two methods for obtaining the load address are described in two subsections below. Generally speaking, the load address does not change between QEMU runs. There are exceptions to this, namely that different GRUB2 EFI applications can be run at different addresses. Also, it has been observed that after running the EFI application for the first time, the second run will sometimes have a different load address, but subsequent runs of the same EFI application will have the same load address as the second run. And it is a near certainty that if the GRUB EFI binary has changed, e.g. been recompiled, the load address will also be different.

This ability to predict what the load address will be allows one to assume the load address on subsequent runs and thus load the symbols before GRUB2 starts. The following command illustrates this, assuming that QEMU is running and waiting for a debugger connection and the current working directory is where gdb_grub resides:

gdb -x gdb_grub -ex 'dynamic_load_symbols address of .text section'

If you load the symbols in this manner and, after continuing execution, do not see output showing the module symbols loading, then it is very likely that the load address was incorrect.
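When the load address from a previous run is reused often, the command line shown above can be assembled by a small helper. The Python sketch below only formats the command; the function name gdb_command_line and the address 0x6aef000 are hypothetical, and the address must be replaced with the one actually observed for your GRUB2 EFI binary.

```python
import shlex


def gdb_command_line(text_addr, script="gdb_grub"):
    """Return the shell command that starts GDB and loads GRUB2 symbols."""
    load_cmd = f"dynamic_load_symbols {text_addr:#x}"
    # shlex.join quotes the -ex argument so the shell passes it as one word.
    return shlex.join(["gdb", "-x", script, "-ex", load_cmd])


# 0x6aef000 is a made-up address; use the one seen on a prior run.
print(gdb_command_line(0x6AEF000))
```

The printed line has the same shape as the command above, with the placeholder replaced by a hexadecimal address.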

Another thing to be aware of is how the loading of the GRUB image by the firmware affects previously set software breakpoints. On x86 platforms, GDB implements software breakpoints by writing a special processor instruction at the location of the desired breakpoint. When executed, this special instruction stops program execution and hands control to the debugger, GDB. GDB first saves the instruction bytes that are overwritten at the breakpoint and puts them back when the breakpoint is hit. If GRUB is being run for the first time in QEMU, the firmware will be loading the GRUB image into memory where every byte is already set to 0. This means that if a breakpoint is set before GRUB is loaded, GDB will save the 0-byte(s) where the special instruction will go. Then, when the firmware loads the GRUB image, because it is unaware of the debugger, it writes the GRUB image to memory, overwriting anything that was there previously, notably in this case the instruction that implements the software breakpoint. This is confusing for the person using GDB because GDB will show the breakpoint as set, but the breakpoint will never be hit. Furthermore, GDB then becomes confused, such that even deleting and recreating the breakpoint will not create usable breakpoints. The gdb_grub script takes care of this by saving the breakpoints just before they are overwritten and then restoring them at the start of GRUB execution. So breakpoints for GRUB can be set before GRUB is loaded, but be mindful of this effect if you are confused as to why breakpoints are not getting hit.
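The sequence of events described above can be reproduced with a toy model. The Python sketch below is not how GDB or gdb_grub is implemented. The ToyDebugger class, the 16-byte memory, and the image bytes are all hypothetical stand-ins; the only real detail is that 0xCC is the one-byte x86 breakpoint instruction (int3).

```python
INT3 = 0xCC  # x86 one-byte software breakpoint instruction


class ToyDebugger:
    """Toy model of a debugger that patches memory to set breakpoints."""

    def __init__(self, memory):
        self.memory = memory
        self.saved = {}  # breakpoint address -> byte that was overwritten

    def set_breakpoint(self, addr):
        self.saved[addr] = self.memory[addr]  # remember the original byte
        self.memory[addr] = INT3              # patch in the breakpoint

    def believes_set(self, addr):
        return addr in self.saved

    def actually_armed(self, addr):
        return self.memory[addr] == INT3


memory = bytearray(16)                   # zero-filled RAM before GRUB loads
image = bytes([0x55, 0x48, 0x89, 0xE5])  # stand-in for GRUB code bytes
load_addr = 4

dbg = ToyDebugger(memory)
dbg.set_breakpoint(load_addr)            # set before the firmware loads GRUB
saved_before_load = dbg.saved[load_addr]

# The firmware, unaware of the debugger, copies the image over the patch.
memory[load_addr:load_addr + len(image)] = image

print(dbg.believes_set(load_addr))       # debugger still lists the breakpoint
print(dbg.actually_armed(load_addr))     # but the int3 byte is gone
print(saved_before_load)                 # and the saved byte is a stale 0

# Re-arming after the load saves the real instruction byte instead.
dbg.set_breakpoint(load_addr)
```

In this model the three print calls output True, False, and 0: the debugger still lists the breakpoint, the patched byte has been replaced by image code, and the byte saved earlier is a zero rather than a real instruction. Re-arming after the image is in place, as the last line does, leaves the breakpoint armed with the correct original byte saved.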

Also note that hardware breakpoints do not suffer from this problem. They are implemented by storing the breakpoint address in special debug registers on the CPU, so they can always be set freely without regard to whether GRUB has been loaded or not. The reason that hardware breakpoints aren’t always used is that there are a limited number of them, usually around 4 on various CPUs, and exactly 4 on x86 CPUs. The gdb_grub script goes out of its way to avoid using hardware breakpoints internally and, when it does need them, uses them as briefly as possible, thus allowing the user to have a maximal number at their disposal.

