Debuggers and Related Tools

Latest recommended article published 2023-10-31 23:23:56

This content is licensed under CC 4.0 BY-SA.

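This article looks at how to use the gdb debugger at the kernel level, covering its strengths and limitations and how to handle loadable modules. It introduces the kernel core image and shows how to inspect and interpret kernel variables from within gdb.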

The last resort in debugging modules is using a debugger to step through the code, watching the values of variables and machine registers. This approach is time-consuming and should be avoided whenever possible. Nonetheless, the fine-grained view of the code that a debugger provides is sometimes invaluable.

Using an interactive debugger on the kernel is a challenge. The kernel runs in its own address space on behalf of all the processes on the system. As a result, a number of common capabilities provided by user-space debuggers, such as breakpoints and single-stepping, are harder to come by in the kernel. In this section we look at several ways of debugging the kernel; each of them has advantages and disadvantages.

Using gdb

gdb can be quite useful for looking at the system internals. Proficient use of the debugger at this level requires some confidence with gdb commands, some understanding of assembly code for the target platform, and the ability to match source code with optimized assembly.

The debugger must be invoked as though the kernel were an application. In addition to specifying the filename of the ELF kernel image, you need to provide the name of a core file on the command line. For a running kernel, that core file is the kernel core image, /proc/kcore. A typical invocation of gdb looks like the following:

gdb /usr/src/linux/vmlinux /proc/kcore

The first argument is the name of the uncompressed ELF kernel executable, not the zImage or bzImage or anything else built specifically for the boot environment.

The second argument on the gdb command line is the name of the core file. Like any file in /proc, /proc/kcore is generated when it is read. When the read system call executes in the /proc filesystem, it maps to a data-generation function rather than a data-retrieval one; we've already exploited this feature in Section 4.3.1. kcore represents the kernel "executable" in the format of a core file; it is a huge file, because it represents the whole kernel address space, which corresponds to all physical memory. From within gdb, you can look at kernel variables by issuing the standard gdb commands. For example, p jiffies prints the number of clock ticks from system boot to the current time.
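The statement that /proc/kcore is presented in core-file format can be checked by reading the ELF header and looking at its e_type field. The following is a minimal sketch, not part of the original text; the function names elf_type and is_core_file are hypothetical, and reading /proc/kcore normally requires root privileges.

```python
import struct

ET_CORE = 4  # e_type value that the ELF specification assigns to core files


def elf_type(header: bytes) -> int:
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type is a 16-bit field that immediately follows the 16-byte e_ident array
    (e_type,) = struct.unpack_from(fmt, header, 16)
    return e_type


def is_core_file(path: str) -> bool:
    """Report whether the file at 'path' identifies itself as an ELF core file."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_CORE


# Example use (as root): is_core_file("/proc/kcore")
```

Only the header is read, so the check does not touch the rest of the (very large) file.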

When you print data from gdb, the kernel is still running, and the various data items have different values at different times; gdb, however, optimizes access to the core file by caching data that has already been read. If you look at the jiffies variable a second time, you'll get the same answer as before. Caching values to avoid extra disk access is correct behavior for conventional core files but is inconvenient when a "dynamic" core image is used. The solution is to issue the command core-file /proc/kcore whenever you want to flush the gdb cache; the debugger gets ready to use a new core file and discards any old information. You won't, however, always need to issue core-file when reading a new datum; gdb reads the core in chunks of a few kilobytes and caches only the chunks it has already referenced.
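One way to avoid stale readings in a scripted session is to reload the core before every print. The sketch below only builds the gdb command strings and a batch-mode argument list; the helper names fresh_read_commands and gdb_batch_argv are hypothetical. Reloading before every read is a deliberately conservative choice, since, as noted above, it is not always required.

```python
def fresh_read_commands(expressions, core="/proc/kcore"):
    """Return gdb commands that reload the core file before each print."""
    commands = []
    for expr in expressions:
        commands.append(f"core-file {core}")  # makes gdb discard cached core data
        commands.append(f"print {expr}")
    return commands


def gdb_batch_argv(vmlinux, expressions, core="/proc/kcore"):
    """Build an argument list that runs the commands non-interactively."""
    argv = ["gdb", "-batch"]
    for cmd in fresh_read_commands(expressions, core):
        argv += ["-ex", cmd]  # each -ex runs one gdb command after startup
    argv += [vmlinux, core]
    return argv
```

The returned list can be handed to subprocess.run; for example, gdb_batch_argv("/usr/src/linux/vmlinux", ["jiffies"]) describes a run that reloads the core and prints jiffies once.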

Numerous capabilities normally provided by gdb are not available when you are working with the kernel. For example, gdb is not able to modify kernel data; it expects to be running the program being debugged under its own control before playing with its memory image. It is also not possible to set breakpoints or watchpoints, or to single-step through kernel functions.

Note that, in order to have symbol information available for gdb, you must compile your kernel with the CONFIG_DEBUG_INFO option set. The result is a far larger kernel image on disk, but without that information, digging through kernel variables is almost impossible.
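Before starting a session, it can be worth confirming that the kernel was built with this option. The following sketch scans the text of a kernel configuration file; the helper name option_enabled is hypothetical, and the location of the configuration (for instance, the .config file in the build tree that produced vmlinux) is an assumption about your setup.

```python
def option_enabled(config_text, option="CONFIG_DEBUG_INFO"):
    """Return True if 'option' is set to y in kernel configuration text."""
    for line in config_text.splitlines():
        # Enabled built-in options appear as NAME=y; disabled ones appear
        # as a comment of the form "# NAME is not set".
        if line.strip() == f"{option}=y":
            return True
    return False


# Example use, assuming the build tree lives in /usr/src/linux:
#   with open("/usr/src/linux/.config") as f:
#       print(option_enabled(f.read()))
```

The comparison is against the whole line, so a similarly named option that merely begins with the same prefix does not produce a false positive.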

With the debugging information available, you can learn a lot about what is going on inside the kernel. gdb happily prints out structures, follows pointers, and so on. One thing that is harder, however, is examining modules. Since modules are not part of the vmlinux image passed to gdb, the debugger knows nothing about them. Fortunately, as of kernel 2.6.7, it is possible to teach gdb what it needs to know to examine loadable modules.

Linux loadable modules are ELF-format executable images; as such, they are divided into numerous sections. A typical module can contain a dozen or more sections, but there are typically three that are relevant in a debugging session:

.text

This section contains the executable code for the module. The debugger must know where this section is to be able to give tracebacks or set breakpoints. (Neither of these operations is relevant when running the debugger on /proc/kcore, but they can be useful when working with kgdb, described below.)

.bss
.data

These two sections hold the module's variables. Any variable that is not initialized at compile time ends up in .bss, while those that are initialized go into .data.

Making gdb work with loadable modules requires informing the debugger about where a given module's sections have been loaded. That information is available in sysfs, under /sys/module. For example, after loading the scull module, the directory /sys/module/scull/sections contains files with names such as .text; the content of each file is the base address for that section.

We are now in a position to issue a gdb command telling it about our module. The command we need is add-symbol-file; this command takes as parameters the name of the module object file, the .text base address, and a series of optional parameters describing where any other sections of interest have been put. After digging through the module section data in sysfs, we can construct a command such as:

(gdb) add-symbol-file .../scull.ko 0xd0832000 \
        -s .bss 0xd0837100 \
        -s .data 0xd0836be0

We have included a small script in the sample source (gdbline) that can create this command for a given module.
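To illustrate how such a command can be assembled from the sysfs data, here is a minimal Python sketch. It is not the gdbline script from the sample source; the function names read_section_addresses and add_symbol_file_command are hypothetical, and it assumes that each file under the sections directory holds a single hex address, as described above. On some systems these files may be readable only by root.

```python
import os


def read_section_addresses(sections_dir, names=(".text", ".bss", ".data")):
    """Read section base addresses from a /sys/module/<name>/sections directory."""
    addrs = {}
    for name in names:
        path = os.path.join(sections_dir, name)
        if os.path.exists(path):  # a module may lack some of the sections
            with open(path) as f:
                addrs[name] = f.read().strip()
    return addrs


def add_symbol_file_command(module_path, addrs):
    """Build an add-symbol-file command from a {section: address} mapping."""
    if ".text" not in addrs:
        raise KeyError(".text address is required")
    parts = [f"add-symbol-file {module_path} {addrs['.text']}"]
    for name in sorted(addrs):
        if name != ".text":
            parts.append(f"-s {name} {addrs[name]}")
    return " ".join(parts)


# Example use, assuming the scull module is loaded:
#   addrs = read_section_addresses("/sys/module/scull/sections")
#   print(add_symbol_file_command("scull.ko", addrs))
```

The command is emitted on a single line, so it can be pasted at the (gdb) prompt without continuation backslashes.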

We can now use gdb to examine variables in our loadable module. Here is a quick example taken from a scull debugging session:

(gdb) add-symbol-file scull.ko 0xd0832000 \
        -s .bss 0xd0837100 \
        -s .data 0xd0836be0
add symbol table from file "scull.ko" at
        .text_addr = 0xd0832000
        .bss_addr = 0xd0837100
        .data_addr = 0xd0836be0
(y or n) y
Reading symbols from scull.ko...done.
(gdb) p scull_devices[0]
$1 = {data = 0xcfd66c50,
      quantum = 4000,
      qset = 1000,
      size = 20881,
      access_key = 0,
      ...}

Here we see that the first scull device currently holds 20,881 bytes. If we wanted, we could follow the data chain, or look at anything else of interest in the module.

One other useful trick worth knowing about is this:

(gdb) info line *address

Here, fill in a hex address for address; the output is the file and line number of the code corresponding to that address. This technique is useful, for example, for finding out where a function pointer really points.

We still cannot perform typical debugging tasks like setting breakpoints or modifying data; to perform those operations, we need to use a tool like kdb (described next) or kgdb (which we get to shortly).