Reading Notes on 《Linux高级程序设计》, Part 6

This content is licensed under CC 4.0 BY-SA.

Tags

#Linux #reading #design-patterns #network-applications #C


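This post looks at the kinds of interfaces found in the Linux kernel: the external interfaces between the kernel and user applications, and the kernel's own internal interfaces. It covers system calls, the file abstraction, and kernel event mechanisms, and discusses the role of filesystems such as procfs and sysfs.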

Chapter 8: Kernel Interfaces

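This chapter explains what types of interfaces exist in the Linux kernel and how the kernel interfaces with user-level applications. "Interfaces" here covers both the various interfaces between the kernel and users and the kernel's own internal API.

Knowing them helps in understanding what certain tools do, such as udev (the dynamic device filesystem daemon), and how messages are passed around at the lowest levels of the system.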

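Definition of an interface:
    An interface is a software interface (API) that defines how different parts of a Linux system interact with one another. The legitimate ways to interact with the kernel are making system calls, reading and writing files, and reading and writing sockets.
    Classification: external interfaces between the Linux kernel and user applications, and internal interfaces between the different parts of the kernel itself. The discussion of internal interfaces also covers the lack of a standard kernel ABI and how that affects writing Linux kernel modules.
    The external interfaces between the kernel and applications define a software barrier that certain operations must pass through. An interface provides a defined mechanism for sending requests to the kernel and for returning responses. netlink sockets, for example, provide an interface into the kernel through which you can receive messages, update routing tables, and perform other operations. Interfaces of this kind cannot easily be changed.
    The internal interfaces consist of the set of exported kernel functions made available to loadable kernel modules (publicly exported kernel symbols such as printk()). For example, printk() gives any loadable kernel module that needs to log messages the ability to do so. Once a kernel has been compiled, it contains binary versions of these functions, and every module built afterwards must be built against them.
    Difference from ordinary APIs: the externally accessible interfaces of the Linux kernel cannot be changed lightly, and any change that touches an external interface is very carefully controlled.
    Undefined interfaces: by breaking the rules you can reach interfaces that are not defined. On some platforms, for example, a user application can access physical system memory by hand. Doing so bypasses the kernel's normal handling of whichever device lives at that memory address, so it must be used with care.
    External interfaces: the interfaces that ordinary user applications use to communicate with the Linux kernel. They include system calls, the UNIX file abstraction, and the various modern kernel event mechanisms implemented specifically for Linux over the past few years. You depend entirely on the kernel for your application to work correctly, and equally on the kernel to provide a stable environment through which your driver can communicate with applications.
    A rigid API cannot easily be changed; the replacement of devfs is an example. Some people in fact still use devfs today, although udev is the mainstream choice.
    Further filesystems: the device filesystem, procfs, sysfs, and others. User software depends on all of these interfaces. For example, module-init-tools (modprobe, insmod, rmmod, modinfo, and so on) depends on specific files under /proc and, as time goes by, will depend more and more on files in sysfs. The bottom line: when you add a new kernel interface, plan for it to be used for a long time.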

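The most commonly used kernel interface: system calls
    A typical workstation makes thousands of system calls every second. System calls are the abstraction through which user programs ask the kernel to perform privileged operations on their behalf. These operations need access to the underlying hardware, so they cannot simply be opened up to user programs. Examples: open() a file, or mmap() some anonymous memory (as used by malloc()).
    Many C library functions are just wrappers around the system call that actually performs the requested task. The C library provides an easy way to make system calls and also performs lightweight tasks that need no kernel support. Every system call requires a potentially expensive context switch from user mode to kernel mode, so system throughput improves whenever one can be avoided. Some systems, for example, can obtain a high-resolution timestamp from the CPU without making a system call.
    For a simple program, strace shows the system calls it makes: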

[root@develop dash]# strace ./hello
execve("./hello", ["./hello"], [/* 28 vars */]) = 0
brk(0)                                  = 0x9234000
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f16000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=105377, ...}) = 0
old_mmap(NULL, 105377, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7efc000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\n\3370"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1489572, ...}) = 0
old_mmap(0x2f9000, 1219548, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2f9000
old_mmap(0x41d000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x124000) = 0x41d000
old_mmap(0x421000, 7132, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x421000
close(3)                                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7efb000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7efb6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0x41d000, 8192, PROT_READ)     = 0
mprotect(0x2f5000, 4096, PROT_READ)     = 0
munmap(0xb7efc000, 105377)              = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f15000
write(1, "Hello!\n", 7Hello!
)                 = 7
munmap(0xb7f15000, 4096)                = 0
exit_group(0)                           = ?

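    strace displays every system call made by the application running under it, together with the arguments passed. strace itself uses the ptrace() system call to do the tracing.
    Take open("/lib/libc.so.6", O_RDONLY) = 3 as an example. A stub function makes the appropriate system call on the program's behalf. This confines the platform-specific knowledge of the system call API to a single per-platform part of the C library, instead of duplicating it for every system call.
    open() is implemented in GLIBC, and calling it eventually leads to platform-specific handling. On PowerPC, a system call is made with the special sc processor instruction. On Intel IA32 (x86), the same thing is done by raising software interrupt 0x80 (128). The hardware then transfers control into the kernel, which ends up in the sys_open() function.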

asmlinkage long sys_open(const char __user *filename, int flags, int mode)
{
        long ret;

        if (force_o_largefile())
                flags |= O_LARGEFILE;

        ret = do_sys_open(AT_FDCWD, filename, flags, mode);
        /* avoid REGPARM breakage on x86: */
        asmlinkage_protect(3, ret, filename, flags, mode);
        return ret;
}
EXPORT_SYMBOL_GPL(sys_open);

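As you can see, the real work is done by do_sys_open(). sys_open() is marked with the preprocessor macro asmlinkage, which passes extra linkage information to the GNU C compiler and indicates that the function is called directly from the low-level trap handling code.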

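The system call table
Every Linux architecture has its own way of entering the kernel for a system call. The usual approach involves a special machine instruction that switches the processor into privileged mode and triggers a hardware-assisted or software-assisted lookup of the correct software routine to run. The lookup depends on the system call number, which is normally passed in a processor register when the call is made.

    syscall_table.S, found in arch/i386/kernel or arch/x86/kernel: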

ENTRY(sys_call_table)
        .long sys_restart_syscall       /* 0 - old "setup()" system call, used for restarting */
        .long sys_exit
        .long sys_fork
        .long sys_read
        .long sys_write
        .long sys_open          /* 5 */
        .long sys_close

    As you can see, sys_open has number 5. Numbering starts at 0, so it is the sixth entry in the table (5 + 1 = 6).

    Newer calls, such as those supporting the kexec() interface, are appended to the end of the table.

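    How do you manually invoke a system call defined in the kernel's system call table? This is necessary when the C library does not yet know about a new system call. The best options are to use the simple macros provided by the C library or to rebuild the C library from source.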

/*
 * Manually call the sys_time system call.
 * Jon Masters <jcm@jonmasters.org>
 */

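#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>               /* time_t */
#include <syscall.h>
#include <linux/unistd.h>

//#define __NR_time             13 /* normally defined in <asm/unistd.h> */
#define __NR_mytime             13 /* 13 is sys_time on i386 */
/* _syscall1(return type, name, argument type, argument name) */
_syscall1(long, mytime, time_t *, tloc);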

int main(int argc, char **argv)
{

        time_t current_time;
        long seconds;

        seconds = mytime(&current_time);

        printf("The current time is : %ld.\n",seconds);

        exit(0);

}

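The program above appears to build only against 2.6.1x-era kernel headers: newer kernel headers no longer provide the _syscallN macros to user space.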
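vsyscall optimization: the vsyscall facility in modern kernels lets every user application on most platforms rely on the kernel to pick the most suitable entry mechanism. Put simply, an application using vsyscall has one extra memory-mapped page at the top of its address space, and that page is shared directly between the application and the Linux kernel.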

The device file abstraction

[root@develop syscalls]# ls -l /dev
total 0
crw-rw----  1 root root    14,  12 Jan 12 17:19 adsp
crw-rw----  1 root root    10, 175 Jan 12 17:19 agpgart
crw-------  1 root root    14,   4 Jan 12 17:19 audio
lrwxrwxrwx  1 root root          3 Jan 12 17:19 cdrom -> hda
crw-rw----  1 root root     5,   1 Jan 12 17:19 console
lrwxrwxrwx  1 root root         11 Jan 12 17:19 core -> /proc/kcore
crw-rw----  1 root root    10,  63 Jan 12 17:19 device-mapper
brw-r-----  1 root disk   253,   0 Jan 12 17:19 dm-0
brw-r-----  1 root disk   253,   1 Jan 12 17:19 dm-1

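Through these files you can open a device and use the regular I/O functions to interact with the underlying hardware. Historically /dev was filled with countless files; today udev creates the corresponding device nodes (files) automatically, which cuts down the number of unnecessary device files. Together, udev, D-BUS, and Utopia simplify general system notification.
The audio device lets applications play sound through the system's sound card using the standard Linux audio API. Reading from the device records sound, and writing the data back to the device plays it.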

/*
 * char.c - A simple example character device.
 *
 * Copyright (C) 2006 Jon Masters <jcm@jonmasters.org>
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation.
 *
 */

#include <linux/init.h>
#include <linux/fs.h>
#include <linux/major.h>
#include <linux/blkdev.h>
#include <linux/module.h>
#include <linux/cdev.h>

#include <asm/uaccess.h>

/* function prototypes */

static int char_open(struct inode *inode, struct file *file);
static int char_release(struct inode *inode, struct file *file);
static ssize_t char_read(struct file *file, char __user *buf,
                         size_t count, loff_t *ppos);

/* global variables */

static struct class *plp_class;     /* pretend /sys/class */
static dev_t char_dev;              /* dynamically assigned at registration. */
static struct cdev *char_cdev;      /* dynamically allocated at runtime. */

/* file_operations */

static struct file_operations char_fops = {
        .read    = char_read,
        .open    = char_open,
        .release = char_release,
        .owner   = THIS_MODULE,
};

/*
 * char_open: open the phony char device
 * @inode: the inode of the /dev/char device
 * @file: the in-kernel representation of this opened device
 * Description: This function just logs that the device got
 *              opened. In a real device driver, it would also
 *              handle setting up the hardware for access.
 */

static int char_open(struct inode *inode, struct file *file)
{
        printk(KERN_INFO "char: device file opened.\n");
        return 0;
}

/*
 * char_release: close (release) the phony char device
 * @inode: the inode of the /dev/char device
 * @file: the in-kernel representation of this opened device
 * Description: This function just logs that the device got
 * closed. In a real device driver, it would also handle
 * freeing up any previously used hardware resources.
 */


static int char_release(struct inode *inode, struct file *file)
{
        printk(KERN_INFO "char: device file released.\n");
        return 0;
}

/*
 * char_read: read the phony char device
 * @file: the in-kernel representation of this opened device
 * @buf: the userspace buffer to write into
 * @count: how many bytes to write
 * @ppos: the current file position.
 * Description: This function always returns "hello world"
 * into a userspace buffer (buf). The file position is
 * non-meaningful in this example. In a real driver, you
 * would read from the device and write into the buffer.
 */


static ssize_t char_read(struct file *file, char __user *buf,
                         size_t count, loff_t *ppos)
{
        char payload[] = "hello, world!\n";

        ssize_t payload_size = strlen(payload);

        if (count < payload_size)
                return -EINVAL; /* caller's buffer is too small */

        if (copy_to_user(buf, payload, payload_size))
                return -EFAULT;

        *ppos += payload_size;
        return payload_size;

}

/*
 * char_init: initialize the phony device
 * Description: This function allocates a few resources (a cdev,
 * a device, a sysfs class...) in order to register a new device
 * and populate an entry in sysfs that udev can use to setup a
 * new /dev/char entry for reading from the fake device.
 */


static int __init char_init(void)
{
        if (alloc_chrdev_region(&char_dev, 0, 1, "char"))
                goto error;

        if (0 == (char_cdev = cdev_alloc()))
                goto error;

        kobject_set_name(&char_cdev->kobj,"char_cdev");
        char_cdev->ops = &char_fops; /* wire up file ops */
        if (cdev_add(char_cdev, char_dev, 1)) {
                kobject_put(&char_cdev->kobj);
                unregister_chrdev_region(char_dev, 1);
                goto error;
        }

        plp_class = class_create(THIS_MODULE, "plp");
        if (IS_ERR(plp_class)) {
                printk(KERN_ERR "Error creating PLP class.\n");
                cdev_del(char_cdev);
                unregister_chrdev_region(char_dev, 1);
                goto error;
        }
        class_device_create(plp_class, NULL, char_dev, NULL, "char");

        return 0;


error:
        printk(KERN_ERR "char: could not register device.\n");
        return 1;
}

/*
 * char_exit: uninitialize the phony device
 * Description: This function frees up any resource that got allocated
 * at init time and prepares for the driver to be unloaded.
 */


static void __exit char_exit(void)
{
        class_device_destroy(plp_class, char_dev);
        class_destroy(plp_class);
        cdev_del(char_cdev);
        unregister_chrdev_region(char_dev,1);
}

/* declare init/exit functions here */

module_init(char_init);
module_exit(char_exit);

/* define module meta data */

MODULE_AUTHOR("Jon Masters <jcm@jonmasters.org>");
MODULE_DESCRIPTION("A simple character device driver for a fake device");
MODULE_LICENSE("GPL");

Building:

[root@develop char]# make -C /lib/modules/2.6.24.4/build modules M=$PWD
make: Entering directory `/opt/dash/linux-2.6.24.4'
  CC [M]  /home/yangfei/code/interfaces/char/char.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/yangfei/code/interfaces/char/char.mod.o
  LD [M]  /home/yangfei/code/interfaces/char/char.ko
make: Leaving directory `/opt/dash/linux-2.6.24.4'

Module information:

[root@develop char]# modinfo char.ko
filename:       char.ko
license:        GPL
description:    A simple character device driver for a fake device
author:         Jon Masters <jcm@jonmasters.org>
srcversion:     B07816C66A436C379F5BE7E
depends:
vermagic:       2.6.24.4 mod_unload 686 4KSTACKS

Specifying the install directory:

[root@develop char]# export INSTALL_MOD_DIR=misc
[root@develop char]# make -C /lib/modules/2.6.24.4/build modules modules_install M=$PWD
make: Entering directory `/opt/dash/linux-2.6.24.4'
  LD      /home/yangfei/code/interfaces/char/built-in.o
  Building modules, stage 2.
  MODPOST 1 modules
  INSTALL /home/yangfei/code/interfaces/char/char.ko
  DEPMOD  2.6.24.4
make: Leaving directory `/opt/dash/linux-2.6.24.4'

Loading the module and looking at the output:

$ insmod char.ko
$ lsmod | grep char
$ cat /dev/char
$ dmesg | tail
char: device file opened.
char: device file released.
$ rmmod char

The class created in sysfs is a pseudo-filesystem representation of the driver's data structures:

[root@develop char]# ls /sys/class/plp/char/
dev        subsystem/ uevent
[root@develop char]# cat /sys/class/plp/char/dev
252:0

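Once the driver is loaded, the /sys/class/plp directory is created along with a subdirectory for the char device that the driver provides. After the directory has been created, a message is sent to the system's udev dynamic device daemon, which reads the dev file and creates a new device node named /dev/char with major number 252 and minor number 0. Hard-coded device numbers are no longer needed.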
Loading the module with modprobe or insmod runs char_init(), the function registered through module_init(char_init):

static int __init char_init(void)
{
        /*
         * Try to allocate a new device number. The major number stands for
         * the class of devices this driver supports and will show up on
         * the /dev/char node.
         */
        if (alloc_chrdev_region(&char_dev, 0, 1, "char"))
                goto error;

        /*
         * The cdev structure is the kernel's in-memory representation of a
         * character device and its properties. It embeds a kobject, which
         * reference-counts how many parts of the kernel and how many tasks
         * are using the cdev.
         */
        if (0 == (char_cdev = cdev_alloc()))
                goto error;

        /*
         * Naming the kobject "char_cdev" improves readability and helps
         * later debugging; on older kernels the name also shows up under
         * /sys/cdev.
         */
        kobject_set_name(&char_cdev->kobj,"char_cdev");

        /*
         * The cdev also holds a pointer to the file operations, char_fops,
         * which tie the different kinds of I/O on the device file to this
         * driver: here open, release and read.
         */
        char_cdev->ops = &char_fops; /* wire up file ops */

        /* cdev_add() associates the allocated cdev with the device number. */
        if (cdev_add(char_cdev, char_dev, 1)) {
                kobject_put(&char_cdev->kobj);
                unregister_chrdev_region(char_dev, 1);
                goto error;
        }

        /*
         * Create a new "plp" class under /sys/class/plp to hold all the
         * driver attributes exported through sysfs.
         */
        plp_class = class_create(THIS_MODULE, "plp");
        if (IS_ERR(plp_class)) {
                printk(KERN_ERR "Error creating PLP class.\n");
                cdev_del(char_cdev);
                unregister_chrdev_region(char_dev, 1);
                goto error;
        }

        /*
         * Create the dev file entry; this triggers the user-space udev
         * daemon to create the /dev/char device node.
         */
        class_device_create(plp_class, NULL, char_dev, NULL, "char");

        return 0;

error:
        printk(KERN_ERR "char: could not register device.\n");
        return 1;
}

The cdev structure:

struct cdev {
        struct kobject kobj;
        struct module *owner;
        const struct file_operations *ops;
        struct list_head list;
        dev_t dev;
        unsigned int count;
};

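modprobe -r or rmmod removes the driver from the kernel. char_exit() is responsible for releasing the class, freeing the cdev structure, and unregistering the device number region from the system.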

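Block devices
    In the /dev listing, files whose first column starts with b are block devices. These include any device that can be addressed completely at random, such as a disk drive. A block driver cannot handle user requests directly the way a character driver does. Instead it implements a set of functions that let the kernel perform block-oriented I/O more efficiently. When a user needs to mount a filesystem on a block device, or perform some other block-device activity, the kernel is responsible for presenting that block device to the user.
    To find more examples of disk and block device drivers you need to read a lot of source code, for instance to see how the driver for a disk controller implements the block device abstraction that holds your filesystem.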

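Everything is a file
    Representing every underlying device in a Linux system through the file abstraction is neither practical nor possible. Network devices and the packet filtering stack are the exception in Linux: network devices are the only widely used devices that are not represented through some form of file abstraction. The network stack uses a special interface (netlink) and special system calls to set up and tear down network configuration. libpcap can read raw network packets.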

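Mechanism versus policy
    A basic goal of Linux kernel programmers is to implement mechanism, not policy. The file abstraction cannot express every function that is needed. The ioctl() system call exists for special-purpose control of devices; it was designed to make up for the occasional shortcomings of controlling a device through ordinary file I/O operations.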

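    Traditionally, the proc pseudo-filesystem has been the main command and control interface into the Linux kernel outside of system calls. User applications and tools can use /proc to determine the current system state or to adjust system parameters. All of this is done by reading and writing simple files.
    For example, echo 1 > /proc/sys/net/ipv4/ip_forward enables IP forwarding.
    cat /proc/meminfo && cat /proc/slabinfo shows the current state of the many memory allocation and management functions.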

[root@develop sys]# ls /proc/1
attr  clear_refs  coredump_filter  environ  fd      limits    maps  mounts      oom_adj    root   smaps  statm   task
auxv  cmdline     cwd              exe      fdinfo  loginuid  mem   mountstats  oom_score  sched  stat   status  wchan

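This is the information for the init process: the directory of file descriptors currently in use (fd), the command line used to invoke the process (cmdline), and the process's memory mappings (maps). Reading many of these entries for the init process requires root privileges.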

Using procfs

/*
 * procfs.c - Demonstrate making a file in procfs.
 *
 * Copyright (C) 2006 Jon Masters <jcm@jonmasters.org>
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation.
 *
 */

#include <linux/init.h>
#include <linux/module.h>
#include <linux/proc_fs.h>

/* function prototypes */

static int procfs_read_proc(char *page, char **start, off_t off,
                            int count, int *eof, void *data);
/* global variables */

static struct proc_dir_entry *procfs_file;

/*
 * procfs_read_proc: populate a single page buffer with example data.
 * @page: A single 4K page (on most Linux systems) used as a buffer.
 * @start: beginning of the returned data
 * @off: current offset into proc file
 * @count: amount of data to read
 * @eof: eof marker
 * @data: data passed that was registered earlier
 */

static int procfs_read_proc(char *page, char **start, off_t off,
                            int count, int *eof, void *data)
{

        char payload[] = "hello, world!\n";
        int len = strlen(payload);

        if (count < len)
                return -EINVAL; /* caller's buffer is too small */

        strncpy(page,payload,len);
        *eof = 1;               /* the whole payload fits in one read */

        return len;
}


/*
 * procfs_init: initialize the phony device
 * Description: This function allocates a new procfs entry.
 */


static int __init procfs_init(void)
{

        procfs_file = create_proc_read_entry("plp", 0, NULL,
                                             procfs_read_proc, NULL);

        if (!procfs_file)
                return -ENOMEM;
        return 0;
}

/*
 * procfs_exit: uninitialize the phony device
 * Description: This function frees up the procfs entry.
 */


static void __exit procfs_exit(void)
{

        remove_proc_entry("plp", NULL);
}

/* declare init/exit functions here */

module_init(procfs_init);
module_exit(procfs_exit);

/* define module meta data */

MODULE_AUTHOR("Jon Masters <jcm@jonmasters.org>");
MODULE_DESCRIPTION("A simple driver populating a procfs file");
MODULE_LICENSE("GPL");

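    The header include/linux/proc_fs.h lists the full family of procfs functions, each with a short explanation.
    procfs works in units of pages, which means that when you write data out, you have at most a 4K buffer at any one time.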

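sysfs
    sysfs was originally created as a mechanism for representing the state of the Linux kernel's power management subsystem. Physical devices are represented in a device tree under /sys, so the system can track not only the current state of a device but also its relationships to other devices in the system. This lets Linux determine the order in which various operations must happen.
    It makes the dependencies between different system devices easier to manage.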

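Kernel events
    Kernel events were added to the kernel with the 2.6.10 release. Events take the form of messages that are passed to user space, and each event refers to a specific sysfs path. User space handles the messages according to whatever rules exist. This is normally the job of a daemon such as udev, the dynamic device daemon.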

Bypassing kernel protection
Direct memory mapping:

/*
 * peekpoke.c
 * Jon Masters <jcm@jonmasters.org>
 */

#include <linux/stddef.h>

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define MMAP_FILE "/dev/mem"   /* physical direct. */
#define MMAP_SIZE 4096         /* 4K. */

/* #define DEBUG 1 */

int use_hex = 0;

void display_help(void);
int valid_flag(char *flag_str);
void peek(char *address_str);
void poke(char *address_str, char *value_str);
unsigned long *map_memory(unsigned long address);

void display_help() {

  printf("Usage information: \n"
         "\n"
         "    peekpoke [FLAG] ADDRESS [DATA]\n"
         "\n"
         "Valid Flags: \n"
         "\n"
         "    -x    Use Hexadecimal.\n");
  exit(0);
}

int valid_flag(char *flag_str) {

  if (strncmp(flag_str,"-x",2) == 0) {
    use_hex = 1;
#ifdef DEBUG
    printf("DEBUG: using hexadecimal.\n");
#endif
    return 1;
  }

#ifdef DEBUG
  printf("DEBUG: no valid flags found.\n");
#endif
  return 0;

}

void peek(char *address_str) {

  unsigned long address = 0;

  unsigned long offset = 0;
  unsigned long *mem = 0;

#ifdef DEBUG
  printf("DEBUG: peek(%s).\n",address_str);
#endif

  if (use_hex) {
    sscanf(address_str,"0x%lx",&address);
    /* printf("hexadecimal support is missing.\n"); */
  } else {
    address = atoi(address_str);
  }

#ifdef DEBUG
  printf("DEBUG: address is 0x%lx.\n",address);
#endif

  offset = address - (address & ~4095);
  address = (address & ~4095);

#ifdef DEBUG
  printf("DEBUG: address is 0x%lx.\n",address);
  printf("DEBUG: offset is 0x%lx.\n",offset);
#endif

  mem = map_memory(address);

  /* offset is in bytes, but mem is indexed in unsigned long units */
  printf("0x%lx\n",mem[offset / sizeof(unsigned long)]);

}

void poke(char *address_str, char *value_str) {

  unsigned long address = 0;
  unsigned long value = 0;

  unsigned long offset = 0;
  unsigned long *mem = 0;

#ifdef DEBUG
  printf("DEBUG: poke(%s,%s).\n",address_str,value_str);
#endif

  if (use_hex) {
    sscanf(address_str,"0x%lx",&address);
    sscanf(value_str,"0x%lx",&value);
    /* printf("hexadecimal support is missing.\n"); */
  } else {
    address = atoi(address_str);
    value = atoi(value_str);
  }

#ifdef DEBUG
  printf("DEBUG: address is 0x%lx.\n",address);
  printf("DEBUG: value is 0x%lx.\n",value);
#endif

  offset = address - (address & ~4095);
  address = (address & ~4095);

#ifdef DEBUG
  printf("DEBUG: address is 0x%lx.\n",address);
  printf("DEBUG: offset is 0x%lx.\n",offset);
#endif

  mem = map_memory(address);

  /* offset is in bytes, but mem is indexed in unsigned long units */
  mem[offset / sizeof(unsigned long)] = value;

}

unsigned long *map_memory(unsigned long address) {

  int fd = 0;
  unsigned long *mem = 0;

#ifdef DEBUG
  printf("DEBUG: opening device.\n");
#endif

  if ((fd = open(MMAP_FILE,O_RDWR|O_SYNC)) < 0) {
    printf("Cannot open device file.\n");
    exit(1);
  }

  if (MAP_FAILED == (mem = mmap(NULL, MMAP_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, address))) {
    printf("Cannot map device file.\n");
    exit(1);
  }

  return mem;

} /* map_memory */

int main(int argc, char **argv) {

  /* test we got a sensible invocation. */

  switch(argc) {

  case 0:
    printf("Impossibility Reached.\n");
    exit(1);
  case 1:
    display_help();
    break;
  case 2:
    peek(argv[1]);
    break;
  case 3:
    if (valid_flag(argv[1])) {
      peek(argv[2]);
    } else {
      poke(argv[1],argv[2]);
    }
    break;
  case 4:
    if (valid_flag(argv[1])) {
      poke(argv[2],argv[3]);
    } else {
      printf("Sorry that feature is not supported.\n");
      display_help();
    }
    break;
  default:
    printf("Sorry that option is not supported.\n");
    display_help();
    break;
  }

  exit(0);

} /* main */

Lighting an LED directly (an LED in the GPIO region) by writing to a fictitious control register at 0xe1000000:

$ ./peekpoke -x 0xe1000000 0x1

A blinking LED:

$ while true; do
    ./peekpoke -x 0xe1000000 0x1
    sleep 1
    ./peekpoke -x 0xe1000000 0x0
    sleep 1
done

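The key piece is map_memory(): it takes a physical memory address and tries to map it into the program's virtual address space with mmap(). It opens the device file /dev/mem, named by the MMAP_FILE macro, and then calls mmap() on the new file descriptor to set up the required mapping. peek() and poke() are simple wrappers around reading from and writing to the mapped memory range.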

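Internal API:
EXPORT_SYMBOL and EXPORT_SYMBOL_GPL: the former makes a symbol available to third-party modules, while the latter adds a restriction so that the exported function is not available to modules that do not use a GPL license.
Kernel ABI

ABI stands for application binary interface. Red Hat and Novell have started using a kABI tracking mechanism at the package level of their software packaging systems.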