Linux Device Drivers, 2nd Edition
By Alessandro Rubini & Jonathan Corbet
2nd Edition June 2001
0-59600-008-1, Order Number: 0081
586 pages, $39.95
Chapter 2
Building and Running Modules
Contents:
Kernel Modules Versus Applications
Compiling and Loading
The Kernel Symbol Table
Initialization and Shutdown
Using Resources
Automatic and Manual Configuration
Doing It in User Space
Backward Compatibility
Quick Reference
It's high time now to begin programming. This chapter introduces all the essential concepts about modules and kernel programming. In these few pages, we build and run a complete module. Developing such expertise is an essential foundation for any kind of modularized driver. To avoid throwing in too many concepts at once, this chapter talks only about modules, without referring to any specific device class.
All the kernel items (functions, variables, header files, and macros) that are introduced here are described in a reference section at the end of the chapter.
For the impatient reader, the following code is a complete "Hello, World" module (which does nothing in particular). This code will compile and run under Linux kernel versions 2.0 through 2.4.[4]
[4]This example, and all the others presented in this book, is available on the O'Reilly FTP site, as explained in Chapter 1, "An Introduction to Device Drivers".
    #define MODULE
    #include <linux/module.h>

    int init_module(void)
    {
        printk("<1>Hello, world\n");
        return 0;
    }

    void cleanup_module(void)
    {
        printk("<1>Goodbye cruel world\n");
    }
The printk function is defined in the Linux kernel and behaves similarly to the standard C library function printf. The kernel needs its own printing function because it runs by itself, without the help of the C library. The module can call printk because, after insmod has loaded it, the module is linked to the kernel and can access the kernel's public symbols (functions and variables, as detailed in the next section). The string <1> is the priority of the message. We've specified a high priority (low cardinal number) in this module because a message with the default priority might not show on the console, depending on the kernel version you are running, the version of the klogd daemon, and your configuration. You can ignore this issue for now; we'll explain it in the section "printk" in Chapter 4, "Debugging Techniques".
You can test the module by calling insmod and rmmod, as shown in the screen dump in the following paragraph. Note that only the superuser can load and unload a module.
[5]If you are new to building kernels, Alessandro has posted an article at http://www.linux.it/kerneldocs/kconf that should help you get started.
    root# gcc -c hello.c
    root# insmod ./hello.o
    Hello, world
    root# rmmod hello
    Goodbye cruel world
    root#
According to the mechanism your system uses to deliver the message lines, your output may be different. In particular, the previous screen dump was taken from a text console; if you are running insmod and rmmod from an xterm, you won't see anything on your TTY. Instead, it may go to one of the system log files, such as /var/log/messages (the name of the actual file varies between Linux distributions). The mechanism used to deliver kernel messages is described in "How Messages Get Logged" in Chapter 4, "Debugging Techniques".
As you can see, writing a module is not as difficult as you might expect. The hard part is understanding your device and how to maximize performance. We'll go deeper into modularization throughout this chapter and leave device-specific issues to later chapters.
Kernel Modules Versus Applications
Before we go further, it's worth underlining the various differences between a kernel module and an application.
Whereas an application performs a single task from beginning to end, a module registers itself in order to serve future requests, and its "main" function terminates immediately. In other words, the task of the function init_module (the module's entry point) is to prepare for later invocation of the module's functions; it's as though the module were saying, "Here I am, and this is what I can do." The second entry point of a module, cleanup_module, gets invoked just before the module is unloaded. It should tell the kernel, "I'm not there anymore; don't ask me to do anything else." The ability to unload a module is one of the features of modularization that you'll most appreciate, because it helps cut down development time; you can test successive versions of your new driver without going through the lengthy shutdown/reboot cycle each time.
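The register-then-serve pattern described above can be modeled in ordinary user-space C. The following is only a sketch: the handler table, register_handler, unregister_handler, dispatch, and the "echo" module are hypothetical names invented for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a kernel registration table. */
typedef int (*request_fn)(int arg);

#define MAX_HANDLERS 4

static struct {
    const char *name;
    request_fn  fn;              /* NULL marks a free slot */
} handlers[MAX_HANDLERS];

/* Returns 0 on success, -1 if the table is full. */
int register_handler(const char *name, request_fn fn)
{
    assert(name != NULL && fn != NULL);
    for (size_t i = 0; i < MAX_HANDLERS; i++) {
        if (handlers[i].fn == NULL) {
            handlers[i].name = name;
            handlers[i].fn = fn;
            return 0;
        }
    }
    return -1;
}

void unregister_handler(const char *name)
{
    for (size_t i = 0; i < MAX_HANDLERS; i++) {
        if (handlers[i].fn != NULL && strcmp(handlers[i].name, name) == 0) {
            handlers[i].name = NULL;
            handlers[i].fn = NULL;
        }
    }
}

/* The "kernel" side: route a request, or return -1 if nobody serves it. */
int dispatch(const char *name, int arg)
{
    for (size_t i = 0; i < MAX_HANDLERS; i++)
        if (handlers[i].fn != NULL && strcmp(handlers[i].name, name) == 0)
            return handlers[i].fn(arg);
    return -1;
}

/* The "module" side: the entry point only registers and returns at once. */
static int echo_double(int arg) { return 2 * arg; }

int  echo_init(void)    { return register_handler("echo", echo_double); }
void echo_cleanup(void) { unregister_handler("echo"); }
```

Calling dispatch("echo", 21) fails before echo_init has run, succeeds afterward, and fails again once echo_cleanup has removed the handler, which mirrors the "Here I am" and "I'm not there anymore" messages in the text.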
[6] The implementation found in Linux 2.0 and 2.2 has no support for the L and Z qualifiers. They have been introduced in 2.4, though.
Figure 2-1 shows how function calls and function pointers are used in a module to add new functionality to a running kernel.
Figure 2-1. Linking a module to the kernel
Because no library is linked to modules, source files should never include the usual header files. Only functions that are actually part of the kernel itself may be used in kernel modules. Anything related to the kernel is declared in headers found in include/linux and include/asm inside the kernel sources (usually found in /usr/src/linux). Older distributions (based on libc version 5 or earlier) used to carry symbolic links from /usr/include/linux and /usr/include/asm to the actual kernel sources, so your libc include tree could refer to the headers of the actual kernel source you had installed. These symbolic links made it convenient for user-space applications to include kernel header files, which they occasionally need to do.
Developers working on any large software system (such as the kernel) must be aware of and avoid namespace pollution. Namespace pollution is what happens when there are many functions and global variables whose names aren't meaningful enough to be easily distinguished. The programmer who is forced to deal with such an application expends much mental energy just to remember the "reserved" names and to find unique names for new symbols. Namespace collisions can create problems ranging from module loading failures to bizarre failures -- which, perhaps, only happen to a remote user of your code who builds a kernel with a different set of configuration options.
Developers can't afford to fall into such an error when writing kernel code because even the smallest module will be linked to the whole kernel. The best approach for preventing namespace pollution is to declare all your symbols as static and to use a prefix that is unique within the kernel for the symbols you leave global. Also note that you, as a module writer, can control the external visibility of your symbols, as described in "The Kernel Symbol Table" later in this chapter.[7]
[7] Most versions of insmod (but not all of them) export all non-static symbols if they find no specific instruction in the module; that's why it's wise to declare as static all the symbols you are not willing to export.
Using the chosen prefix for private symbols within the module may be a good practice as well, as it may simplify debugging. While testing your driver, you could export all the symbols without polluting your namespace. Prefixes used in the kernel are, by convention, all lowercase, and we'll stick to the same convention.
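As a small illustration of this advice, the following sketch keeps its state and helper static and gives every symbol, private or global, the same lowercase prefix. All the names are hypothetical and the file is plain C, not a real driver.

```c
#include <assert.h>

/* Private state: static, so the name never reaches the global namespace. */
static int skull_open_count;

/* Private helper: also static, but still prefixed to ease debugging. */
static int skull_bump(void)
{
    return ++skull_open_count;
}

/* Deliberately global symbols: the unique prefix keeps them from
 * colliding with names defined elsewhere. */
int skull_open(void)
{
    return skull_bump();
}

void skull_release(void)
{
    assert(skull_open_count > 0);   /* a release must match an earlier open */
    --skull_open_count;
}

int skull_users(void)
{
    return skull_open_count;
}
```

Only skull_open, skull_release, and skull_users are visible to other object files; the counter and the helper can be renamed freely without affecting anyone else.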
The last difference between kernel programming and application programming is in how each environment handles faults: whereas a segmentation fault is harmless during application development and a debugger can always be used to trace the error to the problem in the source code, a kernel fault is fatal at least for the current process, if not for the whole system. We'll see how to trace kernel errors in Chapter 4, "Debugging Techniques", in the section "Debugging System Faults".
User Space and Kernel Space
A module runs in the so-called kernel space, whereas applications run in user space. This concept is at the base of operating systems theory.
Every modern processor is able to enforce this behavior. The chosen approach is to implement different operating modalities (or levels) in the CPU itself. The levels have different roles, and some operations are disallowed at the lower levels; program code can switch from one level to another only through a limited number of gates. Unix systems are designed to take advantage of this hardware feature, using two such levels. All current processors have at least two protection levels, and some, like the x86 family, have more levels; when several levels exist, the highest and lowest levels are used. Under Unix, the kernel executes in the highest level (also called supervisor mode), where everything is allowed, whereas applications execute in the lowest level (the so-called user mode), where the processor regulates direct access to hardware and unauthorized access to memory.
Concurrency in the Kernel
One way in which device driver programming differs greatly from (most) application programming is the issue of concurrency. An application typically runs sequentially, from the beginning to the end, without any need to worry about what else might be happening to change its environment. Kernel code does not run in such a simple world and must be written with the idea that many things can be happening at once.
As a result, Linux kernel code, including driver code, must be reentrant -- it must be capable of running in more than one context at the same time. Data structures must be carefully designed to keep multiple threads of execution separate, and the code must take care to access shared data in ways that prevent corruption of the data. Writing code that handles concurrency and avoids race conditions (situations in which an unfortunate order of execution causes undesirable behavior) requires thought and can be tricky. Every sample driver in this book has been written with concurrency in mind, and we will explain the techniques we use as we come to them.
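To make the idea of reentrancy concrete, here is a user-space sketch contrasting a function that keeps its result in a shared static buffer with one that writes into storage supplied by the caller. The function names are invented for this example; the point is only that the second form stays correct when two contexts use it at once, because they never share the buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Not reentrant: every caller shares the same static buffer, so a second
 * call silently overwrites the result of the first. */
const char *label_shared(int id)
{
    static char buf[16];

    snprintf(buf, sizeof buf, "dev%d", id);
    return buf;
}

/* Reentrant: the caller owns the storage, so concurrent contexts that
 * pass different buffers cannot interfere with each other. */
const char *label_reentrant(int id, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "dev%d", id);
    return buf;
}
```

With label_shared, two successive calls return the same pointer and only the last label survives; with label_reentrant, each caller keeps its own result.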
A common mistake made by driver programmers is to assume that concurrency is not a problem as long as a particular segment of code does not go to sleep (or "block"). It is true that the Linux kernel is nonpreemptive; with the important exception of servicing interrupts, it will not take the processor away from kernel code that does not yield willingly. In past times, this nonpreemptive behavior was enough to prevent unwanted concurrency most of the time. On SMP systems, however, preemption is not required to cause concurrent execution.
The Current Process
Although kernel modules don't execute sequentially as applications do, most actions performed by the kernel are related to a specific process. Kernel code can know the current process driving it by accessing the global item current, a pointer to struct task_struct, which as of version 2.4 of the kernel is declared in <asm/current.h>, included by <linux/sched.h>. The current pointer refers to the user process currently executing. During the execution of a system call, such as open or read, the current process is the one that invoked the call. Kernel code can use process-specific information by using current, if it needs to do so. An example of this technique is presented in "Access Control on a Device File", in Chapter 5, "Enhanced Char Driver Operations".
For example, the following statement prints the command name and process ID of the current process:
    printk("The process is \"%s\" (pid %i)\n", current->comm, current->pid);
The command name stored in current->comm is the base name of the program file that is being executed by the current process.
Compiling and Loading
[8]We use the word local here to denote personal changes to the system, in the good old Unix tradition of /usr/local.
Before we deal with the roles of init_module and cleanup_module, however, we'll write a makefile that builds object code that the kernel can load.
First, we need to define the __KERNEL__ symbol in the preprocessor before we include any headers. As mentioned earlier, much of the kernel-specific content in the kernel headers is unavailable without this symbol.
If you are compiling for an SMP machine, you also need to define __SMP__ before including the kernel headers. In version 2.2, the "multiprocessor or uniprocessor" choice was promoted to a proper configuration item, so using these lines as the very first lines of your modules will do the task:
    #include <linux/config.h>
    #ifdef CONFIG_SMP
    #  define __SMP__
    #endif
A module writer must also specify the -O flag to the compiler, because many functions are declared as inline in the header files. gcc doesn't expand inline functions unless optimization is enabled, but it can accept both the -g and -O options, allowing you to debug code that uses inline functions.[9] Because the kernel makes extensive use of inline functions, it is important that they be expanded properly.
[9] Note, however, that using any optimization greater than -O2 is risky, because the compiler might inline functions that are not declared as inline in the source. This may be a problem with kernel code, because some functions expect to find a standard stack layout when they are called.
Finally, in order to prevent unpleasant errors, we suggest that you use the -Wall (all warnings) compiler flag, and also that you fix all features in your code that cause compiler warnings, even if this requires changing your usual programming style. When writing kernel code, the preferred coding style is undoubtedly Linus's own style. Documentation/CodingStyle is amusing reading and a mandatory lesson for anyone interested in kernel hacking.
All the definitions and flags we have introduced so far are best located within the CFLAGS variable used by make.
    # Change it here or specify it on the "make" command line
    KERNELDIR = /usr/src/linux

    include $(KERNELDIR)/.config

    CFLAGS = -D__KERNEL__ -DMODULE -I$(KERNELDIR)/include \
       -O -Wall

    ifdef CONFIG_SMP
      CFLAGS += -D__SMP__ -DSMP
    endif

    all: skull.o

    skull.o: skull_init.o skull_clean.o
    	$(LD) -r $^ -o $@

    clean:
    	rm -f *.o *~ core
After the module is built, the next step is loading it into the kernel. As we've already suggested, insmod does the job for you. The program is like ld, in that it links any unresolved symbol in the module to the symbol table of the running kernel. Unlike the linker, however, it doesn't modify the disk file, but rather an in-memory copy. insmod accepts a number of command-line options (for details, see the manpage), and it can assign values to integer and string variables in your module before linking it to the current kernel. Thus, if a module is correctly designed, it can be configured at load time; load-time configuration gives the user more flexibility than compile-time configuration, which is still used sometimes. Load-time configuration is explained in "Automatic and Manual Configuration" later in this chapter.
Interested readers may want to look at how the kernel supports insmod: it relies on a few system calls defined in kernel/module.c. The function sys_create_module allocates kernel memory to hold a module (this memory is allocated with vmalloc; see "vmalloc and Friends" in Chapter 7, "Getting Hold of Memory"). The system call get_kernel_syms returns the kernel symbol table so that kernel references in the module can be resolved, and sys_init_module copies the relocated object code to kernel space and calls the module's initialization function.
Version Dependency
Bear in mind that your module's code has to be recompiled for each version of the kernel that it will be linked to. Each module defines a symbol called __module_kernel_version, which insmod matches against the version number of the current kernel. This symbol is placed in the .modinfo Executable Linking and Format (ELF) section, as explained in detail in Chapter 11, "kmod and Advanced Modularization". Please note that this description of the internals applies only to versions 2.2 and 2.4 of the kernel; Linux 2.0 did the same job in a different way.
The compiler will define the symbol for you whenever you include <linux/module.h> (that's why hello.c earlier didn't need to declare it). This also means that if your module is made up of multiple source files, you have to include <linux/module.h> from only one of your source files (unless you use __NO_VERSION__, which we'll introduce in a while).
In case of version mismatch, you can still try to load a module against a different kernel version by specifying the -f ("force") switch to insmod, but this operation isn't safe and can fail. It's also difficult to tell in advance what will happen. Loading can fail because of mismatching symbols, in which case you'll get an error message, or it can fail because of an internal change in the kernel. If that happens, you'll get serious errors at runtime and possibly a system panic -- a good reason to be wary of version mismatches. Version mismatches can be handled more gracefully by using versioning in the kernel (a topic that is more advanced and is introduced in "Version Control in Modules" in Chapter 11, "kmod and Advanced Modularization").
If you want to compile your module for a particular kernel version, you have to include the specific header files for that kernel (for example, by declaring a different KERNELDIR) in the makefile given previously. This situation is not uncommon when playing with the kernel sources, as most of the time you'll end up with several versions of the source tree. All of the sample modules accompanying this book use the KERNELDIR variable to point to the correct kernel sources; it can be set in your environment or passed on the command line of make.
Sometimes, you'll encounter kernel interfaces that behave differently between versions 2.0.x and 2.4.x of Linux. In this case you'll need to resort to the macros defining the version number of the current source tree, which are defined in the header <linux/version.h>. We will point out cases where interfaces have changed as we come to them, either within the chapter or in a specific section about version dependencies at the end, to avoid complicating a 2.4-specific discussion.
The header, automatically included by linux/module.h, defines the following macros:
UTS_RELEASE
The macro expands to a string describing the version of this kernel tree. For example, "2.3.48".
LINUX_VERSION_CODE
The macro expands to the binary representation of the kernel version, one byte for each part of the version release number. For example, the code for 2.3.48 is 131888 (i.e., 0x020330).[10] With this information, you can (almost) easily determine what version of the kernel you are dealing with.
[10] This allows up to 256 development versions between stable versions.
KERNEL_VERSION(major,minor,release)
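The byte-per-component encoding can be checked with a short user-space sketch. The KERNEL_VERSION macro is redefined locally with the same arithmetic used by <linux/version.h> so that the example builds outside the kernel tree, and LINUX_VERSION_CODE is pinned to 2.3.48 purely as an assumption for the demonstration; version_code and has_24_interface are hypothetical helper names.

```c
#include <assert.h>

/* Same arithmetic as the kernel header, redefined here so that the
 * example compiles in user space. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption for this sketch: pretend the source tree is 2.3.48. */
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 3, 48)

/* Builds the integer code at run time; minor and release occupy one
 * byte each, so they must stay below 256. */
int version_code(int major, int minor, int release)
{
    assert(minor >= 0 && minor < 256 && release >= 0 && release < 256);
    return KERNEL_VERSION(major, minor, release);
}

/* Typical use: pick an interface at compile time by comparing the tree's
 * version code with a known checkpoint. */
int has_24_interface(void)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 4, 0)
    return 1;
#else
    return 0;
#endif
}
```

Because the code is a plain integer, version checks reduce to ordinary comparisons, which is what makes conditional compilation against a checkpoint release practical.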
    VERSIONFILE = $(INCLUDEDIR)/linux/version.h
    VERSION     = $(shell awk -F\" '/REL/ {print $$2}' $(VERSIONFILE))
    INSTALLDIR  = /lib/modules/$(VERSION)/misc
We chose to install all of our drivers in the misc directory; this is both the right choice for miscellaneous add-ons and a good way to avoid dealing with the change in the directory structure under /lib/modules that was introduced right before version 2.4 of the kernel was released. Even though the new directory structure is more complicated, the misc directory is used by both old and new versions of the modutils package.
With the definition of INSTALLDIR just given, the install rule of each makefile, then, is laid out like this:
    install:
    	install -d $(INSTALLDIR)
    	install -c $(OBJS) $(INSTALLDIR)
Platform Dependency
Each computer platform has its peculiarities, and kernel designers are free to exploit all the peculiarities to achieve better performance in the target object file.
The SPARC architecture is a special case that must be handled by the makefiles. User-space programs running on the SPARC64 (SPARC V9) platform are the same binaries you run on SPARC32 (SPARC V8). Therefore, the default compiler running on SPARC64 (gcc) generates SPARC32 object code. The kernel, on the other hand, must run SPARC V9 object code, so a cross compiler is needed. All GNU/Linux distributions for SPARC64 include a suitable cross compiler, which the makefiles select.
The Kernel Symbol Table
We've seen how insmod resolves undefined symbols against the table of public kernel symbols. The table contains the addresses of global kernel items -- functions and variables -- that are needed to implement modularized drivers. The public symbol table can be read in text form from the file /proc/ksyms (assuming, of course, that your kernel has support for the /proc filesystem -- which it really should).
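Conceptually, resolving a module's undefined symbols is a lookup of names in a table of addresses. The following user-space sketch models only that idea: the table contents (including the addresses) are invented, and resolve_symbol and count_unresolved are hypothetical helpers, not the interface insmod actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the public symbol table: name -> address.
 * The addresses are made up for illustration. */
struct ksym {
    const char   *name;
    unsigned long addr;
};

static const struct ksym table[] = {
    { "printk",         0xc0112a40UL },
    { "kmalloc",        0xc0128e10UL },
    { "request_region", 0xc011b3f0UL },
};

/* Returns the address bound to name, or 0 if the symbol is not exported. */
unsigned long resolve_symbol(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].addr;
    return 0;
}

/* Counts the undefined symbols that cannot be resolved; in this model a
 * loader would accept the module only when the result is 0. */
size_t count_unresolved(const char *const *undefined, size_t n)
{
    size_t missing = 0;

    for (size_t i = 0; i < n; i++)
        if (resolve_symbol(undefined[i]) == 0)
            missing++;
    return missing;
}
```

A module that references only printk and kmalloc resolves completely in this model, whereas one that references a name absent from the table is reported as having an unresolved symbol.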
New modules can use symbols exported by your module, and you can stack new modules on top of other modules. Module stacking is implemented in the mainstream kernel sources as well: the msdos filesystem relies on symbols exported by the fat module, and each input USB device module stacks on the usbcore and input modules.
Module stacking is useful in complex projects. If a new abstraction is implemented in the form of a device driver, it might offer a plug for hardware-specific implementations. For example, the video-for-linux set of drivers is split into a generic module that exports symbols used by lower-level device drivers for specific hardware. According to your setup, you load the generic video module and the specific module for your installed hardware. Support for parallel ports and the wide variety of attachable devices is handled in the same way, as is the USB kernel subsystem. Stacking in the parallel port subsystem is shown in Figure 2-2; the arrows show the communications between the modules (with some example functions and data structures) and with the kernel programming interface.
Figure 2-2. Stacking of parallel port driver modules
When using stacked modules, it is helpful to be aware of the modprobe utility. modprobe functions in much the same way as insmod, but it also loads any other modules that are required by the module you want to load. Thus, one modprobe command can sometimes replace several invocations of insmod (although you'll still need insmod when loading your own modules from the current directory, because modprobe only looks in the tree of installed modules).
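The "load prerequisites first" behavior can be sketched as a recursive walk over a dependency table. This is a toy model in user space: the table is hard-coded with the msdos/fat pair mentioned earlier, each module has at most one prerequisite, and load_order is a hypothetical helper rather than anything modprobe exposes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical dependency table, limited to one prerequisite per module. */
struct moddep {
    const char *name;
    const char *needs;       /* NULL when the module stands alone */
};

static const struct moddep deps[] = {
    { "fat",   NULL  },
    { "msdos", "fat" },
};

/* Appends the load order for name to the string in out, prerequisites
 * first, each name followed by a space. Returns 0 on success, or -1 if
 * the module is unknown or out is too small. */
int load_order(const char *name, char *out, size_t len)
{
    assert(name != NULL && out != NULL);
    for (size_t i = 0; i < sizeof deps / sizeof deps[0]; i++) {
        size_t used;

        if (strcmp(deps[i].name, name) != 0)
            continue;
        if (deps[i].needs && load_order(deps[i].needs, out, len) != 0)
            return -1;
        used = strlen(out);
        if (used + strlen(name) + 2 > len)   /* name + space + NUL */
            return -1;
        snprintf(out + used, len - used, "%s ", name);
        return 0;
    }
    return -1;
}
```

Starting from an empty string, asking for msdos yields "fat msdos ", which is the order in which the two insmod invocations would otherwise have to be issued by hand.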
In the usual case, a module implements its own functionality without the need to export any symbols at all. You will need to export symbols, however, whenever other modules may benefit from using them. You may also need to include specific instructions to avoid exporting all non-static symbols, as most versions (but not all) of modutils export all of them by default.
If your module exports no symbols at all, you might want to make that explicit by placing a line with this macro call in your source file:
    EXPORT_NO_SYMBOLS;
If EXPORT_SYMTAB is defined, individual symbols are exported with a couple of macros:
    EXPORT_SYMBOL (name);
    EXPORT_SYMBOL_NOVERS (name);
Either version of the macro will make the given symbol available outside the module; the second version (EXPORT_SYMBOL_NOVERS) exports the symbol with no versioning information (described in Chapter 11, "kmod and Advanced Modularization"). Symbols must be exported outside of any function because the macros expand to the declaration of a variable. (Interested readers can look at <linux/module.h> for the details, even though the details are not needed to make things work.)
Initialization and Shutdown
As already mentioned, init_module registers any facility offered by the module. By facility, we mean a new functionality, be it a whole driver or a new software abstraction, that can be accessed by an application.
There are other facilities that can be registered as add-ons for certain drivers, but their use is so specific that it's not worth talking about them; they use the stacking technique, as described earlier in "The Kernel Symbol Table". If you want to probe further, you can grep for EXPORT_SYMBOL in the kernel sources and find the entry points offered by different drivers. Most registration functions are prefixed with register_, so another possible way to find them is to grep for register_ in /proc/ksyms.
Error Handling in init_module
If any errors occur when you register utilities, you must undo any registration activities performed before the failure. An error can happen, for example, if there isn't enough memory in the system to allocate a new data structure or because a resource being requested is already being used by other drivers. Though unlikely, it might happen, and good program code must be prepared to handle this event.
Linux doesn't keep a per-module registry of facilities that have been registered, so the module must back out of everything itself if init_module fails at some point. If you ever fail to unregister what you obtained, the kernel is left in an unstable state: you can't register your facilities again by reloading the module because they will appear to be busy, and you can't unregister them because you'd need the same pointer you used to register and you're not likely to be able to figure out the address. Recovery from such situations is tricky, and you'll often be forced to reboot in order to be able to load a newer revision of your module.
    int init_module(void)
    {
        int err;

        /* registration takes a pointer and a name */
        err = register_this(ptr1, "skull");
        if (err) goto fail_this;
        err = register_that(ptr2, "skull");
        if (err) goto fail_that;
        err = register_those(ptr3, "skull");
        if (err) goto fail_those;

        return 0; /* success */

      fail_those: unregister_that(ptr2, "skull");
      fail_that: unregister_this(ptr1, "skull");
      fail_this: return err; /* propagate the error */
    }
Another option, requiring no hairy goto statements, is keeping track of what has been successfully registered and calling cleanup_module in case of any error. The cleanup function will only unroll the steps that have been successfully accomplished. This alternative, however, requires more code and more CPU time, so in fast paths you'll still resort to goto as the best error-recovery tool.
The return value of init_module, err, is an error code. In the Linux kernel, error codes are negative numbers belonging to the set defined in <linux/errno.h>. If you want to generate your own error codes instead of returning what you get from other functions, you should include <linux/errno.h> in order to use symbolic values such as -ENODEV, -ENOMEM, and so on. It is always good practice to return appropriate error codes, because user programs can turn them to meaningful strings using perror or similar means. (However, it's interesting to note that several versions of modutils returned a "Device busy" message for any error returned by init_module; the problem has only been fixed in recent releases.)
    void cleanup_module(void)
    {
        unregister_those(ptr3, "skull");
        unregister_that(ptr2, "skull");
        unregister_this(ptr1, "skull");
        return;
    }
If your initialization and cleanup are more complex than dealing with a few items, the goto approach may become difficult to manage, because all the cleanup code must be repeated within init_module, with several labels intermixed. Sometimes, therefore, a different layout of the code proves more successful.
    struct something *item1;
    struct somethingelse *item2;
    int stuff_ok;

    void cleanup_module(void)
    {
        if (item1)
            release_thing(item1);
        if (item2)
            release_thing2(item2);
        if (stuff_ok)
            unregister_stuff();
        return;
    }

    int init_module(void)
    {
        int err = -ENOMEM;

        item1 = allocate_thing(arguments);
        item2 = allocate_thing2(arguments2);
        if (!item1 || !item2)   /* either allocation failed */
            goto fail;
        err = register_stuff(item1, item2);
        if (!err)
            stuff_ok = 1;
        else
            goto fail;
        return 0; /* success */

      fail:
        cleanup_module();
        return err;
    }
As shown in this code, you may or may not need external flags to mark success of the initialization step, depending on the semantics of the registration/allocation function you call. Whether or not flags are needed, this kind of initialization scales well to a large number of items and is often better than the technique shown earlier.
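The unwinding order of the goto technique can be exercised in user space by injecting a failure at a chosen step. The sketch below is a simulation only: reg, unreg, unwind_init, and the bitmask bookkeeping are hypothetical stand-ins for real registration calls, and -EBUSY is simply the error chosen for the injected failure.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Each bit of `registered` marks one facility that is currently held. */
static unsigned registered;
static int fail_at;              /* 1-based step that fails; 0 means none */

static int reg(int step)
{
    assert(step >= 1 && step <= 3);
    if (step == fail_at)
        return -EBUSY;           /* injected failure */
    registered |= 1u << step;
    return 0;
}

static void unreg(int step)
{
    registered &= ~(1u << step);
}

/* Same shape as the goto-based init_module: each label undoes exactly
 * the steps that had already succeeded, in reverse order. */
int unwind_init(int inject_failure_at)
{
    int err;

    fail_at = inject_failure_at;
    registered = 0;

    err = reg(1);
    if (err) goto fail_1;
    err = reg(2);
    if (err) goto fail_2;
    err = reg(3);
    if (err) goto fail_3;
    return 0;                    /* success */

  fail_3: unreg(2);
  fail_2: unreg(1);
  fail_1: return err;            /* propagate the error */
}

unsigned unwind_registered(void) { return registered; }

/* Error codes are negative; negate them to obtain a readable message. */
const char *unwind_strerror(int err) { return strerror(-err); }
```

Whichever step is made to fail, nothing remains registered after unwind_init returns an error, which is the property the text asks of a well-behaved init_module.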
The Usage Count
The system keeps a usage count for every module in order to determine whether the module can be safely removed. The system needs this information because a module can't be unloaded if it is busy: you can't remove a filesystem type while the filesystem is mounted, and you can't drop a char device while a process is using it, or you'll experience some sort of segmentation fault or kernel panic when wild pointers get dereferenced.
The current value of the usage count is found in the third field of each entry in /proc/modules. This file shows the modules currently loaded in the system, with one entry for each module. The fields are the name of the module, the number of bytes of memory it uses, and the current usage count. This is a typical /proc/modules file:
    parport_pc              7604   1 (autoclean)
    lp                      4800   0 (unused)
    parport                 8084   1 [parport_probe parport_pc lp]
    lockd                  33256   1 (autoclean)
    sunrpc                 56612   1 (autoclean) [lockd]
    ds                      6252   1
    i82365                 22304   1
    pcmcia_core            41280   0 [ds i82365]
Unloading
To unload a module, use the rmmod command. Its task is much simpler than loading, since no linking has to be performed. The command invokes the delete_module system call, which calls cleanup_module in the module itself if the usage count is zero or returns an error otherwise.
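The relation between the usage count and unloading can be modeled with a few lines of user-space C that parse the first three fields of a /proc/modules entry and apply the "unload only at zero" rule. The structure and function names are hypothetical, and returning -EBUSY for a busy module is an assumption made for this sketch; the text only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical user-space model of one /proc/modules entry. */
struct mod_entry {
    char name[32];
    long size;                   /* bytes of memory used */
    int  usage;                  /* current usage count */
};

/* Parses "name size usage ..." and ignores anything after the third
 * field. Returns 0 on success, -1 on malformed input. */
int parse_modules_line(const char *line, struct mod_entry *m)
{
    assert(line != NULL && m != NULL);
    if (sscanf(line, "%31s %ld %d", m->name, &m->size, &m->usage) != 3)
        return -1;
    return 0;
}

/* Mirrors the rule in the text: unloading proceeds only when the usage
 * count is zero (error code chosen for illustration). */
int try_unload(const struct mod_entry *m)
{
    return m->usage == 0 ? 0 : -EBUSY;
}
```

Fed the lp line from the listing above, try_unload succeeds; fed the parport_pc line, it refuses because the count is 1.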
Explicit Initialization and Cleanup Functions
As we have seen, the kernel calls init_module to initialize a newly loaded module, and calls cleanup_module just before module removal. In modern kernels, however, these functions often have different names. As of kernel 2.3.13, a facility exists for explicitly naming the module initialization and cleanup routines; using this facility is the preferred programming style.
    module_init(my_init);
    module_exit(my_cleanup);
Note that your code must include <linux/init.h> to use module_init and module_exit.
    static int __init my_init(void)
    {
        ....
    }

    static void __exit my_cleanup(void)
    {
        ....
    }
The attribute __init, when used in this way, will cause the initialization function to be discarded, and its memory reclaimed, after initialization is complete. It only works, however, for built-in drivers; it has no effect on modules. __exit, instead, causes the omission of the marked function when the driver is not built as a module; again, in modules, it has no effect.
Using Resources
A module can't accomplish its task without using system resources such as memory, I/O ports, I/O memory, and interrupt lines, as well as DMA channels if you use old-fashioned DMA controllers like the Industry Standard Architecture (ISA) one.
As a programmer, you are already accustomed to managing memory allocation; writing kernel code is no different in this regard. Your program obtains a memory area using kmalloc and releases it using kfree. These functions behave like malloc and free, except that kmalloc takes an additional argument, the priority. Usually, a priority of GFP_KERNEL or GFP_USER will do. The GFP acronym stands for "get free page." (Memory allocation is covered in detail in Chapter 7, "Getting Hold of Memory".)
[11] The memory areas that reside on the peripheral device are commonly called I/O memory to differentiate them from system RAM, which is customarily called memory.
I/O Ports and I/O Memory
The job of a typical driver is, for the most part, writing and reading I/O ports and I/O memory. Access to I/O ports and I/O memory (collectively called I/O regions) happens both at initialization time and during normal operations.
The developers of Linux chose to implement a request/free mechanism for I/O regions, mainly as a way to prevent collisions between different devices. The mechanism has long been in use for I/O ports and was recently generalized to manage resource allocation at large. Note that this mechanism is just a software abstraction that helps system housekeeping, and may or may not be enforced by hardware features. For example, unauthorized access to I/O ports doesn't produce any error condition equivalent to "segmentation fault" -- the hardware can't enforce port registration.
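The bookkeeping nature of the request/free mechanism can be illustrated with a toy registry in user space. Everything here is hypothetical (the toy_ prefix is used precisely to avoid suggesting these are the kernel's functions), and the fixed-size table and error codes are choices made for the sketch: it records ranges, refuses overlapping requests, and touches no hardware, just as the text says the real mechanism is a software abstraction.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_REGIONS 8

struct region {
    unsigned long start, len;
    const char   *name;          /* NULL marks a free slot */
};

static struct region regions[MAX_REGIONS];

/* Returns 0 if [start, start+len) is free, -EBUSY if it overlaps any
 * range that is already registered. */
int toy_check_region(unsigned long start, unsigned long len)
{
    assert(len > 0);
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        const struct region *r = &regions[i];

        if (r->name && start < r->start + r->len && r->start < start + len)
            return -EBUSY;
    }
    return 0;
}

/* Records the range under the given owner name. */
int toy_request_region(unsigned long start, unsigned long len,
                       const char *name)
{
    assert(name != NULL);
    if (toy_check_region(start, len) != 0)
        return -EBUSY;
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        if (regions[i].name == NULL) {
            regions[i].start = start;
            regions[i].len = len;
            regions[i].name = name;
            return 0;
        }
    }
    return -ENOMEM;              /* table full */
}

/* Forgets a range previously recorded with the same start and length. */
void toy_release_region(unsigned long start, unsigned long len)
{
    for (size_t i = 0; i < MAX_REGIONS; i++)
        if (regions[i].name && regions[i].start == start &&
            regions[i].len == len)
            regions[i].name = NULL;
}
```

After one owner requests 0x300 through 0x31f, a second request that overlaps those ports is refused until the first owner releases them; nothing in the registry prevents code from accessing the ports anyway, which is why cooperation between drivers matters.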
Information about registered resources is available in text form in the files /proc/ioports and /proc/iomem, although the latter was only introduced during 2.3 development. We'll discuss version 2.4 now, introducing portability issues at the end of the chapter.
Ports
    0000-001f : dma1
    0020-003f : pic1
    0040-005f : timer
    0060-006f : keyboard
    0080-008f : dma page reg
    00a0-00bf : pic2
    00c0-00df : dma2
    00f0-00ff : fpu
    0170-0177 : ide1
    01f0-01f7 : ide0
    02f8-02ff : serial(set)
    0300-031f : NE2000
    0376-0376 : ide1
    03c0-03df : vga+
    03f6-03f6 : ide0
    03f8-03ff : serial(set)
    1000-103f : Intel Corporation 82371AB PIIX4 ACPI
      1000-1003 : acpi
      1004-1005 : acpi
      1008-100b : acpi
      100c-100f : acpi
    1100-110f : Intel Corporation 82371AB PIIX4 IDE
    1300-131f : pcnet_cs
    1400-141f : Intel Corporation 82371AB PIIX4 ACPI
    1800-18ff : PCI CardBus #02
    1c00-1cff : PCI CardBus #04
    5800-581f : Intel Corporation 82371AB PIIX4 USB
    d000-dfff : PCI Bus #01
      d000-d0ff : ATI Technologies Inc 3D Rage LT Pro AGP-133
The programming interface used to access the I/O registry is made up of three functions:
    int check_region(unsigned long start, unsigned long len);
    struct resource *request_region(unsigned long start, unsigned long len, char *name);
    void release_region(unsigned long start, unsigned long len);
[12] The actual pointer is used only when the function is called internally by the resource management subsystem of the kernel.
The three functions are actually macros, and they are declared in <linux/ioport.h>.
    #include <linux/ioport.h>
    #include <linux/errno.h>

    static int skull_detect(unsigned int port, unsigned int range)
    {
        int err;

        if ((err = check_region(port,range)) < 0) return err; /* busy */
        if (skull_probe_hw(port,range) != 0) return -ENODEV;  /* not found */
        request_region(port,range,"skull");                   /* "Can't fail" */
        return 0;
    }
This code first looks to see if the required range of ports is available; if the ports cannot be allocated, there is no point in looking for the hardware. The actual allocation of the ports is deferred until after the device is known to exist. The request_region call should never fail; the kernel only loads a single module at a time, so there should not be a problem with other modules slipping in and stealing the ports during the detection phase. Paranoid code can check, but bear in mind that kernels prior to 2.4 define request_region as returning void.
    static void skull_release(unsigned int port, unsigned int range)
    {
        release_region(port,range);
    }
Memory
Similar to what happens for I/O ports, I/O memory information is available in the /proc/iomem file. This is a fraction of the file as it appears on a personal computer:
    00000000-0009fbff : System RAM
    0009fc00-0009ffff : reserved
    000a0000-000bffff : Video RAM area
    000c0000-000c7fff : Video ROM
    000f0000-000fffff : System ROM
    00100000-03feffff : System RAM
      00100000-0022c557 : Kernel code
      0022c558-0024455f : Kernel data
    20000000-2fffffff : Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge
    68000000-68000fff : Texas Instruments PCI1225
    68001000-68001fff : Texas Instruments PCI1225 (#2)
    e0000000-e3ffffff : PCI Bus #01
    e4000000-e7ffffff : PCI Bus #01
      e4000000-e4ffffff : ATI Technologies Inc 3D Rage LT Pro AGP-133
      e6000000-e6000fff : ATI Technologies Inc 3D Rage LT Pro AGP-133
    fffc0000-ffffffff : reserved
To obtain and relinquish access to a certain I/O memory region, the driver should use the following calls:
    int check_mem_region(unsigned long start, unsigned long len);
    struct resource *request_mem_region(unsigned long start, unsigned long len, char *name);
    void release_mem_region(unsigned long start, unsigned long len);

    if (check_mem_region(mem_addr, mem_size)) {
        printk("drivername: memory already in use\n");
        return -EBUSY;
    }
    request_mem_region(mem_addr, mem_size, "drivername");
Resource Allocation in Linux 2.4
The current resource allocation mechanism was introduced in Linux 2.3.11 and provides a flexible way of controlling system resources. This section briefly describes the mechanism. However, the basic resource allocation functions (request_region and the rest) are still implemented (via macros) and are still universally used because they are backward compatible with earlier kernel versions. Most module programmers will not need to know about what is really happening under the hood, but those working on more complex drivers may be interested.
Resource ranges are described via a resource structure, declared in <linux/ioport.h>:
    struct resource {
        const char *name;
        unsigned long start, end;
        unsigned long flags;
        struct resource *parent, *sibling, *child;
    };

    struct resource ioport_resource =
        { "PCI IO", 0x0000, IO_SPACE_LIMIT, IORESOURCE_IO };
Subranges of a given resource may be created with allocate_resource. For example, during PCI initialization a new resource is created for a region that is actually assigned to a physical device. When the PCI code reads those port or memory assignments, it creates a new resource for just those regions, and allocates them under ioport_resource or iomem_resource.
    e800-e8ff : Adaptec AHA-2940U2/W / 7890
      e800-e8be : aic7xxx
The other advantage to controlling resources in this way is that it partitions the port space into distinct subranges that reflect the hardware of the underlying system. Since the resource allocator will not allow an allocation to cross subranges, it can block a buggy driver (or one looking for hardware that does not exist on the system) from allocating ports that belong to more than one range -- even if some of those ports are unallocated at the time.
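The rule that an allocation may not cross subranges can be sketched with a simplified user-space model of the resource tree. The struct res type and the res_request function below are hypothetical; in particular, the busy field is an assumption that stands in for the busy bit the kernel keeps in the flags member, and the code is not the kernel's implementation.

```c
#include <stddef.h>

/* Simplified stand-in for struct resource; ranges are inclusive. */
struct res {
    const char *name;
    unsigned long start, end;
    int busy;                            /* nonzero once a driver owns it */
    struct res *parent, *sibling, *child;
};

/*
 * Try to place `new` somewhere below `root`.  Children are kept sorted by
 * start address.  Returns 0 on success, -1 if the range conflicts.
 */
static int res_request(struct res *root, struct res *new)
{
    struct res **p = &root->child;

    if (new->start > new->end ||
        new->start < root->start || new->end > root->end)
        return -1;

    for (;;) {
        struct res *cur = *p;

        if (cur == NULL || cur->start > new->end) {
            /* Free gap: link the new range in at this position. */
            new->sibling = cur;
            new->parent = root;
            *p = new;
            return 0;
        }
        if (cur->end < new->start) {
            p = &cur->sibling;           /* entirely before us: keep going */
            continue;
        }
        /* Overlap: descend only if an unclaimed subrange fully contains us. */
        if (!cur->busy && cur->start <= new->start && new->end <= cur->end)
            return res_request(cur, new);
        return -1;
    }
}
```

With a root covering 0x0000-0xffff and unclaimed subranges d000-dfff and e800-e8ff, a request for e800-e8be lands under the second subrange, while a request for df00-e0ff is refused because it straddles the end of the first one, even though none of those ports has been claimed.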
Automatic and Manual Configuration
Several parameters that a driver needs to know can change from system to system. For instance, the driver must know the hardware's actual I/O addresses, or memory range (this is not a problem with well-designed bus interfaces and only applies to ISA devices). Sometimes you'll need to pass parameters to a driver to help it in finding its own device or to enable/disable specific features.
Parameter values can be assigned at load time by insmod or modprobe; the latter can also read parameter assignment from a configuration file (typically /etc/modules.conf). The commands accept the specification of integer and string values on the command line. Thus, if your module were to provide an integer parameter called skull_ival and a string parameter skull_sval, the parameters could be set at module load time with an insmod command like:
    insmod skull skull_ival=666 skull_sval="the beast"

    int skull_ival=0;
    char *skull_sval;

    MODULE_PARM (skull_ival, "i");
    MODULE_PARM (skull_sval, "s");
Five types are currently supported for module parameters: b, one byte; h, a short (two bytes); i, an integer; l, a long; and s, a string. In the case of string values, a pointer variable should be declared; insmod will allocate the memory for the user-supplied parameter and set the variable accordingly. An integer value preceding the type indicates an array of a given length; two numbers, separated by a hyphen, give a minimum and maximum number of values. If you want to find the author's description of this feature, you should refer to the header file <linux/module.h>.
    int skull_array[4];
    MODULE_PARM (skull_array, "2-4i");

    int base_port = 0x300;
    MODULE_PARM (base_port, "i");
    MODULE_PARM_DESC (base_port, "The base I/O port (default 0x300)");
All module parameters should be given a default value; insmod will change the value only if explicitly told to by the user. The module can check for explicit parameters by testing parameters against their default values. Automatic configuration, then, can be designed to work this way: if the configuration variables have the default value, perform autodetection; otherwise, keep the current value. In order for this technique to work, the "default" value should be one that the user would never actually want to specify at load time.
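The format of the type strings passed to MODULE_PARM (an optional count or min-max range followed by one type letter) can be illustrated with a small user-space parser. The struct parm_type and parse_parm_type names are hypothetical, and this sketch is not the code insmod uses; it only decodes the syntax as described in the text.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct parm_type {
    unsigned long min, max;   /* allowed number of values */
    char type;                /* one of b, h, i, l, s */
};

/* Parse strings such as "i", "4b" or "2-4i".  Returns 0 on success, -1 on error. */
static int parse_parm_type(const char *spec, struct parm_type *out)
{
    char *end;

    out->min = out->max = 1;                 /* default: a single value */
    if (isdigit((unsigned char)*spec)) {
        /* A lone number is a fixed array length. */
        out->min = out->max = strtoul(spec, &end, 10);
        spec = end;
        if (*spec == '-') {                  /* "min-max" form */
            spec++;
            if (!isdigit((unsigned char)*spec))
                return -1;
            out->max = strtoul(spec, &end, 10);
            spec = end;
        }
    }
    /* Exactly one known type letter must remain. */
    if (*spec == '\0' || strchr("bhils", *spec) == NULL || spec[1] != '\0')
        return -1;
    if (out->min > out->max)
        return -1;
    out->type = *spec;
    return 0;
}
```

Given "2-4i", the parser reports an integer parameter that accepts between two and four values, which matches the skull_array declaration above.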
    /*
     * port ranges: the device can reside between
     * 0x280 and 0x300, in steps of 0x10. It uses 0x10 ports.
     */
    #define SKULL_PORT_FLOOR 0x280
    #define SKULL_PORT_CEIL  0x300
    #define SKULL_PORT_RANGE 0x010

    /*
     * the following function performs autodetection, unless a specific
     * value was assigned by insmod to "skull_port_base"
     */
    static int skull_port_base=0; /* 0 forces autodetection */
    MODULE_PARM (skull_port_base, "i");
    MODULE_PARM_DESC (skull_port_base, "Base I/O port for skull");

    static int skull_find_hw(void) /* returns the # of devices */
    {
        /* base is either the load-time value or the first trial */
        int base = skull_port_base ? skull_port_base : SKULL_PORT_FLOOR;
        int result = 0;

        /* loop one time if value assigned; try them all if autodetecting */
        do {
            if (skull_detect(base, SKULL_PORT_RANGE) == 0) {
                skull_init_board(base);
                result++;
            }
            base += SKULL_PORT_RANGE; /* prepare for next trial */
        }
        while (skull_port_base == 0 && base < SKULL_PORT_CEIL);

        return result;
    }
MODULE_AUTHOR (name)
MODULE_DESCRIPTION (desc)
MODULE_SUPPORTED_DEVICE (dev)
Places an entry describing what device is supported by this module. Comments in the kernel source suggest that this parameter may eventually be used to help with automated module loading, but no such use is made at this time.
Doing It in User Space
A Unix programmer who's addressing kernel issues for the first time might well be nervous about writing a module. Writing a user program that reads and writes directly to the device ports is much easier.
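As a rough illustration of what such a user program can look like, the sketch below reads and writes single bytes through the /dev/port device, where the file offset selects the port number. It assumes a Linux system that provides /dev/port and a process with enough privilege to open it; the helper names port_open, port_read_byte, and port_write_byte are hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open the port device; this normally requires superuser privileges. */
static int port_open(void)
{
    return open("/dev/port", O_RDWR);
}

/* Read one byte from `port`; the file offset selects the port number. */
static int port_read_byte(int fd, unsigned int port, unsigned char *value)
{
    if (lseek(fd, (off_t)port, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, value, 1) == 1 ? 0 : -1;
}

/* Write one byte to `port`. */
static int port_write_byte(int fd, unsigned int port, unsigned char value)
{
    if (lseek(fd, (off_t)port, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, &value, 1) == 1 ? 0 : -1;
}
```

Because the helpers take an already open descriptor, they can be exercised against an ordinary file during development, before being pointed at real hardware.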
The advantages of user-space drivers can be summarized as follows:
Backward Compatibility
The Linux kernel is a moving target -- many things change over time as new features are developed. The interface that we have described in this chapter is that provided by the 2.4 kernel; if your code needs to work on older releases, you will need to take various steps to make that happen.
This is the first of many "backward compatibility" sections in this book. At the end of each chapter we'll cover the things that have changed since version 2.0 of the kernel, and what needs to be done to make your code portable.
For starters, the KERNEL_VERSION macro was introduced in kernel 2.1.90. The sysdep.h header file contains a replacement for kernels that need it.
Changes in Resource Management
The new resource management scheme brings in a few portability problems if you want to write a driver that can run with kernel versions older than 2.4. This section discusses the portability problems you'll encounter and how the sysdep.h header tries to hide them.
The most apparent change brought about by the new resource management code is the addition of request_mem_region and related functions. Their role is limited to accessing the I/O memory database, without performing specific operations on any hardware. What you can do with earlier kernels, thus, is to simply not call the functions. The sysdep.h header easily accomplishes that by defining the functions as macros that return 0 for kernels earlier than 2.4.
Another difference between 2.4 and earlier kernel versions is in the actual prototypes of request_region and related functions.
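The idea of compiling the I/O memory calls away on older kernels can be sketched as follows. This is not the text of sysdep.h: the fallback KERNEL_VERSION definition uses the same packing the kernel uses, but the LINUX_VERSION_CODE value is an assumption made so the fragment can be built outside a kernel tree, and claim_io_memory is a hypothetical helper.

```c
/* Fallback for kernels whose headers lack the macro (same packing as the kernel's). */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif

/* Assumption for this demonstration only: pretend we build for kernel 2.2.14. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 2, 14)
#endif

#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 4, 0)
/* No I/O memory registry before 2.4: every call becomes a no-op returning 0. */
#define check_mem_region(start, len) \
    ((void)(start), (void)(len), 0)
#define request_mem_region(start, len, name) \
    ((void)(start), (void)(len), (void)(name), 0)
#define release_mem_region(start, len) \
    ((void)(start), (void)(len), 0)
#endif

/* Driver-style code can then be written once for all kernel versions. */
static int claim_io_memory(unsigned long start, unsigned long len)
{
    if (check_mem_region(start, len))
        return -1;                       /* range already in use */
    (void)request_mem_region(start, len, "demo");
    return 0;
}
```

On a pre-2.4 build the check always succeeds, so the driver proceeds exactly as it would have before the registry existed.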
Compiling for Multiprocessor Systems
Version 2.0 of the kernel didn't use the CONFIG_SMP configuration option to build for SMP systems; instead, the choice was made through a global assignment in the main kernel makefile. Note that modules compiled for an SMP machine will not work in a uniprocessor kernel, and vice versa, so it is important to get this one right.
Exporting Symbols in Linux 2.0
The Linux 2.0 symbol export mechanism was built around a function called register_symtab. A Linux 2.0 module would build a table describing all of the symbols to be exported, then would call register_symtab from its initialization function. Only symbols that were listed in the explicit symbol table were exported to the kernel. If, instead, the function was not called at all, all global symbols were exported.
If your module doesn't need to export any symbols, and you don't want to declare everything as static, just hide global symbols by adding the following line to init_module. This call to register_symtab simply overwrites the module's default symbol table with an empty one:
    register_symtab(NULL);

    static struct symbol_table skull_syms = {

    #include <linux/symtab_begin.h>
        X(skull_fn1),
        X(skull_fn2),
        X(skull_variable),
    #include <linux/symtab_end.h>
    };

    register_symtab(&skull_syms);
Writing portable code that controls symbol visibility takes an explicit effort from the device driver programmer. This is a case where it is not sufficient to define a few compatibility macros; instead, portability requires a fair amount of conditional preprocessor code, but the concepts are simple. The first step is to identify the kernel version in use and to define some symbols accordingly. What we chose to do in sysdep.h is define a macro REGISTER_SYMTAB() that expands to nothing on version 2.2 and later and expands to register_symtab on version 2.0. Also, __USE_OLD_SYMTAB__ is defined if the old code must be used.
    #ifdef __USE_OLD_SYMTAB__
    static struct symbol_table export_syms = {
    #include <linux/symtab_begin.h>
        X(export_function),
    #include <linux/symtab_end.h>
    };
    #else
    EXPORT_SYMBOL(export_function);
    #endif

    int export_init(void)
    {
        REGISTER_SYMTAB(&export_syms);
        return 0;
    }
Module Configuration Parameters
MODULE_PARM was introduced in kernel version 2.1.18. With the 2.0 kernel, no parameters were declared explicitly; instead, insmod was able to change the value of any variable within the module. This method had the disadvantage of providing user access to variables for which this mode of access had not been intended; there was also no type checking of parameters. MODULE_PARM makes module parameters much cleaner and safer, but also makes Linux 2.2 modules incompatible with 2.0 kernels.
Quick Reference
This section summarizes the kernel functions, variables, macros, and /proc files that we've touched on in this chapter. It is meant to act as a reference. Each item is listed after the relevant header file, if any. A similar section appears at the end of every chapter from here on, summarizing the new symbols introduced in the chapter.
__KERNEL__
MODULE
Preprocessor symbols, which must both be defined to compile modularized kernel code.
__SMP__
int init_module(void);
void cleanup_module(void);
Module entry points, which must be defined in the module object file.
#include <linux/init.h>
module_init(init_function);
module_exit(cleanup_function);
The modern mechanism for marking a module's initialization and cleanup functions.
#include <linux/module.h>
MOD_INC_USE_COUNT;
MOD_DEC_USE_COUNT;
MOD_IN_USE;
- /proc/modules
The list of currently loaded modules. Entries contain the module name, the amount of memory each module occupies, and the usage count. Extra strings are appended to each line to specify flags that are currently active for the module.
EXPORT_SYMTAB;
Preprocessor macro, required for modules that export symbols.
EXPORT_NO_SYMBOLS;
Macro used to specify that the module exports no symbols to the kernel.
EXPORT_SYMBOL (symbol);
EXPORT_SYMBOL_NOVERS (symbol);
Macro used to export a symbol to the kernel. The second form exports without using versioning information.
int register_symtab(struct symbol_table *);
Function used to specify the set of public symbols in the module. Used in 2.0 kernels only.
#include <linux/symtab_begin.h>
X(symbol),
#include <linux/symtab_end.h>
Headers and preprocessor macro used to declare a symbol table in the 2.0 kernel.
MODULE_PARM(variable, type);
MODULE_PARM_DESC (variable, description);
Macros that make a module variable available as a parameter that may be adjusted by the user at module load time.
MODULE_AUTHOR(author);
MODULE_DESCRIPTION(description);
MODULE_SUPPORTED_DEVICE(device);
#include <linux/version.h>
Required header. It is included by <linux/module.h>, unless __NO_VERSION__ is defined (see later in this list).
LINUX_VERSION_CODE
char kernel_version[] = UTS_RELEASE;
__NO_VERSION__
Preprocessor symbol. Prevents declaration of kernel_version in <linux/module.h>.
#include <linux/sched.h>
One of the most important header files. This file contains definitions of much of the kernel API used by the driver, including functions for sleeping and numerous variable declarations.
struct task_struct *current;
current->pid
current->comm
#include <linux/kernel.h>
int printk(const char * fmt, ...);
#include <linux/malloc.h>
void *kmalloc(unsigned int size, int priority);
void kfree(void *obj);
Analogue of malloc and free for kernel code. Use the value of GFP_KERNEL as the priority.
#include <linux/ioport.h>
int check_region(unsigned long from, unsigned long extent);
struct resource *request_region(unsigned long from, unsigned long extent, const char *name);
void release_region(unsigned long from, unsigned long extent);
int check_mem_region (unsigned long start, unsigned long extent);
struct resource *request_mem_region (unsigned long start, unsigned long extent, const char *name);
void release_mem_region (unsigned long start, unsigned long extent);
- /proc/ksyms
- /proc/ioports
- /proc/iomem
© 2001, O'Reilly & Associates, Inc.